This role is for one of Weekday's clients.
Min Experience: 7 years
Location: Noida, Gurugram, Delhi
Job Type: Full-time

We are seeking an experienced and highly skilled Data Architect to design, develop, and optimize scalable data architectures that support business analytics, data governance, and enterprise reporting. The ideal candidate will have strong hands-on technical expertise in Python and PySpark, a deep understanding of modern data platforms, and proven experience building robust, secure, and high-performing data ecosystems. This role requires a strategic thinker who can translate complex business requirements into effective data models and scalable systems while ensuring data quality, accessibility, and reliability across the organization.
Requirements
Key Responsibilities
- Lead the design and implementation of enterprise-level data architecture, including data pipelines, data models, and analytical frameworks.
- Build and optimize data ingestion, transformation, and processing pipelines using PySpark, Spark SQL, and distributed computing technologies (see the illustrative sketch after this list).
- Architect and maintain data lakes, data warehouses, and streaming-based data solutions, ensuring efficiency, performance, and scalability.
- Collaborate with cross-functional teams including data engineering, analytics, cloud infrastructure, and business stakeholders to define data roadmaps, standards, and governance practices.
- Establish data modeling standards and best practices for structured, unstructured, and semi-structured data.
- Drive data quality, metadata management, and master data management efforts to ensure data consistency and integrity.
- Evaluate, recommend, and integrate new data management tools, platforms, and technologies aligned with industry best practices.
- Ensure security, compliance, and privacy requirements are embedded into data solutions, supporting regulatory frameworks and audit processes.
- Mentor and guide data engineers and analysts on architectural standards, performance tuning, and advanced data processing techniques.
- Conduct performance tuning, optimization, and troubleshooting of large-scale distributed data systems and pipelines.
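
To give candidates a sense of the expected PySpark fluency, below is a minimal, illustrative sketch of the ingest-transform-load pattern referenced in the responsibilities above. It is not a prescribed implementation; the paths, column names, and schema are hypothetical.

```python
# A minimal, illustrative PySpark batch pipeline: ingest raw CSV events,
# apply transformations, and write partitioned Parquet output.
# All paths, column names, and the schema are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("illustrative-ingestion-pipeline")
    .getOrCreate()
)

# Ingest: read raw events from a hypothetical source location.
raw = (
    spark.read
    .option("header", "true")
    .csv("s3a://example-bucket/raw/events/")  # hypothetical source path
)

# Transform: type casting, deduplication, and a simple quality filter.
events = (
    raw
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .dropDuplicates(["event_id"])
    .filter(F.col("amount").isNotNull())
)

# Load: write date-partitioned Parquet for downstream analytics.
(
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-bucket/curated/events/")  # hypothetical sink path
)

spark.stop()
```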
Required Skills & Experience
- 7-13 years of experience in data architecture, data engineering, or related fields.
- Strong programming skills in Python and expert-level proficiency in PySpark / Apache Spark.
- Proven experience designing and implementing large-scale data pipelines and distributed data processing systems.
- Hands-on experience with big data ecosystems (e.g., Hadoop, Spark, Hive, HDFS, Kafka).
- Experience with cloud platforms (AWS, Azure, or GCP) and their data services (e.g., Redshift, Snowflake, Databricks, BigQuery).
- Strong understanding of ETL/ELT frameworks, workflow management tools, and orchestration systems such as Airflow and dbt (see the orchestration sketch after this list).
- Deep knowledge of data modeling techniques including relational, dimensional, and NoSQL model designs.
- Familiarity with data governance, data quality frameworks, and metadata management.
- Excellent problem-solving, analytical thinking, and communication skills with the ability to simplify complex data challenges.
- Ability to collaborate effectively with engineering, analytics, and business stakeholders.
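
As a reference point for the orchestration experience listed above, here is a minimal, illustrative Airflow DAG sketch. The DAG id, schedule, and task bodies are hypothetical placeholders, not a required design.

```python
# A minimal, illustrative Airflow 2.x DAG: two dependent tasks on a
# daily schedule. The DAG id and task callables are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw data from a hypothetical source system.
    print("extracting raw data")


def transform():
    # Placeholder: run a hypothetical PySpark transformation job.
    print("transforming data")


with DAG(
    dag_id="illustrative_daily_pipeline",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Declare ordering: extract runs before transform.
    extract_task >> transform_task
```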
Preferred Qualifications
- Experience working in product-led or high-growth technology environments.
- Knowledge of machine learning data preparation and real-time analytics.
- Exposure to API-based data consumption and microservices architecture.