Team Lead - Data Group

Experience: 7 - 10 years

Salary: 25 - 32 Lacs

Posted: 1 hour ago | Platform: Naukri


Work Mode: Work from Office

Job Type: Full Time

Job Description

Position Overview:
We seek a highly skilled and experienced Data Engineering Lead to join our team. This role demands deep technical expertise in Apache Spark, Hive, Trino (formerly Presto), Python, AWS Glue, and the broader AWS ecosystem. The ideal candidate will possess strong hands-on skills and the ability to design and implement scalable data solutions, optimize performance, and lead a high-performing team to deliver data-driven insights.

Key Responsibilities:

Technical Leadership
- Lead and mentor a team of data engineers, fostering best practices in coding, design, and delivery.
- Drive the adoption of modern data engineering frameworks, tools, and methodologies to ensure high-quality and scalable solutions.
- Translate complex business requirements into effective data pipelines, architectures, and workflows.

Data Pipeline Development
- Architect, develop, and optimize scalable ETL/ELT pipelines using Apache Spark, Hive, AWS Glue, and Trino.
- Handle complex data workflows across structured and unstructured data sources, ensuring performance and cost-efficiency.
- Develop real-time and batch processing systems to support business intelligence, analytics, and machine learning applications.

Cloud & Infrastructure Management
- Build and maintain cloud-based data solutions using AWS services such as S3, Athena, Redshift, EMR, DynamoDB, and Lambda.
- Design and implement federated query capabilities using Trino for diverse data sources.
- Manage the Hive Metastore for schema and metadata management in data lakes.

Performance Optimization
- Optimize Apache Spark jobs and Hive queries for performance, ensuring efficient resource utilization and minimal latency.
- Implement caching and indexing strategies to accelerate query performance in Trino.
- Continuously monitor and improve system performance through diagnostics and tuning.

Collaboration & Stakeholder Engagement
- Work closely with data scientists, analysts, and business teams to understand requirements and deliver actionable insights.
- Ensure that data infrastructure aligns with organizational goals and compliance standards.

Data Governance & Quality
- Establish and enforce data quality standards, governance practices, and monitoring processes.
- Ensure data security, privacy, and compliance with regulatory frameworks.

Innovation & Continuous Learning
- Stay ahead of industry trends, emerging technologies, and best practices in data engineering.
- Proactively identify and implement improvements in data architecture and processes.
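The pipeline responsibilities above center on the extract-transform-load (ETL) pattern. As a purely illustrative sketch (not part of the role description), here is a minimal stdlib-Python version of a batch ETL step; in this role the equivalent logic would run as a PySpark or AWS Glue job, and the field names (`user_id`, `amount`) are hypothetical.

```python
import csv
import io
import json

def run_etl(csv_text: str) -> list[str]:
    """Minimal batch ETL sketch: extract rows from CSV, transform
    (cast types, drop invalid records), and load as JSON lines.
    In production this step would be a PySpark / AWS Glue job."""
    # Extract: parse the raw CSV input.
    rows = csv.DictReader(io.StringIO(csv_text))
    out = []
    for row in rows:
        # Transform: cast amount to float; skip unparseable records.
        try:
            amount = float(row["amount"])
        except (KeyError, ValueError):
            continue
        record = {"user_id": row["user_id"], "amount": amount}
        # Load: serialize one JSON line per cleaned record.
        out.append(json.dumps(record))
    return out

sample = "user_id,amount\nu1,10.5\nu2,not-a-number\nu3,3\n"
lines = run_etl(sample)
```

The same extract/transform/load split is what a Glue job script expresses with DynamicFrames or Spark DataFrames, just distributed across a cluster.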

Qualifications:
Required Technical Expertise
- Advanced proficiency with Apache Spark (core, SQL, streaming) for large-scale data processing.
- Strong expertise in Hive for querying and managing structured data in data lakes.
- In-depth knowledge of Trino (Presto) for federated querying and high-performance SQL execution.
- Solid programming skills in Python, with libraries such as PySpark and Pandas.
- Hands-on experience with AWS Glue, including Glue ETL jobs, the Glue Data Catalog, and Glue Crawlers.
- Deep understanding of data formats such as Parquet, ORC, and Avro, and their use cases.

Cloud Proficiency
- Expertise in AWS services, including S3, Redshift, Athena, EMR, DynamoDB, and IAM.
- Experience designing scalable and cost-efficient cloud-based data solutions.

Performance Tuning
- Strong ability to optimize Apache Spark jobs, Hive queries, and Trino workloads for distributed environments.
- Experience with advanced techniques such as partitioning, bucketing, and query-plan optimization.

Leadership & Collaboration
- Proven experience leading and mentoring data engineering teams.
- Strong communication skills, with the ability to interact effectively with technical and non-technical stakeholders.
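Partitioning and bucketing, named above under performance tuning, both work by routing each record to a deterministic subset of files so that queries can skip irrelevant data. As an illustration only, here is a toy stdlib-Python sketch of hash bucketing, the same idea Hive and Spark use (hash of the bucketing column modulo the bucket count); the bucket count of 4 and the `user_id` key are assumptions for the example.

```python
import zlib

def bucket_for(key: str, num_buckets: int = 4) -> int:
    """Deterministically assign a key to a bucket: a stable hash of
    the bucketing column, modulo the bucket count. Using crc32 keeps
    the assignment reproducible across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets

def write_bucketed(records: list[dict], key: str,
                   num_buckets: int = 4) -> dict[int, list[dict]]:
    """Group records into buckets by hashing the bucketing column.
    A query filtering on `key` then only scans one bucket's files."""
    buckets: dict[int, list[dict]] = {i: [] for i in range(num_buckets)}
    for rec in records:
        buckets[bucket_for(rec[key], num_buckets)].append(rec)
    return buckets

data = [{"user_id": f"u{i}"} for i in range(100)]
buckets = write_bucketed(data, "user_id")
```

With this layout, a point lookup on `user_id` reads roughly a quarter of the data instead of the whole table, which is the pruning effect bucketing delivers at cluster scale.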

Education & Experience
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 8+ years of experience in data engineering, including building data pipelines from scratch in large-data-volume environments, with a minimum of 2 years in a leadership role.

Preferred Qualifications
- AWS certifications, such as AWS Certified Data Analytics or AWS Certified Solutions Architect.
- Experience with Kafka or Kinesis for real-time data streaming.
- Familiarity with containerization tools such as Docker and orchestration platforms such as Kubernetes.
- Knowledge of CI/CD pipelines and DevOps practices for data engineering.
- Prior experience with data lake architectures and integrating ML workflows.

Mandatory Key Skills: CI/CD, DevOps, data engineering, Apache Spark, Hive, performance tuning, AWS Glue, data governance, AWS, Python, ETL

Apex One

Technology Solutions

Tech City
