Job Description
Position: Lead Data Engineer
Location: Hyderabad (Work from Office Mandatory)
Experience: 10+ years overall | 8+ years relevant in Data Engineering
Notice Period: Immediate to 30 days

About the Role
We are looking for a strategic and hands-on Lead Data Engineer to architect and lead cutting-edge data platforms that power business intelligence, analytics, and AI initiatives. This role demands a deep understanding of cloud-based big data ecosystems, excellent leadership skills, and a strong commitment to driving data quality and governance at scale. You will define the data engineering roadmap, architect scalable data systems, and lead a team responsible for building and optimizing pipelines across structured and unstructured datasets in a secure and compliant environment.

Key Responsibilities

1. Technical Strategy & Architecture
- Define the vision and technical roadmap for enterprise-grade data platforms (Lakehouse, Warehouse, Real-Time Pipelines).
- Lead evaluation of data platforms and tools, making informed build-vs-buy decisions.
- Design solutions for long-term scalability, cost-efficiency, and performance.

2. Team Leadership
- Mentor and lead a high-performing data engineering team.
- Conduct performance reviews and technical coaching, and participate in hiring and onboarding.
- Instill engineering best practices and a culture of continuous improvement.

3. Platform & Pipeline Engineering
- Build and maintain data lakes, warehouses, and lakehouses using AWS, Azure, GCP, or Databricks.
- Architect and optimize data models and schemas tailored for analytics and reporting.
- Manage large-scale ETL/ELT pipelines for batch and streaming use cases.

4. Data Quality, Governance & Security
- Enforce data quality controls: automated validation, lineage, and anomaly detection.
- Ensure compliance with data privacy and governance frameworks (GDPR, HIPAA, etc.).
- Manage metadata and documentation for transparency and discoverability.

5. Cross-Functional Collaboration
- Partner with Data Scientists, Product Managers, and Business Teams to understand requirements.
- Translate business needs into scalable data workflows and delivery mechanisms.
- Support self-service analytics and the democratization of data access.

6. Monitoring, Optimization & Troubleshooting
- Implement monitoring frameworks to ensure data reliability and latency SLAs.
- Proactively resolve bottlenecks and failures, and optimize system performance.
- Recommend platform upgrades and automation strategies.

7. Technical Leadership & Community Building
- Lead code reviews, define development standards, and share reusable components.
- Promote innovation, experimentation, and cross-team knowledge sharing.
- Encourage open-source contributions and thought leadership.

Required Skills & Experience
- 10+ years of experience in data engineering or related domains.
- Expert in PySpark, Python, and SQL.
- Deep expertise in Apache Spark and other distributed processing frameworks.
- Hands-on experience with cloud platforms (AWS, Azure, or GCP) and services such as S3, EMR, Glue, Databricks, and Data Factory.
- Proficiency with data warehouse solutions (e.g., Snowflake, Redshift, BigQuery) and RDBMS such as PostgreSQL or SQL Server.
- Knowledge of orchestration tools (Airflow, Dagster, or cloud-native schedulers).
- Familiarity with CI/CD tools, Git, and Infrastructure as Code (Terraform, CloudFormation).
- Strong understanding of data modeling and data lifecycle management.