Role-Technology: Lead Data Engineer
Mandatory Skills: AWS, Python, PySpark
Notice Period: Immediate to 15 days
Experience Range (years): 7+
Location: Kerala (Kochi/TVM); local candidates only
Budget: 23 LPA
Job Overview
We are seeking an experienced Senior Data Engineer to lead the development of a scalable data ingestion framework while ensuring high data quality and validation. The successful candidate will also be responsible for designing and implementing robust APIs for seamless data integration. This role is ideal for someone with deep expertise in building and managing big data pipelines using modern AWS-based technologies, and who is passionate about driving quality and efficiency in data processing systems.
Key Responsibilities
- Data Ingestion Framework:
- Design & Development: Architect, develop, and maintain an end-to-end data
ingestion framework that efficiently extracts, transforms, and loads data from
diverse sources.
- Framework Optimization: Use AWS services such as AWS Glue, Lambda,
EMR, ECS, EC2, and Step Functions to build highly scalable, resilient, and
automated data pipelines (a minimal ingestion sketch follows this list).
- Data Quality & Validation:
- Validation Processes: Develop and implement automated data quality checks,
validation routines, and error-handling mechanisms to ensure the accuracy and
integrity of incoming data.
- Monitoring & Reporting: Establish comprehensive monitoring, logging, and
alerting systems to proactively identify and resolve data quality issues.
- API Development:
- Design & Implementation: Architect and develop secure, high-performance
APIs to enable seamless integration of data services with external applications
and internal systems (see the API sketch after this list).
- Documentation & Best Practices: Create thorough API documentation and
establish standards for API security, versioning, and performance optimization.
- Collaboration & Agile Practices:
- Cross-Functional Communication: Work closely with business stakeholders,
data scientists, and operations teams to understand requirements and translate
them into technical solutions.
- Agile Development: Participate in sprint planning, code reviews, and agile
ceremonies, while contributing to continuous improvement initiatives and CI/CD pipeline development (using tools like GitLab).
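As a concrete illustration of the ingestion and validation responsibilities above, here is a minimal PySpark sketch of an ETL step with an automated quality gate. The bucket paths, column names, and the 1% null threshold are hypothetical placeholders for illustration, not details taken from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ingest-orders").getOrCreate()

# Extract: read raw JSON landed in S3 (path is a hypothetical placeholder).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Transform: normalize types, derive a partition column, drop duplicates.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# Validate: automated quality checks before loading downstream.
total = orders.count()
null_ids = orders.filter(F.col("order_id").isNull()).count()
if total == 0 or null_ids / total > 0.01:  # hypothetical 1% threshold
    raise ValueError(f"Quality gate failed: {null_ids}/{total} rows with null order_id")

# Load: write validated data as partitioned Parquet (path is a placeholder).
orders.write.mode("append").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)
```

In practice a job like this would typically be packaged as an AWS Glue job or EMR step and orchestrated with Step Functions, with the quality-gate failure surfacing through the monitoring and alerting described above.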
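For the API responsibilities, here is an equally minimal sketch of a versioned, read-only data-service endpoint. FastAPI is used purely as an example framework (the posting does not name one), and the dataset-metadata route and in-memory store are hypothetical.

```python
from fastapi import FastAPI, HTTPException

app = FastAPI(title="data-service", version="1.0.0")

# Hypothetical in-memory stand-in for a real DynamoDB/Aurora lookup.
_DATASETS = {"orders": {"rows": 1_000_000, "last_updated": "2024-01-01"}}

@app.get("/v1/datasets/{name}")
def get_dataset(name: str) -> dict:
    """Return ingestion metadata for one dataset, or 404 if it is unknown."""
    if name not in _DATASETS:
        raise HTTPException(status_code=404, detail="dataset not found")
    return {"name": name, **_DATASETS[name]}
```

Versioning the route prefix (/v1) and returning explicit 404s are small examples of the API standards for security, versioning, and documentation that this role is expected to establish.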
Required Qualifications
- Experience & Technical Skills:
- Professional Background: At least 5 years of relevant experience in data
engineering with a strong emphasis on analytical platform development.
- Programming Skills: Proficiency in Python and/or PySpark, plus SQL, for
developing ETL processes and handling large-scale data manipulation.
- AWS Expertise: Extensive experience using AWS services including AWS Glue,
Lambda, Step Functions, and S3 to build and manage data ingestion frameworks.
- Data Platforms: Familiarity with big data systems (e.g., AWS EMR, Apache
Spark, Apache Iceberg) and databases like DynamoDB, Aurora, Postgres, or
Redshift.
- API Development: Proven experience in designing and implementing RESTful
APIs and integrating them with external and internal systems.
- CI/CD & Agile: Hands-on experience with CI/CD pipelines (preferably with
GitLab) and Agile development methodologies.
- Soft Skills:
- Strong problem-solving abilities and attention to detail.
- Excellent communication and interpersonal skills with the ability to work
independently and collaboratively.
- Capacity to quickly learn and adapt to new technologies and evolving business
requirements.
Preferred Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- Experience with additional AWS services such as Kinesis, Firehose, and SQS.
- Familiarity with data lakehouse architectures and modern data quality frameworks.
- Prior experience in a role that required proactive data quality management and
API-driven integrations in complex, multi-cluster environments.
- Willingness to adhere to Information Security Management policies and procedures.