Job Title: Python Data Engineer – AWS
Job Location: Remote
Job Type: Full-time
Client: Direct

Description
We are seeking a highly skilled Python Data Engineer with deep expertise in AWS-based data solutions. This role is responsible for designing, building, and optimizing large-scale data pipelines and frameworks that power analytics and machine learning workloads. You'll lead the modernization of legacy systems by migrating workloads from platforms like Teradata to AWS-native big data environments such as EMR, Glue, and Redshift. Strong emphasis is placed on reusability, automation, observability, and performance optimization.

Key Responsibilities
- Migration & Modernization: Build reusable accelerators and frameworks to migrate data from legacy platforms (e.g., Teradata) to AWS-native architectures such as EMR and Redshift.
- Data Pipeline Development: Design and implement robust ETL/ELT pipelines using Python, PySpark, and SQL on AWS big data platforms (see the pipeline sketch at the end of this posting).
- Code Quality & Testing: Drive development standards with test-driven development, unit testing, and automated validation of data pipelines (see the test sketch at the end of this posting).
- Monitoring & Observability: Build operational tooling and dashboards for pipeline observability, including metrics tracking (latency, throughput, data quality, cost).
- Cloud-Native Engineering: Architect scalable, secure data workflows using AWS services such as Glue, Lambda, Step Functions, S3, and Athena.
- Collaboration: Partner with internal product teams, data scientists, and external stakeholders to clarify requirements and drive solutions aligned with business goals.
- Architecture & Integration: Work with enterprise architects to evolve the data architecture while securely integrating AWS systems with on-premise or hybrid environments.
- ML Support & Experimentation: Enable data scientists to operationalize machine learning models by providing clean, well-governed datasets at scale.
- Documentation & Enablement: Document solutions thoroughly and provide technical guidance and knowledge sharing to internal engineering teams.

Qualifications
Experience:
- 4+ years in technology roles, with experience in data engineering, software development, and distributed systems.

Programming:
- Expert in Python and PySpark (Scala is a plus).
- Deep understanding of software engineering best practices.

AWS Expertise:
- 4+ years of hands-on experience in the AWS data ecosystem.
- Proficient with AWS Glue, S3, Redshift, EMR, Athena, Step Functions, and Lambda.
- Experience with AWS Lake Formation and data cataloging tools is a plus.
- AWS Data Analytics or Solutions Architect certification is a strong plus.

Big Data & MPP Systems:
- Strong grasp of distributed data processing.
- Experience with MPP data warehouses such as Redshift, Snowflake, or Databricks on AWS.

DevOps & Tooling:
- Experience with version control (GitHub/CodeCommit) and CI/CD tools (CodePipeline, Jenkins, etc.).
- Familiarity with containerization and deployment on Kubernetes or ECS.

Data Quality & Governance:
- Experience with data profiling, data lineage, and related tooling.
- Understanding of metadata management and data security best practices.

Bonus:
- Experience supporting machine learning or data science workflows.
- Familiarity with BI tools such as QuickSight, Power BI, or Tableau.
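To ground the pipeline responsibility described above, here is a minimal sketch of the kind of PySpark ETL job the role implies: read raw data from S3, clean it, and write partitioned Parquet back for downstream Athena or Redshift Spectrum queries. The bucket, prefixes, and column names (event_id, event_ts) are hypothetical placeholders, not details from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-etl").getOrCreate()

# Extract: raw CSV files landed in S3 (hypothetical bucket and prefix).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/events/")

# Transform: drop rows missing the key, parse the timestamp,
# and derive a date column to partition on.
clean = (
    raw.dropna(subset=["event_id"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write partitioned Parquet for downstream query engines.
(
    clean.write.mode("overwrite")
         .partitionBy("event_date")
         .parquet("s3://example-bucket/curated/events/")
)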
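And a hedged sketch of the unit-testing responsibility, using pytest against a local SparkSession; the transform under test and its columns are the same hypothetical ones as in the pipeline sketch above.

import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="module")
def spark():
    # Local single-threaded Spark is enough for fast pipeline unit tests.
    return SparkSession.builder.master("local[1]").appName("etl-tests").getOrCreate()

def test_dropna_removes_rows_without_event_id(spark):
    df = spark.createDataFrame(
        [("e1", "2024-01-01"), (None, "2024-01-02")],
        ["event_id", "event_ts"],
    )
    cleaned = df.dropna(subset=["event_id"])
    # Only the row with a non-null event_id should survive.
    assert cleaned.count() == 1
    assert cleaned.first()["event_id"] == "e1"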