Guddge Tech

1 job opening at Guddge Tech
AWS Data Engineer with Terraform (Immediate Opening)
Location: Mumbai (Remote)
Experience: 7 - 12 years
Salary: INR 25.0 - 30.0 Lacs P.A.
Employment Type: Full Time

Please apply only if you can join immediately and have 7+ years of AWS data engineering experience with Terraform and Git.

Job Description:

We are seeking a skilled Data Engineer with 7+ years of experience in data processing, ETL pipelines, and cloud-based data solutions. The ideal candidate will have strong expertise in AWS Glue, Redshift, S3, EMR, and Lambda, with hands-on experience using Python and PySpark for large-scale data transformations. The candidate will be responsible for designing, building, and maintaining scalable data pipelines and systems to support analytics and data-driven decision-making. The role also requires strong expertise in Terraform and Git-based CI/CD pipelines to support infrastructure automation and configuration management.

Key Responsibilities:

ETL Development & Automation:
- Design and implement ETL pipelines using AWS Glue and PySpark to transform raw data into consumable formats (illustrated in the first sketch at the end of this description).
- Automate data processing workflows using AWS Lambda and Step Functions (see the second sketch at the end of this description).

Data Integration & Storage:
- Integrate and ingest data from various sources into Amazon S3 and Redshift.
- Optimize Redshift for query performance and cost efficiency.

Data Processing & Analytics:
- Use AWS EMR and PySpark for large-scale data processing and complex transformations.
- Build and manage data lakes on Amazon S3 for analytics use cases.

Monitoring & Optimization:
- Monitor and troubleshoot data pipelines to ensure high availability and performance.
- Implement best practices for cost optimization and performance tuning in Redshift, Glue, and EMR.

Terraform & Git-based Workflows:
- Design and implement Terraform modules to provision cloud infrastructure across AWS, Azure, and GCP.
- Manage and optimize CI/CD pipelines using Git-based workflows (e.g., GitHub Actions, GitLab CI, Jenkins, Azure DevOps).
- Collaborate with developers and cloud architects to automate infrastructure provisioning and deployments.
- Write reusable, scalable Terraform modules that follow best practices and code-quality standards.
- Maintain version control, branching strategies, and code-promotion processes in Git.

Collaboration:
- Work closely with stakeholders to understand requirements and deliver solutions.
- Document data workflows, designs, and processes for future reference.

Must-Have Skills:
- Strong proficiency in Python and PySpark for data engineering tasks.
- Hands-on experience with AWS Glue, Redshift, S3, and EMR.
- Expertise in building, deploying, and optimizing data pipelines and workflows.
- Solid understanding of SQL and database optimization techniques.
- Strong hands-on experience with Terraform, including writing and managing modules, state files, and workspaces.
- Proficiency in CI/CD pipeline design and maintenance using tools such as GitHub Actions, GitLab CI, Jenkins, or Azure DevOps Pipelines.
- Deep understanding of Git workflows (e.g., GitFlow, trunk-based development).
- Experience with serverless architecture using AWS Lambda for automation and orchestration.
- Knowledge of data modeling, partitioning, and schema design for data lakes and warehouses.
- Ability to work 8 PM IST to 4 AM IST (night shift, to align with the customer's business hours).
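To give a concrete sense of the Glue/PySpark work described above, here is a minimal sketch of a Glue ETL job. The S3 paths, column names, and schema are hypothetical illustrations, not details from this posting, and the script only runs inside an AWS Glue job environment where the awsglue libraries are available.

```python
# Minimal Glue-style PySpark ETL sketch: read raw JSON from S3, cleanse it,
# and write partitioned Parquet for analytics. All paths and columns are
# hypothetical; a real job would use the actual schema and Data Catalog.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw events from the landing zone (hypothetical bucket/path).
raw = spark.read.json("s3://raw-events-bucket/events/")

# Basic cleansing: deduplicate, drop malformed rows, derive a date column.
curated = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Write partitioned Parquet to the curated zone (hypothetical path);
# partitioning by date supports the data-lake layout mentioned above.
curated.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://curated-bucket/events/"
)

job.commit()
```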
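And for the Lambda/Step Functions automation mentioned in the responsibilities, a minimal handler sketch follows. The S3 put-event trigger, the STATE_MACHINE_ARN environment variable, and the payload shape are assumptions for illustration only.

```python
# Minimal Lambda handler sketch: on a new S3 object, start a Step Functions
# state machine that runs the downstream ETL. Trigger shape and the
# STATE_MACHINE_ARN env var are hypothetical, not from this posting.
import json
import os

import boto3

sfn = boto3.client("stepfunctions")

def handler(event, context):
    # Extract bucket/key from the first S3 put-event record (assumed trigger).
    record = event["Records"][0]["s3"]
    payload = {
        "bucket": record["bucket"]["name"],
        "key": record["object"]["key"],
    }
    # Start the workflow; Step Functions handles retries and orchestration.
    response = sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],
        input=json.dumps(payload),
    )
    return {"executionArn": response["executionArn"]}
```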