Experience: 6 - 12 years
Salary: 2 - 11 Lacs
Location: Hyderabad / Secunderabad, Telangana, India
Posted: 4 days ago
On-site | Full Time
Responsibilities:
- Develop and implement efficient data pipelines using Apache Spark (PySpark preferred) to process and analyze large-scale data.
- Design, build, and optimize complex SQL queries to extract, transform, and load (ETL) data from multiple sources.
- Orchestrate data workflows using Apache Airflow, ensuring smooth execution and error-free pipelines.
- Design, implement, and maintain scalable, cost-effective data storage and processing solutions on AWS using S3, Glue, EMR, and Athena.
- Leverage AWS Lambda and Step Functions for serverless compute and task orchestration in data pipelines.
- Work with AWS databases such as RDS and DynamoDB to ensure efficient data storage and retrieval.
- Monitor data processing and pipeline health using AWS CloudWatch and ensure smooth operation in production environments.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
- Perform performance tuning, optimize distributed data processing tasks, and handle scalability issues.
- Troubleshoot and support data pipeline failures, ensuring high availability and reliability.
- Contribute to the setup and maintenance of CI/CD pipelines for automated deployment and testing of data workflows.

Required Skills & Experience:
- Experience: Minimum of 6+ years of hands-on experience in data engineering or big data development roles, with a focus on designing and building data pipelines and processing systems.
- Technical Skills:
  - Strong programming skills in Python with hands-on experience in Apache Spark (PySpark preferred).
  - Proficient in writing and optimizing complex SQL queries for data extraction, transformation, and loading.
  - Hands-on experience with Apache Airflow for orchestration of data workflows and pipeline management.
- In-depth understanding and practical experience with AWS services:
  - Data Storage & Processing: S3, Glue, EMR, Athena
  - Compute & Execution: Lambda, Step Functions
  - Databases: RDS, DynamoDB
  - Monitoring: CloudWatch
- Experience with distributed data processing, parallel computing, and performance tuning techniques.
- Strong analytical and problem-solving skills to troubleshoot and optimize data workflows and pipelines.
- Familiarity with CI/CD pipelines and DevOps practices for continuous integration and automated deployments is a plus.

Preferred Qualifications:
- Familiarity with other cloud platforms (Azure, Google Cloud) and their data engineering services.
- Experience handling unstructured and semi-structured data and working with data lakes.
- Knowledge of containerization technologies such as Docker or orchestration systems like Kubernetes.
- Experience with NoSQL databases or data warehouses like Redshift or BigQuery is a plus.

Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- Experience: Minimum of 6+ years in a data engineering role with strong expertise in AWS and big data processing frameworks.
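To illustrate the core extract-transform-load pattern this role revolves around, here is a minimal, self-contained Python sketch. It uses the standard library's sqlite3 as a stand-in for the real stack (PySpark for transforms, S3/Glue/Athena/RDS as sources and sinks), and all table and column names are hypothetical, not taken from the posting.

```python
# Hypothetical ETL sketch: sqlite3 stands in for the warehouse/Spark layer.
import sqlite3

def run_etl(conn: sqlite3.Connection) -> list[tuple]:
    cur = conn.cursor()
    # Extract: a raw events table, as data might land in a staging area.
    cur.execute("CREATE TABLE raw_events (user_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5), (2, None)],  # None models a bad record
    )
    # Transform: filter invalid rows and aggregate per user -- the kind of
    # logic the posting describes writing in optimized SQL or PySpark.
    cur.execute("""
        CREATE TABLE user_totals AS
        SELECT user_id, SUM(amount) AS total
        FROM raw_events
        WHERE amount IS NOT NULL
        GROUP BY user_id
    """)
    # Load: read back the curated table, as a downstream consumer would.
    return cur.execute(
        "SELECT user_id, total FROM user_totals ORDER BY user_id"
    ).fetchall()

if __name__ == "__main__":
    with sqlite3.connect(":memory:") as conn:
        print(run_etl(conn))  # [(1, 15.0), (2, 7.5)]
```

In a production pipeline, each stage would typically be a separate Airflow task (extract, transform, load), so failures can be retried per stage rather than rerunning the whole flow.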
Virtusa