Posted: 16 hours ago
On-site
Full Time
Responsibilities:
- Design, develop, and optimize scalable batch data pipelines using Java and Apache Spark to handle large volumes of structured and semi-structured data.
- Use Apache Iceberg to manage data lakehouse environments, including advanced features such as schema evolution and time travel for data versioning and auditing (a sketch of these features follows this section).
- Build and maintain reliable data ingestion and transformation workflows using AWS Glue, EMR, and Lambda to ensure seamless data flow and integration.
- Integrate with Snowflake as the cloud data warehouse to enable efficient data storage, querying, and analytics workloads (see the JDBC sketch after this section).
- Collaborate closely with DevOps and infrastructure teams to automate deployment, testing, and monitoring of data workflows using CI/CD tools such as Jenkins.
- Develop and manage CI/CD pipelines for Spark/Java applications, ensuring automated testing and smooth releases in a cloud environment.
- Monitor and continuously optimize the performance, reliability, and cost-efficiency of data pipelines running on cloud-native platforms.
- Implement and enforce data security, compliance, and governance policies in line with organizational standards.
- Troubleshoot and resolve complex issues related to distributed data processing and integration.
- Work collaboratively within Agile teams to deliver high-quality data engineering solutions aligned with business requirements.

Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Strong proficiency in Java programming with a solid understanding of object-oriented design principles.
- Proven experience designing and building ETL/ELT pipelines and frameworks.
- Excellent command of SQL and familiarity with relational database management systems.
- Hands-on experience with big data technologies such as Apache Spark, Hadoop, and Kafka, or equivalent streaming and batch processing frameworks.
- Knowledge of cloud data platforms, preferably AWS services (Glue, EMR, Lambda) and Snowflake.
- Experience with data modeling, schema design, and data warehousing concepts.
- Understanding of distributed computing, parallel processing, and performance tuning in big data environments.
- Strong analytical, problem-solving, and debugging skills.
- Excellent communication and teamwork skills, with experience working in Agile environments.

Preferred Qualifications:
- Experience with containerization and orchestration technologies such as Docker and Kubernetes.
- Familiarity with workflow orchestration tools like Apache Airflow.
- Basic scripting skills in languages such as Python or Bash for automation tasks.
- Exposure to DevOps best practices and building robust CI/CD pipelines.
- Prior experience managing data security, governance, and compliance in cloud environments.
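The Iceberg responsibilities above mention schema evolution and time travel. A minimal sketch of what those look like from Spark's Java API, assuming an Iceberg catalog named "lake", a hypothetical sales.orders table, and the iceberg-spark-runtime jar on the classpath (all names and paths here are illustrative, not from the posting):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IcebergTimeTravelSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("iceberg-time-travel-sketch")
                // Assumed config: an Iceberg catalog named "lake" backed by a
                // hypothetical warehouse path; real deployments would point this
                // at their own catalog and storage.
                .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
                .config("spark.sql.catalog.lake.type", "hadoop")
                .config("spark.sql.catalog.lake.warehouse", "s3://example-bucket/warehouse")
                .getOrCreate();

        // Schema evolution: add a column without rewriting existing data files.
        spark.sql("ALTER TABLE lake.sales.orders ADD COLUMNS (discount_pct double)");

        // Current state of the (hypothetical) orders table.
        Dataset<Row> current = spark.read().table("lake.sales.orders");

        // Time travel: read the table as of an earlier point in time for auditing.
        // "as-of-timestamp" takes epoch milliseconds; the value is illustrative.
        Dataset<Row> earlier = spark.read()
                .option("as-of-timestamp", "1700000000000")
                .table("lake.sales.orders");

        System.out.println("rows now: " + current.count()
                + ", rows then: " + earlier.count());
        spark.stop();
    }
}
```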
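On the Snowflake side, the simplest integration path from a Java pipeline is the Snowflake JDBC driver. A minimal sketch, assuming the driver is on the classpath and using placeholder account, warehouse, database, and table names (none of which come from the posting):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class SnowflakeQuerySketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Credentials pulled from the environment; names are placeholders.
        props.put("user", System.getenv("SNOWFLAKE_USER"));
        props.put("password", System.getenv("SNOWFLAKE_PASSWORD"));
        props.put("warehouse", "ANALYTICS_WH"); // hypothetical warehouse
        props.put("db", "SALES");               // hypothetical database
        props.put("schema", "PUBLIC");

        // Account identifier in the URL is a placeholder.
        String url = "jdbc:snowflake://example_account.snowflakecomputing.com/";
        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM orders")) {
            if (rs.next()) {
                System.out.println("order count: " + rs.getLong(1));
            }
        }
    }
}
```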
Manuh Technologies
Gurugram
20.0 - 35.0 Lacs P.A.