Posted:1 month ago|
Platform:
Work from Office
Full Time
A Data Engineer specializing in enterprise data platforms, experienced in building, managing, and optimizing data pipelines for large-scale environments. Having expertise in big data technologies, distributed computing, data ingestion, and transformation frameworks. Proficient in Apache Spark, PySpark, Kafka, and Iceberg tables, and understand how to design and implement scalable, high-performance data processing solutions.What you’ll doAs a Data Engineer – Data Platform Services, responsibilities include: Data Ingestion & Processing Designing and developing data pipelines to migrate workloads from IIAS to Cloudera Data Lake. Implementing streaming and batch data ingestion frameworks using Kafka, Apache Spark (PySpark). Working with IBM CDC and Universal Data Mover to manage data replication and movement. Big Data & Data Lakehouse Management Implementing Apache Iceberg tables for efficient data storage and retrieval. Managing distributed data processing with Cloudera Data Platform (CDP). Ensuring data lineage, cataloging, and governance for compliance with Bank/regulatory policies. Optimization & Performance Tuning Optimizing Spark and PySpark jobs for performance and scalability. Implementing data partitioning, indexing, and caching to enhance query performance. Monitoring and troubleshooting pipeline failures and performance bottlenecks. Security & Compliance Ensuring secure data access, encryption, and masking using Thales CipherTrust. Implementing role-based access controls (RBAC) and data governance policies. Supporting metadata management and data quality initiatives. Collaboration & Automation Working closely with Data Scientists, Analysts, and DevOps teams to integrate data solutions. Automating data workflows using Airflow and implementing CI/CD pipelines with GitLab and Sonatype Nexus. Supporting Denodo-based data virtualization for seamless data access. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise 6-10 years of experience in big data engineering, data processing, and distributed computing. Proficiency in Apache Spark, PySpark, Kafka, Iceberg, and Cloudera Data Platform (CDP). Strong programming skills in Python, Scala, and SQL. Experience with data pipeline orchestration tools (Apache Airflow, Stonebranch UDM). Knowledge of data security, encryption, and compliance frameworks. Experience working with metadata management and data quality solutions. Preferred technical and professional experience Experience with data migration projects in the banking/financial sector. Knowledge of graph databases (DGraph Enterprise) and data virtualization (Denodo). Exposure to cloud-based data platforms (AWS, Azure, GCP). Familiarity with MLOps integration for AI-driven data processing. Certifications in Cloudera Data Engineering, IBM Data Engineering, or AWS Data Analytics. Architectural review and recommendations on the migration/transformation solutions. Experience working with Banking Data model. “Meghdoot” Cloud platform knowledge.
IBM
Upload Resume
Drag or click to upload
Your data is secure with us, protected by advanced encryption.
My Connections IBM
Hyderabad, Telangana, India
Salary: Not disclosed
Hyderabad, Telangana, India
Experience: Not specified
Salary: Not disclosed
Hyderabad, Telangana, India
Salary: Not disclosed
Mumbai, Maharashtra, India
Experience: Not specified
Salary: Not disclosed
Mumbai, Maharashtra, India
Experience: Not specified
Salary: Not disclosed
Mumbai, Maharashtra, India
Experience: Not specified
Salary: Not disclosed
Mumbai, Maharashtra, India
Experience: Not specified
Salary: Not disclosed
Pune, Maharashtra, India
Experience: Not specified
Salary: Not disclosed
Mumbai
14.0 - 17.0 Lacs P.A.
Navi Mumbai
14.0 - 17.0 Lacs P.A.