Data Engineer

5 - 9 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Data Engineer at our company, you will be responsible for designing ingestion pipelines, optimizing query performance, and ensuring data quality, governance, and cost efficiency at scale. Your key responsibilities will include:

- **Migration Strategy & Execution**
  - Design and implement data ingestion pipelines to extract data from Oracle into GCS/Iceberg.
  - Migrate and modernize existing Oracle schemas, partitions, and materialized views into Iceberg tables.
  - Define Change Data Capture (CDC) strategies using custom ETL.
- **Data Lakehouse Architecture**
  - Configure and optimize Trino clusters (coordinator/worker, Helm charts, autoscaling).
  - Design partitioning, compaction, and clustering strategies for Iceberg tables.
  - Implement schema evolution, time-travel, and versioning capabilities.
- **Performance & Cost Optimization**
  - Benchmark Trino query performance against Oracle workloads.
  - Tune Trino/Iceberg for large-scale analytical queries, minimizing query latency and storage costs.
- **Data Quality, Metadata & Governance**
  - Integrate Iceberg datasets with metadata/catalog services (PostgreSQL/Hive Metastore, or Glue).
  - Ensure compliance with governance, observability, and lineage requirements.
  - Define and enforce standards for unit testing, regression testing, and data validation.
- **Collaboration & Delivery**
  - Support existing reporting workloads (regulatory reporting, DWH) during and after migration.
  - Document architecture and migration steps, and provide knowledge transfer.

**Qualification Required**

- **Required Skills & Experience**
  - **Core Expertise:**
    - Strong hands-on experience with Trino/Presto, Apache Iceberg, and Oracle SQL/PLSQL.
    - Proven experience with data lakehouse migrations at scale (50 TB+).
    - Proficiency in Parquet formats.
  - **Programming & Tools:**
    - Solid coding skills in Java, Scala, or Python for ETL/ELT pipeline development.
    - Experience with orchestration (Spark).
    - Familiarity with CDC tools, JDBC connectors, or custom ingestion frameworks.
  - **Cloud & DevOps:**
    - Strong background in GCP (preferred) or AWS/Azure cloud ecosystems.
    - Experience with Kubernetes, Docker, and Helm charts for deploying Trino workers.
    - Knowledge of CI/CD pipelines and observability tools.
  - **Soft Skills:**
    - Strong problem-solving mindset with the ability to manage dependencies and shifting scopes.
    - Clear documentation and stakeholder communication skills.
    - Ability to work within tight delivery timelines with global teams.

**About the Company**

Regnology is a leading international provider of innovative regulatory, risk, and supervisory technology solutions. Formerly part of the BearingPoint group, Regnology now operates independently and serves more than 7,000 financial services firms worldwide. With over 770 employees across 17 office locations, Regnology focuses on regulatory value chain services for financial institutions.

Feel free to apply for this exciting opportunity at [Regnology Careers](https://www.regnology.net).
