Data Engineer I [T500-21404]


Work Mode: On-site

Job Type: Full Time

Job Description

About ADM:

We are one of the world’s largest nutrition companies and a global leader in human and animal nutrition. We unlock the power of nature to provide nourishing quality of life by transforming crops into ingredients and solutions for foods, beverages, supplements, livestock, aquaculture, and pets.

About ADM India Hub:

At ADM, we have long recognized the strength and potential of India’s talent pool, which is why we have maintained a presence in the country for more than 25 years. Building on this foundation, we have now established ADM India Hub, our first GCC in India.

At ADM India Hub, we are hiring for IT and finance roles across diverse technology and business functions. We stand at the intersection of global expertise and local excellence, enabling us to drive innovation and support our larger purpose of unlocking the power of nature to enrich quality of life.


Job Title: Data Engineer I

Job Overview:

We are seeking an experienced and highly motivated Data Ingestion Engineer to join our dynamic team. The ideal candidate will have strong hands-on experience with Azure Data Factory (ADF), a deep understanding of relational and non-relational data ingestion techniques, and proficiency in Python programming. You will be responsible for designing and implementing scalable data ingestion solutions that interface with Azure Data Lake Storage Gen 2 (ADLS Gen 2), Databricks, and various other Azure ecosystem services.


Key Responsibilities:

Data Ingestion Strategy & Development:

  • Design, develop, and deploy scalable and efficient data pipelines in Azure Data Factory (ADF) to move data from multiple sources (relational, non-relational, files, APIs, etc.) into Azure Data Lake Storage Gen 2 (ADLS Gen 2), Azure SQL Database, and other target systems (a minimal sketch follows this list).
  • Implement ADF activities (copy, lookup, execute pipeline, etc.) to integrate data from on-premises and cloud-based systems.
  • Build parameterized and reusable pipeline templates in ADF to standardize the data ingestion process, ensuring maintainability and scalability of ingestion workflows.
  • Integrate custom data transformation activities within ADF pipelines, utilizing Python, Databricks, or Azure Functions when required.
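
For illustration, the sketch below shows a parameterized copy pipeline of the kind described above, deployed with the azure-mgmt-datafactory Python SDK. This is a minimal sketch, not this role's actual environment: the subscription, resource group, factory, dataset, and parameter names are all hypothetical placeholders.

# Minimal sketch: deploy and run a parameterized ADF copy pipeline.
# Subscription, resource group, factory, and dataset names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlSource, CopyActivity, DatasetReference,
    ParameterSpecification, ParquetSink, PipelineResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

copy = CopyActivity(
    name="CopySqlToAdls",
    inputs=[DatasetReference(type="DatasetReference", reference_name="ds_azure_sql_table")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="ds_adls_parquet")],
    # Pipeline parameter injected into the source query via an ADF expression.
    source=AzureSqlSource(sql_reader_query={
        "value": "SELECT * FROM @{pipeline().parameters.tableName}",
        "type": "Expression",
    }),
    sink=ParquetSink(),
)

pipeline = PipelineResource(
    parameters={"tableName": ParameterSpecification(type="String")},
    activities=[copy],
)
adf.pipelines.create_or_update("rg-data", "adf-ingestion", "pl_copy_sql_to_adls", pipeline)

# Trigger an ad-hoc run for one table; the same template is reused per source.
run = adf.pipelines.create_run("rg-data", "adf-ingestion", "pl_copy_sql_to_adls",
                               parameters={"tableName": "dbo.Orders"})
print(run.run_id)

Because the table name is a pipeline parameter, one pipeline definition serves every source table, which is the reusability goal the bullets above describe.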


ADF Data Flows Design & Development:

  • Leverage Azure Data Factory Data Flows for visually designing and orchestrating data transformation tasks, enabling complex ETL (Extract, Transform, Load) logic to process large datasets at scale.
  • Design data flow transformations such as filtering, aggregation, joins, lookups, and sorting to process and transform data before loading it into target systems like ADLS Gen 2 or Azure SQL Database.
  • Implement incremental loading strategies in Data Flows to ensure efficient and optimized data ingestion for large volumes of data while minimizing resource consumption.
  • Develop reusable data flow components to streamline transformation processes, ensuring consistency and reducing development time for new data ingestion pipelines.
  • Utilize debugging tools in Data Flows to troubleshoot, test, and optimize data transformations, ensuring accurate results and performance.
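
ADF Data Flows are authored visually, but the filter/join/aggregate logic and watermark-based incremental loading listed above can be sketched as equivalent PySpark (for example, on Databricks). In the sketch below the storage paths, column names, and watermark value are hypothetical assumptions, not part of the source posting.

# Minimal PySpark sketch of the transformations listed above, with a
# watermark-based incremental load. Paths and columns are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical watermark: highest modified_at already ingested (persisted elsewhere).
last_watermark = "2024-01-01T00:00:00"

orders = (spark.read.parquet("abfss://raw@account.dfs.core.windows.net/orders/")
          .filter(F.col("modified_at") > F.lit(last_watermark)))   # incremental slice
customers = spark.read.parquet("abfss://raw@account.dfs.core.windows.net/customers/")

result = (orders
          .filter(F.col("status").isNotNull())                     # filtering
          .join(customers, "customer_id", "left")                  # lookup / join
          .groupBy("customer_id")                                  # aggregation
          .agg(F.sum("amount").alias("total_amount"))
          .orderBy("customer_id"))                                 # sorting

result.write.mode("append").parquet(
    "abfss://curated@account.dfs.core.windows.net/orders_by_customer/")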


ADF Orchestration & Automation:

  • Use ADF triggers and scheduling to automate pipeline execution based on time or events, ensuring timely and efficient data ingestion.
  • Configure ADF monitoring and alerting capabilities to proactively track pipeline performance, handle failures, and address issues in a timely manner.
  • Implement ADF version control practices using Git to manage code changes, collaborate effectively with other team members, and ensure code integrity.
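
A schedule trigger of the kind described above can be created with the same management SDK. The sketch below is illustrative only: the trigger name, pipeline reference, and daily recurrence are assumptions chosen for the example.

# Minimal sketch: create and start a daily schedule trigger for one pipeline.
# Factory, pipeline, and trigger names are placeholders.
from datetime import datetime, timezone
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, ScheduleTrigger, ScheduleTriggerRecurrence,
    TriggerPipelineReference, TriggerResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

trigger = ScheduleTrigger(
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(type="PipelineReference",
                                             reference_name="pl_copy_sql_to_adls"),
        parameters={"tableName": "dbo.Orders"},
    )],
    recurrence=ScheduleTriggerRecurrence(
        frequency="Day", interval=1,
        start_time=datetime(2024, 1, 1, tzinfo=timezone.utc), time_zone="UTC",
    ),
)
adf.triggers.create_or_update("rg-data", "adf-ingestion", "tr_daily_orders",
                              TriggerResource(properties=trigger))
adf.triggers.begin_start("rg-data", "adf-ingestion", "tr_daily_orders").result()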


Data Integration with Various Sources:

  • Ingest data from diverse sources such as on-premises SQL Servers, REST APIs, cloud databases (e.g., Azure SQL Database, Cosmos DB), file-based systems (CSV, Parquet, JSON), and third-party services using ADF.
  • Design and implement ADF linked services to securely connect to external data sources (databases, file systems, APIs, etc.).
  • Develop and configure ADF datasets and dataflows to efficiently transform, clean, and load data into Azure Data Lake or other destinations.
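
For the API-based sources above, a small Python helper can land raw responses in ADLS Gen 2 ahead of (or inside) an ADF pipeline. This is a hedged sketch: the API endpoint, storage account, container, and file path are hypothetical.

# Minimal sketch: pull a REST API response and land it as raw JSON in ADLS Gen 2.
# The API endpoint, storage account, container, and path are placeholders.
import json
import requests
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

resp = requests.get("https://api.example.com/v1/orders", timeout=30)
resp.raise_for_status()

service = DataLakeServiceClient(
    account_url="https://account.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
file_client = (service.get_file_system_client("raw")
               .get_file_client("orders/2024/01/orders.json"))
file_client.upload_data(json.dumps(resp.json()), overwrite=True)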


Pipeline Monitoring and Optimization:

  • Continuously monitor and optimize ADF pipelines to ensure they run with high performance and minimal cost. Apply techniques like data partitioning, parallel processing, and incremental loading where appropriate.
  • Implement data quality checks within the pipelines to ensure data integrity and handle data anomalies or errors in a systematic manner.
  • Review pipeline execution logs and performance metrics regularly and apply tuning recommendations to improve execution times and reduce operational costs.
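
The routine failure review described above can itself be automated. Below is a minimal sketch that queries the last 24 hours of pipeline runs and surfaces failures; the resource group and factory names are placeholders.

# Minimal sketch: list failed pipeline runs from the last 24 hours.
from datetime import datetime, timedelta, timezone
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
now = datetime.now(timezone.utc)
runs = adf.pipeline_runs.query_by_factory(
    "rg-data", "adf-ingestion",
    RunFilterParameters(last_updated_after=now - timedelta(days=1),
                        last_updated_before=now),
)
for run in runs.value:
    if run.status == "Failed":
        print(run.pipeline_name, run.run_id, run.message)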


Collaboration and Communication:

  • Work closely with business and technical stakeholders to capture and translate data ingestion requirements into ADF pipeline designs.
  • Provide ADF-specific technical expertise to both internal and external teams, guiding them in the use of ADF for efficient and cost-effective data pipelines.
  • Document ADF pipeline designs, error handling strategies, and best practices to ensure the team can maintain and scale the solutions.
  • Conduct training sessions or knowledge transfer with junior engineers or other team members on ADF best practices and architecture.


Security and Compliance:

  • Ensure all data ingestion solutions built in ADF follow security and compliance guidelines, including encryption at rest and in transit, data masking, and identity and access management.
  • Implement role-based access control (RBAC) and managed identities within ADF to manage access securely and reduce the risk of unauthorized access to sensitive data.
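
In practice, the managed-identity guidance above means pipeline-adjacent code authenticates without embedded secrets. Below is a minimal sketch, assuming a managed identity that has been granted an appropriate storage RBAC role (e.g., Storage Blob Data Reader); the account name is a placeholder.

# Minimal sketch: credential-based access in place of account keys or
# connection strings. DefaultAzureCredential resolves to the managed identity
# when running inside Azure; RBAC on the storage account governs what it can do.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://account.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)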


Integration with Azure Ecosystem:

  • Leverage other Azure services, such as Azure Logic Apps, Azure Function Apps, and Azure Databricks, to augment the capabilities of ADF pipelines, enabling more advanced data processing, event-driven workflows, and custom transformations.
  • Incorporate Azure Key Vault to securely store and manage sensitive data (e.g., connection strings, credentials) used in ADF pipelines.
  • Integrate ADF with Azure Data Lake Analytics, Synapse Analytics, or other data warehousing solutions for advanced querying and analytics after ingestion.
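
For the Key Vault integration mentioned above, secrets can also be fetched directly from Python-based activities rather than hard-coded in pipeline definitions. The vault and secret names in this sketch are hypothetical.

# Minimal sketch: read a connection string from Azure Key Vault at run time.
# Vault and secret names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

vault = SecretClient(vault_url="https://kv-ingestion.vault.azure.net",
                     credential=DefaultAzureCredential())
sql_connection_string = vault.get_secret("sql-connection-string").value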


Best Practices & Continuous Improvement:

  • Develop and enforce best practices for building and maintaining ADF pipelines and data flows, ensuring the solutions are modular, reusable, and follow coding standards.
  • Identify opportunities for pipeline automation to reduce manual intervention and improve operational efficiency.
  • Regularly review and suggest new tools or services within the Azure ecosystem to enhance ADF pipeline performance and increase the overall efficiency of data ingestion workflows.


Required Skills and Qualifications:

Experience with Azure Data Services:

  • Strong experience with Azure Data Factory (ADF) for orchestrating data pipelines.
  • Hands-on experience with ADLS Gen 2, Databricks, and various data formats (e.g., Parquet, JSON, CSV).
  • Solid understanding of Azure SQL Database, Azure Logic Apps, Azure Function Apps, and Azure Container Apps.


Programming and Scripting:

  • Proficient in Python/PySpark for data ingestion, automation, and transformation tasks.
  • Ability to write clean, reusable, and maintainable code.
  • Experience integrating with APIs.


Data Ingestion Techniques:

  • Solid understanding of relational and non-relational data models and their ingestion techniques.
  • Experience working with file-based data ingestion, API-based data ingestion, and integrating data from various third-party systems.


Problem Solving & Analytical Skills:

  • Strong troubleshooting and debugging skills to quickly resolve ingestion-related issues.
  • Ability to analyze and optimize complex data workflows for performance and efficiency.


Communication Skills:

  • Excellent communication skills to articulate complex technical solutions to non-technical stakeholders.
  • Ability to document processes and solutions clearly and concisely.


Work Ethic & Responsibility:

  • Demonstrated ability to take full responsibility for tasks, from requirements gathering to delivery.
  • Comfortable working independently and collaborating in a team environment.


Preferred Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 2 years of relevant experience.
  • Experience with version control systems like Git.
  • Experience in other Azure services such as Azure Synapse Analytics and Azure Data Share.
  • Familiarity with cloud security best practices and data privacy regulations.
