Data Engineer - AI Labs

3 - 7 years

1 - 2 Lacs

Posted: 9 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Purpose/Objective


    The Data Engineer will be responsible for designing, developing, and maintaining scalable data pipelines and infrastructure to support AI applications. The role involves implementing efficient data integration, processing, and transformation solutions using Python, PySpark, and cloud-based data engineering tools (Azure, GCP). The engineer will work closely with AI, ML, and DevOps teams to enable seamless data flow for AI/ML model training, deployment, and operations (MLOps), ensuring optimized data architecture, security, and compliance.

Key Responsibilities of Role


    Data Pipeline Development & Optimization:
    - Ensure efficient data processing by designing, implementing, and optimizing ETL/ELT data pipelines for AI and machine learning workloads.
    - Enhance data flow and transformation using Python, PySpark, and cloud-based data engineering tools (Azure Data Factory, Google Dataflow, Databricks).
    - Improve data ingestion and integration by leveraging Kafka, Pub/Sub, and other messaging queues for real-time and batch processing.
    - Ensure scalability and performance by implementing distributed computing frameworks and optimizing data storage architectures.

    Cloud Data Engineering & AI Model Enablement:
    - Improve AI data readiness by designing data lakes, data warehouses, and real-time streaming architectures on Azure and GCP.
    - Optimize AI model performance by structuring, cleaning, and transforming data to meet ML model training and inferencing needs.
    - Ensure data accessibility by implementing data governance, security policies, and access controls for AI teams.
    - Reduce AI model training time by optimizing big data storage and processing strategies.

    MLOps & AI Model Deployment Support:
    - Enable AI model lifecycle automation by implementing CI/CD pipelines for ML model deployment using MLOps best practices.
    - Ensure seamless AI model serving by integrating Docker, Kubernetes, and cloud-based AI services.
    - Improve AI/ML data versioning by using MLflow, DVC, or similar tools for data tracking and experiment logging.
    - Enhance AI observability by setting up real-time monitoring, logging, and alerting for AI/ML data pipelines.

    Data Security, Compliance & Governance:
    - Ensure compliance with data privacy regulations (e.g., GDPR, HIPAA) by implementing data encryption, masking, and anonymization techniques.
    - Strengthen data security by enforcing role-based access control (RBAC) and identity & access management (IAM) policies.
    - Ensure data integrity by implementing data validation, schema enforcement, and audit logging mechanisms.

    Cross-functional Collaboration & Continuous Improvement:
    - Collaborate with AI, DevOps, and business teams to align data infrastructure with AI and analytics needs.
    - Drive innovation in data engineering by evaluating and adopting emerging cloud, AI, and big data technologies.
    - Optimize data engineering efficiency by identifying and implementing best practices for automation, cost reduction, and performance tuning.

    Key Stakeholders - Internal:
    - AI & Data Science Teams
    - DevOps & Cloud Teams
    - Business Intelligence & Analytics Teams
    - IT Security & Compliance Teams

    Key Stakeholders - External:
    - Cloud & Data Service Providers
    - Third-party AI Model Vendors
    - Regulatory Bodies & Compliance Authorities
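To make the pipeline and data-integrity responsibilities above concrete, here is a minimal, stand-alone sketch of a batch ETL step with schema validation. It is plain Python for brevity; in the role described, the same extract-validate-transform-load pattern would typically be expressed in PySpark against a cloud data store. The record schema and all names below are illustrative, not taken from the job post.

```python
# Minimal ETL sketch: extract raw records, validate against a schema,
# transform, and load into a target store (here, an in-memory list).
# The schema and field names are hypothetical examples.

REQUIRED_FIELDS = {"user_id": int, "event": str, "amount": float}

def validate(record):
    """Schema enforcement: reject records with missing or mistyped fields."""
    return all(
        field in record and isinstance(record[field], ftype)
        for field, ftype in REQUIRED_FIELDS.items()
    )

def transform(record):
    """Example transformation: normalize the event name, round the amount."""
    return {
        "user_id": record["user_id"],
        "event": record["event"].strip().lower(),
        "amount": round(record["amount"], 2),
    }

def run_pipeline(raw_records, target):
    """Extract -> validate -> transform -> load; returns the reject count."""
    rejected = 0
    for record in raw_records:
        if not validate(record):
            rejected += 1  # in production: route to a dead-letter queue/audit log
            continue
        target.append(transform(record))
    return rejected

raw = [
    {"user_id": 1, "event": " Purchase ", "amount": 19.999},
    {"user_id": 2, "event": "refund"},  # missing "amount" -> rejected
]
loaded = []
rejects = run_pipeline(raw, loaded)
```

Rejecting (rather than silently dropping) malformed records is what makes the audit-logging and data-integrity duties above enforceable downstream.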

Technical Competencies


    - AI, Machine Learning & Data Engineering (SDIL)
    - Cloud & Edge Platform Infrastructure Management (SDIL)
    - Database Management & Optimization (SDIL)
    - ETL, Data Processing & Real-Time Analytics (SDIL)

Qualifications and Experience


    Educational Qualification: Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or related fields.

    Certifications:
    - Microsoft Azure Data Engineer Associate
    - Google Professional Data Engineer
    - AWS Certified Data Analytics – Specialty
    - Big Data & Apache Spark Certification (Cloudera, Databricks, Coursera, Udemy)
    - Certified Kubernetes Administrator (CKA) for data pipeline orchestration

    Work Experience (Range of years): 1-10 years of experience in data engineering, cloud data platforms, and AI/ML data management.
    - Expertise in data pipeline development, ETL/ELT processes, and cloud-based big data solutions.
    - Hands-on experience with Python, PySpark, SQL, and cloud-native data services.
    - Experience with AI/ML deployment, MLOps, and real-time data streaming architectures.
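The data-versioning experience mentioned above (tools like MLflow and DVC) rests on content addressing: a dataset is identified by a hash of its bytes, so any change to the data yields a new version ID while identical data always maps to the same one. A minimal sketch of that idea in plain Python follows; the helper name is illustrative, not an API from either tool.

```python
import hashlib

def dataset_version(data: bytes) -> str:
    """Fingerprint a dataset by its content: identical bytes -> identical ID.
    Mirrors the content-addressing idea behind tools like DVC; the 12-char
    truncation is only for readability in this sketch."""
    return hashlib.sha256(data).hexdigest()[:12]

v1 = dataset_version(b"user_id,amount\n1,19.99\n")
v2 = dataset_version(b"user_id,amount\n1,19.99\n2,5.00\n")  # data changed
v1_again = dataset_version(b"user_id,amount\n1,19.99\n")    # same bytes as v1
```

Because the ID is derived from content rather than a timestamp, re-running a pipeline on unchanged data produces the same version, which is what makes experiment results reproducible and cacheable.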

Adani Group

Conglomerate

Ahmedabad
