Manager

6 years

0 Lacs

Posted: 4 days ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Position: Senior MLOps Engineer

Location: Gurugram

Relevant Experience Required: 6+ years

Employment Type: Full-time

About The Role

We are seeking a Senior MLOps Engineer with deep expertise in Machine Learning Operations, Data Engineering, and Cloud-Native Deployments. This role requires building and maintaining scalable ML pipelines, ensuring robust data integration and orchestration, and enabling real-time and batch AI systems in production. The ideal candidate will be skilled in state-of-the-art MLOps tools, data clustering, big data frameworks, and DevOps best practices, ensuring high reliability, performance, and security for enterprise AI workloads.

Key Responsibilities

MLOps & Machine Learning Deployment

  • Design, implement, and maintain end-to-end ML pipelines from experimentation to production.
  • Automate model training, evaluation, versioning, deployment, and monitoring using MLOps frameworks.
  • Implement CI/CD pipelines for ML models (GitHub Actions, GitLab CI, Jenkins, ArgoCD).
  • Monitor ML systems in production for data and model drift, bias, performance degradation, and anomalies.
  • Integrate feature stores (Feast, Tecton, Vertex AI Feature Store) for standardized model inputs.

Data Engineering & Integration

  • Design and implement data ingestion pipelines for structured, semi-structured, and unstructured data.
  • Handle batch and streaming pipelines with Apache Kafka, Apache Spark, Apache Flink, Airflow, or Dagster.
  • Build ETL/ELT pipelines for data preprocessing, cleaning, and transformation.
  • Implement data clustering, partitioning, and sharding strategies for high availability and scalability.
  • Work with data warehouses (Snowflake, BigQuery, Redshift) and data lakes (Delta Lake, Lakehouse architectures).
  • Ensure data lineage, quality, governance, and compliance with modern tools (DataHub, Amundsen, Great Expectations).

Cloud & Infrastructure

  • Deploy ML workloads on AWS, Azure, or GCP using Kubernetes (K8s) and serverless computing (AWS Lambda, Google Cloud Run).
  • Manage containerized ML environments with Docker, Helm, Kubeflow, MLflow, and Metaflow.
  • Optimize for cost, latency, and scalability across distributed environments.
  • Implement infrastructure as code (IaC) with Terraform or Pulumi.

Real-Time ML & Advanced Capabilities

  • Build real-time inference pipelines with low latency using gRPC, Triton Inference Server, or Ray Serve.
  • Work on vector database integrations (Pinecone, Milvus, Weaviate, Chroma) for AI-powered semantic search.
  • Enable retrieval-augmented generation (RAG) pipelines for LLMs.
  • Optimize ML serving with GPU/TPU acceleration and ONNX/TensorRT model optimization.

Security, Monitoring & Observability

  • Implement robust access control and encryption, and maintain compliance with SOC 2, GDPR, and ISO 27001.
  • Monitor system health with Prometheus, Grafana, ELK/EFK, and OpenTelemetry.
  • Ensure zero-downtime deployments with blue-green/canary release strategies.
  • Manage audit trails and explainability for ML models.

Preferred Skills & Qualifications

Core Technical Skills

  • Programming: Python (Pandas, PySpark, FastAPI), SQL, Bash; familiarity with Go or Scala is a plus.
  • MLOps Frameworks: MLflow, Kubeflow, Metaflow, TFX, BentoML, DVC.
  • Data Engineering Tools: Apache Spark, Flink, Kafka, Airflow, Dagster, dbt.
  • Databases: PostgreSQL, MySQL, MongoDB, Cassandra, DynamoDB.
  • Vector Databases: Pinecone, Weaviate, Milvus, Chroma.
  • Visualization: Plotly Dash, Superset, Grafana.

Tech Stack

  • Orchestration: Kubernetes, Helm, Argo Workflows, Prefect.
  • Infrastructure as Code: Terraform, Pulumi, Ansible.
  • Cloud Platforms: AWS (SageMaker, S3, EKS), GCP (Vertex AI, BigQuery, GKE), Azure (ML Studio, AKS).
  • Model Optimization: ONNX, TensorRT, Hugging Face Optimum.
  • Streaming & Real-Time ML: Kafka, Flink, Ray, Redis Streams.
  • Monitoring & Logging: Prometheus, Grafana, ELK, OpenTelemetry.

EXL

Business Process Management / Analytics

New York
