Technical Lead - ML Development

3 - 5 years

0 Lacs

Posted: 1 week ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Area(s) of responsibility

What You’ll Do

  • Develop and manage efficient MLOps pipelines tailored for Large Language Models, automating the deployment and lifecycle management of models in production.
  • Deploy, scale, and monitor LLM inference services across cloud-native environments using Kubernetes, Docker, and other container orchestration frameworks.
  • Optimize LLM serving infrastructure for latency, throughput, and cost, including hardware acceleration setups with GPUs or TPUs.
  • Build and maintain CI/CD pipelines specifically for ML workflows, enabling automated validation and seamless rollouts of continuously updated language models.
  • Implement comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack) to track model performance, resource utilization, and system health.
  • Collaborate cross-functionally with ML research and data science teams to operationalize fine-tuned models, prompt engineering experiments, and multi-agent LLM workflows.
  • Handle integration of LLMs with APIs and downstream applications, ensuring reliability, security, and compliance with data governance standards.
  • Evaluate, select, and incorporate the latest model-serving frameworks and tooling (e.g., Hugging Face Inference API, NVIDIA Triton Inference Server).
  • Troubleshoot complex operational issues that affect model availability or degrade model performance, implementing fixes and preventive measures.
  • Stay up to date with emerging trends in LLM deployment, optimization techniques such as quantization and distillation, and evolving MLOps best practices.

What We’re Looking For

Experience & Skills:
  • 3 to 5 years of professional experience in Machine Learning Operations or ML Infrastructure engineering, including experience deploying and managing large-scale ML models.
  • Proven expertise in containerization and orchestration technologies such as Docker and Kubernetes, with a track record of deploying ML/LLM models in production.
  • Strong proficiency in programming with Python and scripting languages such as Bash for workflow automation.
  • Hands-on experience with cloud platforms (AWS, Google Cloud Platform, Azure), including compute resources (e.g., EC2, Google Kubernetes Engine), storage, and ML services.
  • Solid understanding of serving models using frameworks like Hugging Face Transformers or OpenAI APIs.
  • Experience building and maintaining CI/CD pipelines tuned to ML lifecycle workflows (evaluation, deployment).
  • Familiarity with performance optimization techniques such as batching, quantization, and mixed-precision inference specifically for large-scale transformer models.
  • Expertise in monitoring and logging technologies (Prometheus, Grafana, ELK Stack, Fluentd) to ensure production-grade observability.
  • Knowledge of GPU/TPU infrastructure setup, scheduling, and cost-optimization strategies.
  • Strong problem-solving skills with the ability to troubleshoot infrastructure and deployment issues swiftly and efficiently.
  • Effective communication and collaboration skills to work with cross-functional teams in a fast-paced environment.

Educational Background

  • Bachelor’s or Master’s degree from premier Indian institutes (IITs, IISc, NITs, BITS, IIITs, etc.) in:
      • Computer Science, or
      • Any Engineering discipline, or
      • Mathematics or related quantitative fields.
