Work from Office
Full Time
Title: Lead Machine Learning Engineer
Location: Vashi, Navi Mumbai
As a Lead Machine Learning Engineer, you will be the hands-on technical owner of ML systems that power large-scale data collection, extraction, enrichment, and understanding of unstructured content. You'll design, build, and operate end-to-end solutions from feature generation and training to low-latency inference and observability. These solutions will measurably improve coverage, freshness, quality, and unit cost across our data pipelines. Your toolbox spans classical ML, NLP, LLMs/GenAI, Agentic AI, Retrieval-Augmented Generation (RAG) frameworks, and Model Context Protocol (MCP). You will use these to deliver retrieval, extraction, classification, summarization, and autonomous tasking capabilities integrated cleanly into production workflows.
You'll own the architecture and implementation across AWS and GCP clouds, selecting managed services pragmatically and deploying resilient services via Docker and Kubernetes with CI/CD, autoscaling, canary/shadow releases, and tight SLIs/SLOs. You will institute MLOps best practices (experiment tracking, model and prompt registries, evaluation harnesses, data/feature drift detection, guardrails and policy enforcement, lineage and access controls) so teams can ship faster with confidence. Day to day, you'll write production-grade Python and SQL, apply GitHub Copilot to accelerate development responsibly, and partner with Product, Data, Platform/SRE, and Security to translate ambiguous problems into staged, observable deliveries.
You bring a curiosity to understand the domain by studying the applications, dataflow, and data schemas, and you use that context to design simpler, more accurate systems. It's a plus if you have familiarity with public and private equity data and related entity models, enabling smarter features, evaluation sets, and downstream integrations. As a lead IC, you mentor through design and code reviews, set technical direction, and improve reliability, security, and developer experience. You will champion cost-aware, privacy-first designs; lead deep dives to resolve complex issues; and iterate quickly to achieve measurable outcomes (precision/recall, latency, error budgets, and cost per document). This role is ideal for an engineer who thrives on shipping robust ML/LLM systems at scale and influencing cross-functional teams through exceptional technical judgment and execution.
Team Overview
You will be part of a multidisciplinary team of ML engineers and data scientists responsible for building AI & ML solutions and services as part of robust data collection pipelines handling large volumes of unstructured data. The team focuses on building scalable and reliable systems to process and categorize data that is essential for downstream data collection processing.
Outline of Duties and Responsibilities
AI & ML Data Collection Leadership: Convert business goals into a clear AI/ML roadmap for data acquisition, extraction, enrichment, and measurable outcomes.
Technical Oversight: Architect and ship scalable ML/NLP/LLM (RAG, embeddings, reranking, Agentic AI, MCP) services with high reliability and efficiency.
Peer Leadership & Development: Mentor engineers and data scientists through design/code reviews, setting technical standards and elevating craftsmanship.
NLP Technologies: Build and integrate classifiers, transformers, LLMs, and evaluators that process and categorize unstructured data at scale.
Data Pipeline Engineering: Design, operate, and optimize high-throughput collection pipelines with robust orchestration, messaging, storage, and SLAs.
Cross-functional Collaboration: Partner with Product, Data Collection Engineering, Platform/SRE, and Security to turn ambiguous needs into phased, observable deliveries.
Innovation & Continuous Improvement: Pilot and productionize advances in GenAI, Agentic AI, RAG, and MCP to improve quality, speed, and cost.
System Integrity & Security: Enforce data governance, privacy, and model transparency with least-privilege IAM, secrets management, and auditability.
Process Improvement: Apply Agile/Lean/Fast-Flow practices to reduce cycle time, raise quality, and remove toil via automation.
Cloud & Deployment: Deliver cloud-native solutions on AWS and GCP using Docker/Kubernetes, autoscaling, and progressive delivery patterns.
MLOps & Reliability: Establish experiment tracking, registries, CI/CD, drift detection, SLIs/SLOs, and runbooks for dependable operations.
Retrieval Quality & Evaluation: Implement offline/online evals (e.g., nDCG/MRR/precision@k), golden sets, and guardrails for RAG and search relevance.
Cost, Performance & Observability: Optimize latency and unit cost with caching, batching, distillation, right-sizing, and clear dashboards/alerts.
Documentation & Knowledge Sharing: Produce concise design docs, ADRs, and playbooks to ensure durable, cross-site knowledge transfer.
Experience, Skills and Qualifications
Bachelor's, Master's, or PhD in Computer Science, Mathematics, Data Science, or a related field.
5+ years of experience in the ML Engineering and Data Science field, with a focus on LLM and GenAI technologies, particularly in data collection and unstructured data processing.
1+ years of experience in a technical lead position.
Strong expertise in NLP and machine learning, with hands-on experience in classifiers, large language models (LLMs), Model Context Protocol (MCP), Agentic AI, and other advanced NLP techniques.
Extensive experience with data pipeline and messaging technologies such as Apache Kafka, Airflow, and cloud data platforms (e.g., Snowflake).
Expert-level proficiency in Python, SQL, and other relevant programming languages and tools.
Proficiency in Amazon Web Services (AWS) and Google Cloud Platform (GCP).
Strong understanding of cloud-native technologies and containerization (e.g., Kubernetes, Docker) with experience in managing these systems globally.
Demonstrated ability to solve complex technical challenges and deliver scalable solutions.
Excellent communication skills with a collaborative approach to working with global teams and stakeholders.
Experience working in fast-paced environments, particularly in industries that rely on data-intensive technologies (experience in fintech is highly desirable).
Working Conditions
This position is based in a standard office setting. Employees in this position use a PC and phone on an ongoing basis throughout the day. Limited corporate travel may be required to remote offices or other business meetings and events.
Morningstar's hybrid work environment gives you the opportunity to collaborate in-person each week, as we've found that we're at our best when we're purposely together on a regular basis. In most of our locations, our hybrid work model is four days in-office each week. A range of other benefits are also available to enhance flexibility as needs change. No matter where you are, you'll have tools and resources to engage meaningfully with your global colleagues.
Legal Entity: Morningstar India Private Ltd. (Delhi)
Morningstar