Job Title:
ML Ops Engineer – GenAI & ML Solutions
Location:
Pune / Mumbai (Hybrid Work Model)
Experience:
3–5 Years (Minimum 2 Years in AI Development)
Industry:
Credit Rating & Financial Analytics
About Us:
Join a leading global credit rating and financial analytics powerhouse where innovation meets finance! We leverage cutting-edge AI and analytics to deliver state-of-the-art solutions. As we embark on our next growth phase, we’re looking for a passionate ML Ops Engineer to propel our AI capabilities into the future — redefining how financial intelligence is powered and delivered.
Why Join Us?
- Be at the forefront of AI innovation in finance technology with exposure to next-gen GenAI techniques.
- Collaborate with dynamic global teams spanning finance, technology, and client solutions.
- Work in a hybrid setup in Pune or Mumbai, blending flexibility with the spirit of teamwork.
- Opportunity to directly influence client-facing AI solutions impacting real-world business outcomes.
- Grow your career in an environment that champions best coding practices, continuous learning, and breakthrough AI deployments on cloud platforms.
What You’ll Do:
- Develop and manage efficient MLOps pipelines tailored for Large Language Models, automating the deployment and lifecycle management of models in production.
- Deploy, scale, and monitor LLM inference services across cloud-native environments using Kubernetes, Docker, and other container orchestration frameworks.
- Optimize LLM serving infrastructure for latency, throughput, and cost, including hardware acceleration setups with GPUs or TPUs.
- Build and maintain CI/CD pipelines specifically for ML workflows, enabling automated validation and seamless rollouts of continuously updated language models.
- Implement comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack) to track model performance, resource utilization, and system health.
- Collaborate cross-functionally with ML research and data science teams to operationalize fine-tuned models, prompt engineering experiments, and multi-agent LLM workflows.
- Handle integration of LLMs with APIs and downstream applications, ensuring reliability, security, and compliance with data governance standards.
- Evaluate, select, and incorporate the latest model-serving frameworks and tooling (e.g., Hugging Face Inference API, NVIDIA Triton Inference Server).
- Troubleshoot complex operational issues impacting model availability and performance, implementing fixes and preventive measures.
- Stay up to date with emerging trends in LLM deployment, optimization techniques such as quantization and distillation, and evolving MLOps best practices.
What We’re Looking For:
Experience & Skills:
- 3 to 5 years of professional experience in Machine Learning Operations or ML Infrastructure engineering, including experience deploying and managing large-scale ML models.
- Proven expertise in containerization and orchestration technologies such as Docker and Kubernetes, with a track record of deploying ML/LLM models in production.
- Strong proficiency in programming with Python and scripting languages such as Bash for workflow automation.
- Hands-on experience with cloud platforms (AWS, Google Cloud Platform, Azure), including compute resources (e.g., EC2, Google Kubernetes Engine), storage, and ML services.
- Solid understanding of serving models using frameworks like Hugging Face Transformers or OpenAI APIs.
- Experience building and maintaining CI/CD pipelines tuned to ML lifecycle workflows (evaluation, deployment).
- Familiarity with performance optimization techniques such as batching, quantization, and mixed-precision inference specifically for large-scale transformer models.
- Expertise in monitoring and logging technologies (Prometheus, Grafana, ELK Stack, Fluentd) to ensure production-grade observability.
- Knowledge of GPU/TPU infrastructure setup, scheduling, and cost-optimization strategies.
- Strong problem-solving skills with the ability to troubleshoot infrastructure and deployment issues swiftly and efficiently.
- Effective communication and collaboration skills to work with cross-functional teams in a fast-paced environment.
Educational Background:
- Bachelor’s or Master’s degree from premier Indian institutes (IITs, IISc, NITs, BITS, IIITs, etc.) in:
- Computer Science, or
- Any Engineering discipline, or
- Mathematics or related quantitative fields.
Benefits:
- Hybrid work model combining remote flexibility and collaborative office culture in Pune or Mumbai.
- Continuous learning budget for certifications, workshops, and conferences.
- Opportunities to work on industry-leading AI research and shape the future of financial services.
Step into the Future of Financial Technology with AI – Apply Now!
If you are eager to push the boundaries of AI in financial analytics and thrive in a global, fast-paced environment, we want to hear from you!