SRE - AI/ML Support Engineer

6 years

0 Lacs

Posted: 2 months ago | Platform: LinkedIn


Work Mode

On-site

Job Type

Full Time

Job Description

Experience: 6+ years

Notice Period: 0 to 30 days


Please find the JD below:


We are hiring an SRE (Site Reliability Engineer) AI/ML Support engineer for our enterprise-grade, high-performance supercomputing platform. We help enterprises and service providers build AI inference platforms for their end users, powered by our state-of-the-art RDU (Reconfigurable Dataflow Unit) hardware architecture. Our cloud-agnostic, enterprise-grade MLOps platform abstracts infrastructure complexity and enables seamless deployment, management, and scaling of foundation model workloads at production scale. You’ll contribute to the core of our enterprise-grade AI platform, collaborating across teams to ensure our systems are performant, secure, and built to last. This is a high-impact, high-visibility role working at the intersection of AI infrastructure, enterprise software, and developer experience.

Minimum Requirements:

• Foundational ML knowledge with hands-on experience working with machine learning models, especially large language models (LLMs) and LLM APIs

• Strong programming skills in Python, including working with ML frameworks (PyTorch, Hugging Face, LangChain, etc.) as well as building scripts and automation

• Solid understanding of Generative AI concepts (such as RAG) and their applied use cases

• Exposure to Linux systems and familiarity with troubleshooting environment/setup issues

• Ability to investigate, triage, and resolve customer or internal issues related to ML workflows, APIs, and AI-based applications

• Experience with issue tracking, documentation, and collaboration platforms (e.g., ticketing systems, project tracking tools, knowledge bases)

• Proficiency with Docker for containerization and shell scripting for system automation

• Good communication and collaboration skills for working with cross-functional teams as well as external customers and stakeholders

Nice to have:

• Familiarity with multi-modal models (e.g., Llama 4 Maverick)

• Familiarity with MLOps practices (monitoring and observability) and exposure to related tools and frameworks such as OpenSearch, Prometheus, and Grafana

• Strong hands-on exposure to Linux system administration and network administration, including troubleshooting, system monitoring, and performance optimization

• Experience working with Kubernetes (on-prem deployments preferred) for managing containerized ML workloads

• Exposure to one or more public cloud platforms (AWS, GCP, Azure, etc.)

• Strong customer-facing communication skills to handle escalations, reliability concerns, and solution discussions with stakeholders and clients in a B2B environment

Ways to stand out from the crowd:

• Prior experience working with the APIs and SDKs of major LLM providers (OpenAI, Anthropic, Hugging Face, etc.)

• Demonstrated ability to resolve complex issues in production ML systems

• Knowledge of fine-tuning, prompt engineering, and optimizing LLM usage in production


Thanks

Aparna Surnis
