Principal LLM & MLOps Engineer

8 years

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

We are looking for a senior engineer who specializes in LLM systems, prompt engineering, and agentic application deployment, combined with strong MLOps and cloud platform engineering experience. You will design, deploy, and scale Generative AI models, retrieval-augmented generation (RAG) pipelines, and autonomous agent frameworks on OCI.

In this role, you’ll work closely with data scientists, platform architects, and research teams to build production-grade AI systems, including:
  • LLM fine-tuning and adaptation
  • Prompt and prompt-chain optimization
  • Multi-agent orchestration frameworks
  • Automated evaluation and guardrail systems
  • Model and data drift monitoring and continuous retraining workflows
You will be a key contributor to defining our AI platform architecture, ensuring operational scale, efficiency, security, and reliability.

Responsibilities

Your responsibilities will include:

LLM & Agentic Development:

  • Design, evaluate, and optimize prompts, prompt chains, and agent behaviors.
  • Build and deploy RAG systems, vector search pipelines, and knowledge-grounding layers.
  • Develop agent orchestration workflows using frameworks like LangChain, LlamaIndex, Guidance, or AG2.
  • Integrate LLMs with external tools, APIs, and internal business systems.

LLMOps & Platform Engineering:

  • Deploy and host open-source and proprietary LLMs on OCI (e.g., GPT, Llama, Mistral, Grok).
  • Implement automated evaluation frameworks to measure truthfulness, relevance, safety, latency, and cost.
  • Manage fine-tuning, LoRA adaptation, or embedding model selection.

Data Pipeline & Quality:

  • Build pipelines that ensure data freshness, traceability, and semantic relevance for downstream LLM tasks.
  • Use data validation frameworks (e.g., Great Expectations, Evidently) to detect drift or knowledge degradation.

Observability, Monitoring & Cost Optimization:

  • Track LLM system performance, token usage, latency, and operational anomalies.
  • Implement model guardrails, safety layers, and automated fallback behavior.

Collaboration & Mentorship:

  • Work directly with Data Science and Product teams to translate domain problems into LLM and agent architectures.
  • Mentor engineers and scientists on LLM deployment, prompt strategy, and evaluation methods.
  • Work closely with architects, product teams, data engineers, and other stakeholders to deliver end-to-end AI solutions that address business needs.

Technical Skills:

  • Strong Python engineering background.
  • Experience with LLMs, RAG pipelines, or agent frameworks (LangChain, LlamaIndex, Haystack, AG2, etc.).
  • Hands-on cloud infrastructure experience (OCI, AWS, GCP, or Azure).
  • Experience with vector databases (e.g., Chroma, Pinecone, Weaviate, Milvus, pgvector).
  • Experience with Kubernetes, Docker, and CI/CD automation.

Nice to Have:

  • Experience fine-tuning or adapting LLMs (e.g., LoRA, QLoRA, RLHF, supervised fine-tuning).
  • Prompt evaluation and automated testing frameworks (e.g., RAGAS, TruLens, DeepEval).
  • Experience deploying microservices architectures in production environments.

Qualifications:

  • 8+ years of experience in software engineering, machine learning engineering, or platform engineering, with at least 2 years focused on ML/AI systems in production.
  • Hands-on experience developing or deploying Large Language Model (LLM) systems, including prompt engineering, RAG pipelines, agent-based workflows, or LLM fine-tuning.
  • Strong proficiency in Python and experience with one or more LLM/agent frameworks (e.g., LangChain, LlamaIndex, Haystack, Guidance, AG2).
  • Experience designing and operating cloud-native ML systems on OCI, AWS, GCP, or Azure.
  • Proficiency with Kubernetes, Docker, and CI/CD pipelines for deploying and scaling services.
  • Experience with data workflow orchestration (e.g., Airflow, Prefect, Dagster) and data validation frameworks (e.g., Great Expectations, Evidently).
  • Strong understanding of vector databases (e.g., Pinecone, Weaviate, Milvus, Chroma, Postgres + pgvector).
  • Demonstrated ability to build and maintain production monitoring, alerting, and observability dashboards (e.g., Prometheus, Grafana).
  • Excellent communication and collaboration skills with the ability to mentor and lead technical discussions.
  • Bachelor’s or master’s degree in computer science, engineering, or a related field, or equivalent practical experience.

Career Level - IC4

About Us

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Oracle

Information Technology

Redwood City
