We are looking for a senior engineer who specializes in LLM systems, prompt engineering, and agentic application deployment, combined with strong MLOps and cloud platform engineering experience.
You will design, deploy, and scale Generative AI models, retrieval-augmented generation (RAG) pipelines, and autonomous agent frameworks on OCI.
In this role, you'll work closely with data scientists, platform architects, and research teams to build production-grade AI systems, including:
- LLM fine-tuning and adaptation
- Prompt and prompt-chain optimization
- Multi-agent orchestration frameworks
- Automated evaluation and guardrail systems
- Model and data drift monitoring and continuous retraining workflows
You will be a key contributor to defining our AI platform architecture, ensuring operational scale, efficiency, security, and reliability.
Your responsibilities will include:
LLM & Agentic Development:
- Design, evaluate, and optimize prompts, prompt chains, and agent behaviors.
- Build and deploy RAG systems, vector search pipelines, and knowledge-grounding layers.
- Develop agent orchestration workflows using frameworks like LangChain, LlamaIndex, Guidance, or AG2.
- Integrate LLMs with external tools, APIs, and internal business systems.
LLMOps & Platform Engineering:
- Deploy and host open-source and proprietary LLMs on OCI (e.g., GPT, Llama, Mistral, Grok).
- Implement automated evaluation frameworks to measure truthfulness, relevance, safety, latency, and cost.
- Manage fine-tuning, LoRA adaptation, or embedding model selection.
Data Pipeline & Quality:
- Build pipelines that ensure data freshness, traceability, and semantic relevance for downstream LLM tasks.
- Use data validation frameworks (e.g., Great Expectations, Evidently) to detect drift or knowledge degradation.
Observability, Monitoring & Cost Optimization:
- Track LLM system performance, token usage, latency, and operational anomalies.
- Implement model guardrails, safety layers, and automated fallback behavior.
Collaboration & Mentorship:
- Work directly with data science and product teams to translate domain problems into LLM and agent architectures.
- Mentor engineers and scientists on LLM deployment, prompt strategy, and evaluation methods.
- Work closely with architects, product teams, data engineers, and other stakeholders to deliver end-to-end AI solutions that address business needs.
Technical Skills:
- Strong Python engineering background.
- Experience with LLMs, RAG pipelines, or agent frameworks (LangChain, LlamaIndex, Haystack, AG2, etc.).
- Hands-on cloud infrastructure experience (OCI, AWS, GCP, or Azure).
- Experience with vector databases (e.g., Chroma, Pinecone, Weaviate, Milvus, pgvector).
- Experience with Kubernetes, Docker, and CI/CD automation.
Nice to Have:
- Experience fine-tuning or adapting LLMs (e.g., LoRA, QLoRA, RLHF, supervised fine-tuning).
- Experience with prompt evaluation and automated testing frameworks (e.g., RAGAS, TruLens, DeepEval).
- Experience deploying microservices architectures in production environments.
Qualifications:
- 8+ years of experience in software engineering, machine learning engineering, or platform engineering, with at least 2 years focused on ML/AI systems in production.
- Hands-on experience developing or deploying Large Language Model (LLM) systems, including prompt engineering, RAG pipelines, agent-based workflows, or LLM fine-tuning.
- Strong proficiency in Python and experience with one or more LLM/agent frameworks (e.g., LangChain, LlamaIndex, Haystack, Guidance, AG2).
- Experience designing and operating cloud-native ML systems on OCI, AWS, GCP, or Azure.
- Proficiency with Kubernetes, Docker, and CI/CD pipelines for deploying and scaling services.
- Experience with data workflow orchestration (e.g., Airflow, Prefect, Dagster) and data validation frameworks (e.g., Great Expectations, Evidently).
- Strong understanding of vector databases (e.g., Pinecone, Weaviate, Milvus, Chroma, Postgres + pgvector).
- Demonstrated ability to build and maintain production monitoring, alerting, and observability dashboards (e.g., Prometheus, Grafana).
- Excellent communication and collaboration skills with the ability to mentor and lead technical discussions.
- Bachelor's or master's degree in computer science, engineering, or a related field, or equivalent practical experience.
Career Level - IC4