Job Description
Are you excited about building cutting-edge AI systems that revolutionize how cloud infrastructure is run? At Oracle Cloud Infrastructure (OCI), we're reimagining operations for the enterprise cloud, and we're looking for ML Engineers to drive this transformation. OCI is a hyperscale cloud platform where operations are core to customer success. For OCI to scale and remain competitive, we need to automate incident detection, diagnosis, and resolution, pushing the limits of AI and GenAI in cloud operations. Join us to build intelligent AI solutions that proactively manage cloud environments, boost operational excellence, and deliver a world-class customer experience. If you're passionate about solving deep technical challenges and shaping the next generation of autonomous cloud, this is your moment.

- Lead AI Agent Development: Drive the end-to-end design and development of AI agents tailored for cloud operations, ensuring scalability, reliability, and alignment with customer needs.
- Architect and implement tools and frameworks that accelerate AI agent development, experimentation, deployment, and monitoring.
- Design and execute robust methodologies to evaluate agent performance, safety, and accuracy across diverse operational scenarios.
- Develop advanced meta-prompting techniques to dynamically adapt prompts and maximize LLM utility in varied, real-time operational contexts.
- Integrate cutting-edge LLM technologies into OCI systems, leveraging fine-tuning, retrieval-augmented generation (RAG), and other advanced techniques.
- Partner with product, UX, and engineering teams to align technical solutions with business priorities and user experience.
- Collaborate directly with enterprise customers to gather requirements, validate solutions, and ensure agent effectiveness in production environments.
- Mentor engineers, establish best practices in AI agent engineering, and contribute to the technical vision and roadmaps.
Required Qualifications:
- 5+ years of overall experience, including 2+ years in machine learning engineering, with hands-on experience in building AI agents and production-grade ML systems.
- Proven expertise in large language models (LLMs), transformers, and GenAI technologies.
- Demonstrated experience in prompt engineering and developing prompting strategies for LLM-based applications.
- Demonstrated ability to evaluate and optimize AI agent performance in real-world, mission-critical environments.
- Solid programming skills in Python and practical experience with ML frameworks such as PyTorch or TensorFlow.
- Deep understanding of MLOps practices, including model deployment, scalability, observability, and lifecycle management.
- Excellent problem-solving skills with a bias for action and execution.
- Ability to convince technical leaders and executives.
- Strong written and verbal communication skills, including direct collaboration with enterprise customers and cross-org teams.
- Bachelor's or Master's degree in Computer Science, Machine Learning, or a related technical field.

Preferred Qualifications:
- Experience building AI assistants or conversational agents for enterprise applications.
- Experience working with multi-agent systems, reinforcement learning, or autonomous decision-making frameworks.
- Familiarity with LLM fine-tuning, RAG (Retrieval-Augmented Generation), or vector database integration.
- Background in cloud operations, DevOps, or infrastructure monitoring domains.
- Familiarity with Oracle Cloud Infrastructure or other major cloud platforms.
- Contributions to open-source ML projects and/or publications or patents in ML, NLP, or AI agent systems.

Why Join Us
- Be at the forefront of designing AI agents and the infrastructure that powers them.
- Shape the future of AI-powered cloud operations at Oracle.
- Work with cutting-edge AI technologies and solve impactful real-world problems.
- Collaborate with a world-class team of engineers, researchers, and product leaders.
- Competitive compensation, benefits, and career growth opportunities.