AI Data Platform Reliability & Validation Engineer

3 years

Posted: 3 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Job Summary:

Oracle's AI Data Platform is accelerating enterprise AI and redefining how AI applications are built. The AI Data Platform team is seeking an experienced engineer to help drive AI platform reliability. This role is vital to ensuring our enterprise-scale, AI-powered data platform is robust, performant, and reliable. You will design and execute end-to-end scenario tests across distributed systems and partner with engineering and architecture teams to develop tooling that improves and maintains the platform. You will also embed operational excellence by applying modern SRE practices.

Responsibilities:

  • Design, develop, and execute end-to-end (E2E) scenario validations that simulate real-world usage of complex AI data platform workflows (data ingestion, transformation, ML pipeline orchestration, etc.).
  • Collaborate closely with product, engineering, and field teams to identify gaps in coverage and propose test automation strategies.
  • Develop and maintain automated test frameworks supporting E2E, integration, performance, and regression testing for distributed data/AI services.
  • Monitor system health across the stack (infrastructure, data pipelines, AI/ML workloads) and proactively detect failures or SLA breaches.
  • Champion SRE best practices including observability, incident management, blameless postmortems, and runbook automation.
  • Analyze logs, traces, and metrics to identify reliability, latency, and scalability issues; drive root cause analysis and corrective actions.
  • Partner with engineering to drive high-availability, fault tolerance, and continuous delivery (CI/CD) improvements.
  • Participate in on-call rotation to support critical services, ensuring rapid resolution and minimizing customer impact.

Desired Qualifications:

  • Bachelor’s or master’s degree in Computer Science, Engineering, or a related field (or demonstrated equivalent experience).
  • 3+ years’ experience in software QA/validation, SRE, or DevOps roles, ideally in data platforms, cloud, or AI/ML environments.
  • Proficiency with DevOps automation and tools for continuous integration, deployment, and monitoring (e.g., Terraform, Jenkins, GitLab CI/CD, Prometheus).
  • Working knowledge of distributed systems, data engineering pipelines, and cloud-native architectures (OCI, AWS, Azure, GCP, etc.).
  • Strong proficiency in Java, Python, and related technologies.
  • Hands-on experience with test automation frameworks (e.g., Selenium, pytest, JUnit) and scripting (Python, Bash, etc.).
  • Familiarity with SRE practices: service-level objectives and agreements (SLOs/SLAs), incident response, and observability (Prometheus, Grafana, ELK, etc.).
  • Strong troubleshooting and analytical skills with a passion for reliability engineering and process automation.
  • Excellent communication and cross-team collaboration abilities.

Qualifications

Career Level - IC2

About Us

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Oracle

Information Technology

Redwood City
