Principal Site Reliability Engineer

6 - 12 years

0 Lacs

Posted: 1 day ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis, and monitoring to provide stability, security, performance, and reliability for our infrastructure. The Site Reliability Engineer is expected to work with multiple service and product development teams, identifying cross-team issues that create operational risk across the organization and resolving them through a mixture of engineering, development, troubleshooting expertise, and general operational guidance. This role also requires excellent communication and organizational skills. The candidate is expected to collaborate with service owners, other engineers, and developers to deliver a superior support experience to the development community.

Responsibilities

  • Troubleshoot and resolve complex issues related to Linux environments and Oracle Cloud Infrastructure (OCI)
  • Design and deliver mission-critical automation using Chef and Python, with a focus on security, resiliency, scale, and performance.
  • Design, develop, and implement AI-driven solutions for business challenges.
  • Collaborate with cross-functional teams to integrate AI models into production environments.
  • Identify opportunities and drive the implementation of automation to improve service health, availability and reliability
  • Act as an escalation point for critical issues that may not have a documented procedure, and provide root cause analysis
  • Quickly grasp and analyze new technologies that are complex and rapidly changing and integrate those into automation and infrastructure support.
  • Author functional and technical documentation and standard operating procedures (SOPs)
  • Collaborate with development teams in defining and implementing improvements in service architecture.
  • Articulate technical characteristics of services and technology areas and guide cross-functional teams to engineer and add capabilities to internal tools.

Knowledge and Skills

  • 6 to 12 years of experience in Site Reliability Engineering and automation.
  • Experience in Linux administration with expertise in kernel-level debugging and performance tuning.
  • Experience in cloud technologies and infrastructure management.
  • Expertise in design, development, and implementation of AI-driven solutions for business challenges.
  • Skilled in debugging operating system performance issues
  • Expertise in working with highly available, fault-tolerant, distributed systems.
  • Expertise in developing scripts and tools to automate routine tasks, improving efficiency.
  • Experience in troubleshooting application, compute, storage, and database issues to enhance reliability and scalability.
  • Strong background in operations management and problem resolution.
  • Development experience with Python and infrastructure management using Chef
  • Proven experience managing high-availability production environments.
  • Experience working with global teams across multiple time zones.

Qualifications required

  • 6 to 12 years of experience working in an IT Operations/Infrastructure team
  • Bachelor's degree in Computer Engineering, Software Engineering, Computer Science, or a related area is preferred

Career Level - IC4

Oracle

Information Technology

Redwood City
