Principal Software Site Reliability Engineer

4 - 8 years

0 Lacs

Posted: 6 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

You will combine software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. As a Site Reliability Engineer (SRE) at Red Hat, you will ensure that services, both internally critical and externally visible, have reliability and uptime appropriate to users' needs, with a focus on capacity and performance. Your role will involve using a variety of tools and approaches to solve a broad spectrum of problems, including practices like blameless postmortems and proactive identification of potential outages for iterative improvement.

- Implement and improve the whole lifecycle of services, from inception and design to monitoring, metrics, deployment, operation, and refinement.
- Support services before they go live through activities like system design consulting, developing software platforms, capacity planning, and launch reviews.
- Maintain services post-launch by measuring and monitoring availability, latency, and overall system health.
- Collaborate with internal and external teams on release management activities and develop automated deployment and testing tools.
- Scale systems sustainably through automation mechanisms and advocate for changes that enhance reliability and velocity.
- Provide best-effort off-hours support and work as part of an Agile team to communicate status proactively and meet deliverables on schedule.
- Propose and implement continuous improvement activities, and standardize and document common DevOps procedures.
- Participate in the development of new features and bug fixes on Red Hat software services.
- Practice sustainable incident response and blameless postmortems, and drive ticket resolution for key applications and platforms.

- Experience in Linux or UNIX systems administration, supporting critical production systems in an enterprise environment.
- Knowledge of OpenShift/Kubernetes, Docker, containers, Ansible, Chef, Python, Ruby, Bash, and code deployments across on-premises and cloud environments such as AWS.
- Experience designing and deploying highly scalable and resilient applications and platforms.
- Additional experience with Java, Golang, or Ruby development; GitLab Pipelines or GitHub Actions for automation; Red Hat Certified Engineer (RHCE) certification; and content delivery networks such as Akamai; along with multitasking, excellent communication, and team collaboration skills.
- Familiarity with agile project methodologies such as Scrum or Kanban.
