Site Reliability Engineer

8 - 10 years

12 - 16 Lacs

Posted: Just now | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

System Monitoring and Incident Response: SREs are responsible for implementing monitoring solutions to track system health, performance, and availability. They proactively monitor systems, identify issues, and respond to incidents promptly, working to minimize downtime and mitigate impact.

Post-Incident Analysis: SREs lead incident response efforts, coordinate with cross-functional teams, and conduct post-incident analysis to identify root causes and implement preventive measures.

Continuous Improvement and Reliability Engineering: SREs drive continuous improvement efforts by identifying areas for enhancement, implementing best practices, and fostering a culture of reliability engineering. They participate in post-mortems, conduct blameless retrospectives, and drive initiatives to improve system reliability, stability, and maintainability.

Collaboration and Knowledge Sharing: SREs collaborate closely with software engineers, operations teams, and other stakeholders to ensure smooth coordination and effective communication. They share knowledge, provide technical guidance, and contribute to the development of a strong engineering culture.

Support and maintain configuration management for various applications and systems.
Implement comprehensive service monitoring, including dashboards, metrics, and alerts.
Define, measure, and meet key service level objectives, such as uptime, performance, incidents, and chronic problems.
Partner with application and business stakeholders to ensure high-quality product development and release.
Collaborate with the development team to enhance system reliability and performance.

Bachelor's degree in Information Technology, Computer Science, or a related field.
Strong knowledge of software development processes and procedures.
Strong problem-solving abilities.
Excellent understanding of computer systems, servers, and network systems.
Ability to work under pressure and manage multiple tasks simultaneously.
Strong communication and interpersonal skills.
Strong knowledge of coding languages like Python, Java, Go, etc.
Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, YARN).

Mandatory Key Skills: Docker, NFS, HDFS, Amazon S3, Python, Java, C, C++, Ruby, JavaScript, Apache Mesos, Kubernetes, YARN, cloud computing*, Git*, Jenkins*, Ansible*, Terraform*

Apex One

Technology Solutions

Tech City