Site Reliability Engineer (Application Focus)

10 - 15 years

15 - 30 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Remote

Job Type

Full Time

Job Description

The customer

The project

Responsibilities:

- Ensuring high availability, performance, scalability, and overall reliability of application infrastructure through proactive monitoring, automation, and continuous improvement.

- Developing and implementing performance optimization strategies, including code optimization, memory management, load testing, and capacity planning.

- Implementing and maintaining end-to-end observability, including real-time telemetry, CUJ-level metrics, dashboards, alerts, and actionable reporting.

- Monitoring Critical User Journeys (CUJs) with product and business teams to improve end-to-end user experience and service reliability.

- Managing SLIs, SLOs, SLAs, and error budgets across critical services while ensuring uptime and availability targets are consistently met.

- Implementing next-generation architectural patterns and SRE recommendations to enhance fault tolerance, resilience, and disaster recovery capabilities.

- Identifying and mitigating reliability risks, proactively addressing issues that may impact availability and minimizing service disruptions.

- Automating key operational tasks such as deployments, scaling, failover, and remediation, and reducing manual toil through tools and process improvements.

- Leading incident response efforts, participating in on-call rotations, and driving automated remediation for common failure scenarios.

- Performing root-cause analysis, conducting blameless post-mortems, and implementing corrective actions to prevent recurring incidents.

- Creating and maintaining comprehensive runbooks, operational documentation, and guidelines for incident response and system reliability.

- Collaborating with global and regional digital teams on reliability best practices, mentoring junior SREs, and contributing to the hiring and onboarding of new SRE candidates.

Must-haves:

- 10+ years of experience in application support and reliability engineering environments.

- Strong technical background with proficiency in software development principles, application production support, SDLC best practices, and Agile methodology.

- Hands-on SRE skills, including familiarity with SLOs, SLIs, error budgets, incident management, and conducting blameless post-mortems.

- Solid understanding of application architectures with the ability to analyze systems and identify areas for improvement.

- Experience working with monitoring, logging, and observability tools to track and optimize application performance.

- Proficiency in scripting and automation tools (e.g., Python, Bash, Terraform) to reduce toil and improve operational efficiency.

- Strong incident response and troubleshooting skills with the ability to perform effective root cause analysis.

- Excellent collaboration and communication skills for working with cross-functional teams and clearly explaining technical concepts.

- Ability to coach and mentor team members in SRE practices and foster a culture of reliability.

- Proactive mindset with a focus on continuous improvement to enhance application reliability and performance.

- English level of Intermediate+ or above.

Reasons why this job would be interesting to you:

- Experience working with leaders in FinTech, Healthcare, Retail, Telecom, and other industries. Andersen cooperates with businesses such as Samsung, Siemens, Johnson & Johnson, BNP Paribas, Ryanair, Mercedes, TUI, Verivox, Allianz, T-Systems, etc.

- The opportunity to change the project and/or develop expertise in an interesting business domain.

- Flexible working conditions: you can work fully remotely, from the office, or in a hybrid arrangement.

- Guaranteed professional, financial, and career growth. The company has introduced mentoring and adaptation systems for each new employee.

- The opportunity to earn up to an additional 1,000 USD per month by participating in the company's activities. The amount depends on your level of expertise and is included in the annual bonus.

- Access to the corporate training portal, which holds the company's entire knowledge base and is constantly updated.

- Vibrant corporate life (parties / pizza days / PlayStation / fruit / coffee / snacks / movies).

- Certification compensation (AWS, PMP, etc.).

- Referral program.

- English courses.

- Private health insurance and compensation for sports activities.

Join us!
