Staff Engineer - Site Reliability Engineering

7 - 12 years

9 - 14 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

About the Role

As a Staff Engineer - Site Reliability Engineering, you'll work independently, taking ownership of significant components and systems. You'll mentor junior contributors while driving technical excellence and reliability improvements across the platform.

Responsibilities

  • Kubernetes: Design and implement complex application lifecycle management, custom operators, and advanced troubleshooting
  • Infrastructure as Code: Architect comprehensive IaC solutions with advanced configurations and module development
  • Automation & Development: Design and build sophisticated automation frameworks and tools in Golang and Python
  • Component Ownership: Take full ownership of significant system components with responsibility for their reliability and performance; define SLA targets and drive achievement
  • Architecture & Design: Design components and systems with moderate complexity, evaluating tradeoffs and writing comprehensive design documents
  • Reliability Engineering: Contribute changes that significantly improve product security, quality, ease of operation, reliability, and performance; own reliability for major platform components
  • Automation Excellence: Implement automation frameworks that scale across teams; eliminate entire classes of manual work
  • Observability: Design comprehensive observability strategies; implement advanced monitoring and distributed tracing
  • Incident Management: Lead major incident response; establish incident management processes and runbooks
  • Performance Engineering: Drive system-wide performance improvements; establish performance engineering practices
  • Collaboration: Lead technical discussions with product engineering; influence architecture decisions across teams
  • Technical Leadership: Mentor junior engineers and provide technical guidance on complex problems; drive improvements to engineering processes and operational procedures

Qualifications

  • Experience: 6+ years of experience with a BS in a designated Engineering or related field
  • Advanced Technical Skills: Expert-level proficiency in Golang and Python with system design experience
  • Cloud Architecture: Extensive experience designing and implementing cloud-native solutions
  • Infrastructure as Code: Deep understanding of Terragrunt/Terraform, including advanced features and best practices
  • Kubernetes: Advanced Kubernetes knowledge including operators and Helm
  • Observability: Expert knowledge of monitoring, logging, and observability solutions
  • Leadership: Demonstrated ability to mentor others and lead technical initiatives
  • Communication: Excellent written and verbal communication skills for design documentation and presentations

Aviatrix Systems

Cloud Networking
