SRE II - Observability & Reliability

5 - 9 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Senior Software Engineer in the Site Reliability Engineering team focusing on Observability and Reliability, you will play a crucial role in ensuring the performance, stability, and availability of applications and systems. Your primary responsibility will be designing, implementing, and maintaining the observability and reliability infrastructure, with a specific emphasis on the ELK stack (Elasticsearch, Logstash, and Kibana).

Key Responsibilities:

- Collaborate with cross-functional teams to define and implement observability and reliability standards and best practices.
- Design, deploy, and maintain the ELK stack for log aggregation, monitoring, and analysis.
- Develop and maintain alerts and monitoring systems for early issue detection and rapid incident response.
- Create, customize, and maintain dashboards in Kibana for different stakeholders.
- Identify performance bottlenecks, recommend solutions, and collaborate with software development teams.
- Automate manual tasks and workflows to streamline observability and reliability processes.
- Conduct system and application performance analysis, optimization, and capacity planning, and support security practices, compliance adherence, documentation, disaster recovery, and backup.
- Generate and deliver detailed reports on system performance and reliability metrics.
- Stay updated with industry trends and best practices in observability and reliability engineering.

Qualifications/Skills/Abilities:

- Formal Education: Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- Experience: 5+ years of experience in Site Reliability Engineering, Observability and Reliability, or DevOps.
- Skills:
  - Proficiency in configuring and maintaining the ELK stack (Elasticsearch, Logstash, Kibana) is mandatory.
  - Strong scripting and automation skills, with expertise in Python, Bash, or similar languages.
  - Experience designing data structures with Elasticsearch indices and writing data ingestion pipelines with Logstash.
  - Experience with infrastructure as code (IaC) and configuration management tools (e.g., Ansible, Terraform).
  - Hands-on experience with cloud platforms (AWS preferred) and containerization technologies (e.g., Docker, Kubernetes).
  - Telecom domain expertise is good to have but not mandatory.
  - Strong problem-solving skills and the ability to troubleshoot complex issues in a production environment.
  - Excellent communication and collaboration skills.
- Accreditation/Certifications/Licenses: Relevant certifications (e.g., Elastic Certified Engineer) are a plus.
