Site Reliability Engineer

5 - 7 years

10 - 20 Lacs

Posted: 1 month ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

As a Senior Site Reliability Engineer, you will play a critical role in supporting application developers by providing expert guidance on application and infrastructure best practices from a reliability perspective. Your role covers the entire life cycle of a product/application. Your primary focus will be automation, observability, reliability, and release management with CI/CD, with an emphasis on solving operations issues.

- Must have at least 5 years of SRE experience in large programs, with a focus on release engineering, observability tasks, and reliability
- Must have a good understanding of Site Reliability Engineering (SRE) and release management processes
- Should possess strong analytical and troubleshooting skills
- Should be a strong team player who enjoys collaborating with different people and profiles, shares knowledge, and strives for continuous development and learning
- Excellent communication skills along with leadership skills

Responsibilities (includes but is not limited to)

- Improve reliability, quality, and time-to-market of our suite of products/applications
- Define suitable metrics for the system with SLOs/SLIs and set up observability mechanisms to track them
- Define the error budget as per the SLO
- Define the strategy for and set up High Availability and Load Balancer based architecture
- Drive a metrics-driven culture and software delivery process, using data to measure overall system quality and reliability
- Balance feature development speed and reliability with well-defined service level objectives
- Provide primary operational support and engineering for products/applications
- Partner with solution architects and development teams to improve service reliability
- Participate in system design, infrastructure management, and capacity planning
- Participate in optimizing code, automating operational tasks, and reducing toil
- Provide solutions for performance management, disaster recovery, monitoring, and observability
- Work with business users to understand issues, develop root cause analyses, and work with the development team on enhancements/fixes
- Work with distributed traces to visualize the entire workflow and analyze the cause of problems/incidents
- Improve security and performance of infrastructure and applications
- Support, improve, and implement infrastructure as code
- Define, evangelize, and maintain SRE best practices
- Design and implement DevSecOps best practices
- Improve automation, including systems' self-healing capability
- Manage and participate in on-call incidents (Priority Incident)

Skills

- Good experience in scripting or development languages, with expertise in any one of Python, Ruby, JSON, Java, Node.js, or PHP
- Experience with scripting in PowerShell (M) and any one of Bash/Shell/Perl
- Strong experience with one or more observability tools such as New Relic, AppDynamics, Prometheus, Dynatrace, DataDog, or Splunk
- Experience in observability dashboard creation, custom metrics, Synthetic Monitoring, and Real User Monitoring (RUM)
- Strong knowledge of microservices architecture with APIs and REST APIs
- Experience in CI/CD tooling and best practices
- Experience with cloud platforms such as AWS, Azure, and Google Cloud
- Experience in container orchestration and practices, including Kubernetes and Docker Swarm
- Experience with any one infrastructure automation tool such as Terraform, CloudFormation, Ansible, or Puppet
- Systems administration and operating system experience on Linux and Windows, including an understanding of networking
- Knowledge of SQL and NoSQL databases (Oracle, Couchbase)
- Experience working with tools such as Remedy, ServiceNow, Confluence, and Jira
- Experience with chaos engineering (good to have)
- Experience with cloud cost optimization (good to have)
- Knowledge of message broker applications such as RabbitMQ, Kafka, or ActiveMQ (good to have)

Infosys

IT Services and IT Consulting

Bangalore, Karnataka
