Site Reliability Engineer

5 - 10 years

20.0 - 25.0 Lacs P.A.

Gurgaon

Posted: 3 months ago | Platform: Naukri

Skills Required

Prometheus, Grafana, Kubernetes, Reliability Engineering, ELK, SRE, Site Reliability Engineering

Work Mode

Work from Office

Job Type

Full Time

Job Description

Title: Site Reliability Engineer - L2 Support

Experience: 5+ years

Kubernetes Expertise: Lead and manage complex Kubernetes deployments, including troubleshooting issues related to cluster operations, deployments, scaling, and performance.

Deployment Troubleshooting: Provide expert-level troubleshooting skills for all types of application and infrastructure-related issues, ensuring quick resolution with minimal downtime.

Application Troubleshooting: Work closely with development teams to identify and resolve issues in Node.js or Golang applications, including memory leaks, performance bottlenecks, and resource contention.

Monitoring and Observability: Implement and manage monitoring tools such as Dynatrace, Prometheus, Grafana, Loki, and Tempo. Design and configure dashboards, alerts, and health checks to track application and infrastructure performance.

Performance Optimization: Identify performance issues within the Kubernetes environment, application code, and databases. Propose and implement solutions for optimization and scalability.

Database Management: Troubleshoot and optimize SQL database issues, including performance, indexing, and query optimization.

Collaboration: Work alongside development and operations teams to provide technical insights and resolve issues in production and staging environments.

Documentation: Maintain proper documentation for troubleshooting procedures, deployment setups, and monitoring configurations.
