SRE Lead

1.0 - 4.0 years

3.0 - 6.0 Lacs P.A.

Bengaluru

Posted: 1 week ago | Platform: Naukri

Skills Required

Automation, Team management, Compliance, ISO, Infrastructure management, Network security, Incident management, IPS, Monitoring, Python

Work Mode

Work from Office

Job Type

Full Time

Job Description

Nexthink is looking for a Lead Site Reliability Engineer who is passionate about building and running a high-performance cloud platform and enabling best-in-class site reliability and operations practices. This role will support Nexthink operations globally. The candidate will drive the development of modern, cloud-native SRE processes and the management and operations of Nexthink's multi-tenant, microservices-based cloud platform. The platform has multiple instances deployed across the globe. This role involves working closely with cross-functional teams to integrate reliability and security into our systems, ensuring they meet standards. The ideal candidate will have extensive experience in both software engineering and systems administration, with a strong understanding of SRE concepts, requirements, and security practices.

Leadership and Team Management: Lead, mentor, and develop a team of India-based Site Reliability Engineers. Foster a culture of continuous improvement, collaboration, and innovation.

Infrastructure Management: Oversee the design, deployment, and management of scalable and secure cloud infrastructure. Drive automation of infrastructure provisioning, configuration, and management using Infrastructure as Code (IaC) tools.

Monitoring and Performance: Develop and maintain comprehensive monitoring, logging, and alerting systems to ensure high availability and performance. Lead efforts in performance tuning and optimization for applications and infrastructure.

Security and Compliance: Ensure implementation and maintenance of security controls and best practices to achieve compliance with standards and certifications. Conduct and oversee regular security assessments, vulnerability scans, and penetration testing. Collaborate with the compliance team to prepare for and respond to audits.

Incident Management: Lead incident management efforts, ensuring rapid resolution and thorough root cause analysis. Develop and implement strategies for improving incident response and minimizing downtime.

Collaboration and Communication: Work closely with development, operations, and security teams to integrate reliability and security into the software development lifecycle. Communicate effectively with stakeholders, providing regular updates on system performance, reliability, and compliance status.

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).

5+ years of experience in site reliability engineering, DevOps, or a related role, with at least 2 years in a leadership position.

Nexthink

Software Development

Prilly, Canton de Vaud

1001-5000 Employees

7 Jobs

Key People

  • Pedro Bados, CEO & Co-Founder
  • Martin Hargreaves, Chief Product Officer
