Our Client is defining the future of cybersecurity through our XDR platform that automatically prevents, detects, and responds to threats in real time. Singularity XDR ingests data and leverages our patented AI models to deliver autonomous protection. With the Client, organizations gain full transparency into everything happening across the network at machine speed to defeat every attack at every stage of the threat lifecycle.

We are a values-driven team where names are known, results are rewarded, and friendships are formed. Trust, accountability, relentlessness, ingenuity, and our client-centric approach define the pillars of our collaborative and unified global culture. We're looking for people who will drive team success and collaboration across SentinelOne. If you're enthusiastic about innovative approaches to problem-solving, we would love to speak with you about joining our team!

What Are We Looking For?

We are seeking a Site Reliability Engineer (SRE) with extensive operational experience managing large-scale SaaS infrastructures. You will be responsible for designing and maintaining data infrastructure that emphasizes automation, self-service, and scalability. This role is vital to ensuring that we meet and exceed our Service Level Objectives (SLOs) and uptime commitments to customers.

You will partner closely with engineering teams to help them deliver software faster, safer, and with higher quality, while driving initiatives that enhance the reliability, stability, and cost efficiency of our production environments. You'll join a world-class team of like-minded SREs who manage complex, high-traffic systems that operate at global scale.

What Will You Do?

As a Site Reliability Engineer, you will play a critical role in ensuring the availability, scalability, and performance of SentinelOne's large-scale distributed systems. Working at the intersection of software development and operations, you'll focus on making our infrastructure more reliable, automated, and efficient, while empowering development teams to deliver at speed and with confidence.

In This Role, You Will

Drive Continuous Deployment & Delivery Excellence

Design, implement, and optimize CI/CD pipelines for efficient, secure, and reliable software releases.
Automate build, test, and deployment processes to enhance release velocity and reduce manual intervention.

Manage And Command Production Incidents

Lead the response to production incidents, ensuring timely mitigation and root cause identification.
Conduct post-incident reviews, define corrective actions, and drive continuous improvements to prevent recurrence.

Partner With Product Engineering Teams

Collaborate with product, platform, and infrastructure teams to embed reliability and scalability into design and architecture.
Provide technical guidance to improve system performance, fault tolerance, and observability.

Automate Operations And Streamline Processes

Build automation tools and frameworks that eliminate repetitive tasks, standardize operational procedures, and support a self-service infrastructure model for development teams.

Monitor, Measure, And Optimize Reliability

Establish metrics for system performance and reliability (availability, latency, throughput).
Proactively identify and resolve potential issues using data-driven insights and continuous monitoring.

Eliminate Infrastructure Bottlenecks

Analyze production systems to identify performance and scalability limitations.
Implement architectural improvements to enhance throughput, reliability, and cost efficiency across AWS and GCP environments.

Enhance Observability & Incident Readiness

Develop and maintain observability stacks with advanced monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, Datadog).
Conduct chaos engineering experiments to validate system resilience and ensure operational preparedness.

Ensure Security, Compliance & Resilience

Work with security and compliance teams to enforce secure configurations, data integrity, and regulatory adherence.
Participate in disaster recovery planning and capacity forecasting for high availability.

Mentor And Collaborate Across Teams

Share best practices through documentation, technical discussions, and internal workshops.
Foster a reliability-driven culture and promote continuous improvement across engineering functions.

What Skills and Experience Will You Need?

5+ years of experience managing large-scale SaaS operations or distributed systems
Strong expertise in orchestration systems like Kubernetes, Nomad, or Mesos
Proficiency in Python (preferred), Golang, or Java for automation and tooling
Hands-on experience running and deploying Java and JavaScript applications
Proven experience in AWS and GCP environments
Practical knowledge of Infrastructure as Code (Terraform, CloudFormation, etc.)
Experience with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD, and deployment strategies like blue-green, rolling, or canary deploys
Familiarity with SRE principles: SLOs, SLIs, and error budgets
Strong problem-solving, communication, and collaboration skills within distributed teams
Self-starter attitude with a passion for automation, reliability, and continuous learning
Prior product development or software engineering experience is a strong plus

What We Offer

Flexible working format: remote, office-based, or hybrid
Competitive salary and comprehensive compensation package
Personalized career growth opportunities and mentorship programs
Professional development tools: tech talks, training sessions, and centers of excellence
Active technical communities with regular knowledge-sharing
Education reimbursement for continued learning and certifications
Memorable milestone celebrations and company-sponsored events
Corporate gatherings and team-building initiatives

(ref:hirist.tech)
