Principal Site Reliability Engineer

7 - 12 years

16 - 20 Lacs

Posted: 12 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

About Arctera
Arctera keeps the world's IT systems working. We can trust that our credit cards will work at the store, that power will be routed to our homes and that factories will produce our medications because those companies themselves trust Arctera.
Arctera is behind the scenes making sure that many of the biggest organizations in the world - and many of the smallest too - can face down ransomware attacks, natural disasters, and compliance challenges without missing a beat. We do this through the power of data and our flagship products, Insight, InfoScale and Backup Exec.
Illuminating data also helps our customers maintain personal privacy, reduce the environmental impact of data storage, and defend against illegal or immoral use of information.
It's a task that continues to get more complex as data volumes surge. Every day, the world produces more data than it ever has before. And global digital transformation - and the arrival of the age of AI - has set the course for a new explosion in data creation.
Joining the Arctera team, you'll be part of a group innovating to harness the opportunity of the latest technologies to protect the world's critical infrastructure and to keep all our data safe.
Overview
As a Principal Site Reliability Engineer, you will set the technical vision and drive architecture for platform reliability, cloud automation, and infrastructure security. You will design solutions and influence engineering culture to ensure the scalability, availability, and security of our services across Windows Server, Linux, Microsoft Azure, SQL Server, Terraform, Puppet, and Elasticsearch, operating in a large-scale production environment.
Key Responsibilities
  • Architect and own the reliability strategy across hybrid cloud environments, primarily Azure, with an emphasis on automation, scalability, and fault tolerance.
  • Lead the design and implementation of Infrastructure as Code (IaC) with Terraform for provisioning and lifecycle management.
  • Develop and oversee configuration management strategy using Puppet, ensuring consistency, security, and compliance at scale.
  • Guide high-level operational architecture for Windows Server and Linux platforms, identifying opportunities for simplification and modernization.
  • Drive cloud security and compliance by embedding controls into infrastructure, including access management, network hardening, vulnerability scanning, and encryption strategies.
  • Lead SQL Server infrastructure strategy, performance optimization, and database availability for mission-critical workloads.
  • Architect and scale Elasticsearch clusters for logs, telemetry, and analytics, ensuring high availability and performance under production workloads.
  • Champion observability and monitoring strategy, ensuring actionable insights from metrics, logs, and traces across all environments.
  • Mentor and elevate senior engineers, influence cross-team technical decisions, and act as a technical authority across the organization.
  • Represent SRE in technical architecture reviews, leadership forums, and cross-functional planning.
  • Lead incident response at the executive level when necessary, and drive company-wide learning through blameless postmortems and root cause analysis.
Preferred Qualifications
  • 7+ years of experience in SRE, infrastructure engineering, or DevOps roles, including time in high-scale production environments.
  • Proven experience with both Windows Server (2016/2019/2022) and Linux (Ubuntu, RHEL/CentOS) administration and automation.
  • Deep hands-on expertise with Microsoft Azure, including resource provisioning, networking, security, and cost management.
  • Expert in Terraform for enterprise IaC practices, including module development and state management.
  • Extensive experience with Puppet or Ansible for automated configuration and policy enforcement.
  • Strong SQL Server experience including HA/DR, performance optimization, and database security.
  • Real-world experience deploying and maintaining Elasticsearch clusters under production load.
  • Expert-level scripting in PowerShell, Bash, or Python for automation and tooling.
  • Deep knowledge of cloud security, identity management, and compliance frameworks (SOC 2, ISO 27001, HIPAA, etc.).
  • Exceptional communication, collaboration, and leadership skills.
  • Experience with containerization (Docker) and Kubernetes in production environments.
  • Familiarity with GitOps, CI/CD, and modern release engineering workflows.
