1 Multiregion Failover Jobs

10.0 - 14.0 years

0 Lacs

Karnataka

On-site

As a Senior/Lead Site Reliability Engineer (SRE) at our company, you will architect and scale our cloud-based communication platforms and mentor the engineers who keep them reliable. Your primary responsibility will be to design highly available systems, manage VoIP/SMS infrastructure at scale, and ensure mobile communication reliability across platforms.

Your key responsibilities will include architecting, deploying, and managing large-scale Linux-based communication platforms hosted on AWS. You will lead the operations of Asterisk PBX systems and Kamailio SIP servers, drive automation frameworks (using Python, Bash, and Go) for deployments, scaling, and failover, and mentor junior and mid-level SREs on Linux operations, VoIP troubleshooting, and scripting best practices. You will also define and enforce Service Level Objectives (SLOs), Service Level Agreements (SLAs), and Error Budgets across VoIP, messaging, and mobile services. You will lead incident management war rooms, conduct Root Cause Analyses (RCAs), and drive disaster recovery and compliance initiatives. Collaboration with Dev, Security, and Product teams to evolve the reliability strategy will be a key part of your role.

To excel in this position, you should have 10+ years of experience in SRE, DevOps, Development, Production Engineering, or Linux Systems roles. Expertise in Linux performance tuning, server security, and automation is essential. Deep experience with Asterisk, Kamailio, and SIP call flow troubleshooting, a strong understanding of mobile network behavior (latency, jitter, SMS routing), and mastery of Kubernetes, Docker, Terraform, Helm, and related cloud-native tools are required. Advanced scripting skills in Python, Go, or Bash, as well as experience in disaster recovery, multi-region failover, and compliance, are also necessary for this role.

Posted 6 days ago
