Senior / Principal / Lead - Site Reliability Engineer (Product Engg)

Highradius

Experience: 6 - 11 years

Salary: INR 20 - 35 Lacs

Location: Hyderabad

Posted: 3 months ago

Skills Required

Linux Administration, Change Management, VMware, Cloud Operations, Cloud Infra, AWS Cloud Services, Cloud Infrastructure, Release Management, Jenkins, Terraform, Kubernetes Administration, Ansible, Infrastructure Management, OS Patching, Incident Management, ArgoCD, Python

Work Mode

Work from Office

Job Type

Full Time

Job Description

Job Summary:

reliability, security, and scalability

Responsibilities

Cloud Infrastructure Architecture and Management:
Design, build, and maintain resilient cloud infrastructure solutions to support the development and deployment of scalable and reliable applications. This includes managing and optimizing cloud platforms for high availability, performance, and cost efficiency.
Enhancing Service Reliability:
Lead reliability best practices by establishing and managing monitoring and alerting systems that proactively detect and respond to anomalies and performance issues. Use SLI, SLO, and SLA concepts to measure and improve reliability. Identify and resolve potential bottlenecks and areas for enhancement.
Driving Automation and Efficiency:
Contribute to the automation, provisioning, and standardization of infrastructure resources and system configurations. Identify and implement automation for repetitive tasks to significantly reduce operational overhead. Develop Standard Operating Procedures (SOPs) and automate workflows using tools like Rundeck or Jenkins.
Incident Response and Resolution:
Participate in and help resolve major incidents, conduct thorough root cause analyses, and implement permanent solutions. Effectively manage incidents within the production environment using a systematic problem-solving approach.
Collaboration and Innovation:
Work closely with diverse stakeholders and cross-functional teams, including software engineers, to integrate cloud solutions, gather requirements, and execute Proof of Concepts (POCs). Foster strong collaboration and communication. Guide designs and processes with a focus on resilience and minimizing manual effort. Promote the adoption of common tooling and components, and implement software and tools to enhance resilience and automate operations. Be open to adopting new tools and approaches as needed.

Requirements

Experience:

Role:

Education:

Technology Stack:

Infrastructure Management:
Proven proficiency in on-premises hosting and virtualization platforms (VMware, Hyper-V, or KVM). Solid understanding of storage internals (NAS, SAN, EFS, NFS) and protocols (FTP, SFTP, SMTP, NTP, DNS, DHCP). Experience with networking and firewall technologies. Strong hands-on experience with Linux internals and operating systems (RHEL, CentOS, Rocky Linux). Experience with Windows operating systems to support varied environments.
Service Reliability Concepts:
Good understanding of SLI, SLO, and SLA concepts and of error budgets.
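
For candidates brushing up on these concepts, here is a minimal sketch of how an SLI, an SLO, and an error budget relate for a request-based service. It is illustrative only and not taken from the posting: the function name `error_budget`, its parameters, and the sample numbers are all hypothetical, and it assumes the SLI is simply the fraction of successful requests in a measurement window.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize SLI, SLO compliance, and remaining error budget for one window.

    Hypothetical helper: assumes a request-based availability SLI.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # SLI: the measured fraction of good (successful) requests.
    sli = (total_requests - failed_requests) / total_requests

    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests

    # Fraction of the budget still unspent; negative means the budget is overspent.
    budget_remaining = 1.0 - failed_requests / allowed_failures

    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_failures": allowed_failures,
        "budget_remaining": budget_remaining,
    }


if __name__ == "__main__":
    # Sample numbers are made up: a 99.9% SLO over one million requests.
    report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    for key, value in report.items():
        print(f"{key}: {value}")
```

With the sample numbers, the SLO tolerates roughly 1,000 failed requests, so 250 failures spend about a quarter of the budget. An SLA, by contrast, is the contractual commitment made to customers and is usually set looser than the internal SLO.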

Other Mandatory Requirements:
