Senior Site Reliability Engineer

CyberArk

5 - 8 years

5 - 8 Lacs

Hyderabad, Telangana, India

Posted: 3 days ago | Platform: Foundit

Work Mode: On-site

Job Type: Full Time

Job Description

Key Responsibilities:

Lead incident management, monitoring, and alerting processes to ensure timely detection and resolution of production issues.
Ensure reliability, availability, and performance of systems by defining and maintaining SLIs, SLOs, and SLAs.
Design and implement fault-tolerant, scalable architectures to minimize downtime and improve resiliency.
Develop automation and tooling for monitoring, incident remediation, and infrastructure management.
Participate in a 24x7 on-call rotation to manage production incidents and maintain system uptime.
Create and maintain SOPs and technical documentation for processes, tools, and incident management protocols.
Implement and manage Infrastructure as Code (IaC) using tools such as Terraform and Ansible to automate provisioning and deployments.
Work with cloud platforms, primarily AWS (EC2, S3, VPC, RDS, EKS, ECS, CloudWatch, CloudFormation), to support scalable system operations.
Integrate and manage CI/CD pipelines using tools like Jenkins to enable seamless deployments.
Utilize monitoring and alerting tools (Datadog, Site24x7, Grafana, CloudWatch) to proactively identify issues.
Conduct performance tuning and optimization, addressing bottlenecks and improving efficiency.
Drive cost optimization strategies while maintaining performance and reliability standards.
Adhere to security best practices and ensure infrastructure compliance with organizational standards.
Collaborate with development, product, and security teams to enhance system reliability and service delivery.
Mentor junior engineers and promote a culture of reliability engineering across the organization.

Qualifications:

5-8 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles.
Strong hands-on expertise with AWS (experience with GCP or Azure is a plus).
Proficiency in Infrastructure as Code (IaC) tools such as Terraform and Ansible.
Experience with monitoring and alerting tools including Datadog, Site24x7, Grafana, and CloudWatch.
Solid understanding of CI/CD tools such as Jenkins.
Proven ability in incident management, root cause analysis, and implementing long-term reliability improvements.
Familiarity with automation scripting (Python, Bash, or Shell scripting preferred).
Knowledge of security best practices, networking, and cloud cost management.
Excellent problem-solving, analytical, and collaboration skills.
AWS certification or equivalent cloud certification is an advantage.
