Associate Site Reliability Engineer

CyberArk

2 - 3 years

19 - 20 Lacs

Hyderabad

Posted: 3 months ago

Skills Required

Performance tuning, remediation, GitHub, orchestration, PowerShell, analytical skills, shell scripting, incident management, monitoring, Python

Work Mode

Work from Office

Job Type

Full Time

Job Description

About the Role:

We are seeking a highly skilled Associate Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. You will collaborate closely with development, operations, and other teams to implement and maintain efficient and resilient systems.

We are the SRE Frontline Team of CyberArk. Our group uses monitoring tools and dashboards to keep the health and performance of our systems and services optimal. Our goal is to maintain a scalable, fault-tolerant, high-load, distributed system. We are searching for an outstanding SRE expert to drive and improve the Incident Management processes and goals for the Site Reliability teams, with a focus on triaging and on ensuring the reliability, performance, and scalability of CyberArk's SaaS services and underlying AWS infrastructure. This role combines technical expertise, documentation, and collaboration to meet the organization's reliability and availability goals.

Responsibilities:

Incident Management, Monitoring and Alerting: Drive incident response processes and troubleshoot complex issues, ensuring timely resolution of outages. Establish monitoring, logging, and alerting best practices using tools like Datadog and Site24x7.
Tooling and Automation: Build essential tooling to improve the reliability of systems and automate remediation of issues.
Be part of the 24x7x365 on-call rotation.
SOP Documentation: Create and maintain documentation for infrastructure, processes, and incident management protocols.
Understanding of Infrastructure as Code (IaC) tools such as Terraform and Ansible to automate provisioning, configuration, and deployment processes.
Attend all training programs, complete all tasks set by the supervisor, and assist other trainees wherever possible.
Cloud Platform Expertise: Hands-on experience with AWS cloud services, including EC2, S3, VPC, RDS, EKS, ECS, CF and more.
CI/CD Pipelines: Fair understanding of CI/CD pipelines using tools like Jenkins.
Monitoring and Alerting: Hands-on experience with monitoring and alerting tools like ELK, Datadog, CloudWatch, and Grafana to proactively identify and resolve issues.
Performance Tuning: Continuously optimize system performance, identify bottlenecks, and implement strategies to improve scalability and efficiency.
Cost Optimization: Identify and implement strategies to reduce cloud costs while maintaining performance and reliability.
Security Best Practices: Adhere to security best practices and implement measures to protect infrastructure and data from vulnerabilities and threats.
Collaboration and Communication: Work effectively with cross-functional teams to understand business requirements and provide technical guidance.


Qualifications

Required Skills and Experience:

2-3 years of experience as a Site Reliability Engineer.
Strong proficiency in AWS cloud services like EC2, S3, VPC, RDS, EKS, ECS, CloudFormation and more. An AWS certification is a plus.
Good logical, analytical, and problem-solving skills.
Strong communication skills and the ability to work in shifts (24x7).
Strong scripting skills (Python, PowerShell, CDK, shell scripting).
Understanding of Infrastructure as Code tools (Terraform, Ansible) and AWX Tower for Ansible automation.
Knowledge of containerization (Docker) and orchestration platforms (Kubernetes).
Expertise in CI/CD pipelines and automation tools (Jenkins, GitHub).
Exposure to monitoring and alerting tools (CloudWatch, Datadog, ELK, Grafana, Site24x7).
Experience documenting SOPs and RCAs.
Understanding of security best practices and compliance standards. A security certification is a plus.
