SRE L2 Support Role:
Focus on maintaining and improving the reliability, availability, and performance of AWS-based infrastructure and applications.
Incident Management:
Handle and resolve L2 incidents related to AWS services (EC2, RDS, S3, Lambda, EKS, etc.), perform root cause analysis, and communicate with customers during outages or SLA breaches.
Monitoring & Optimization:
Proactively monitor infrastructure and application health in AWS, set up and fine-tune AWS monitoring and observability tools (e.g., CloudWatch, CloudTrail), create alarms, dashboards, and reports.
Troubleshooting AWS Services:
Resolve issues related to EC2 instances, Autoscaling Groups, Load Balancers (ELB/ALB/NLB), Amazon ECS, EKS, and container workloads.
Log Management:
Manage and analyze logs using AWS CloudWatch Logs, CloudTrail, and third-party solutions such as the ELK Stack, Datadog, and Splunk.
Disaster Recovery & Backups:
Monitor AWS Backup jobs, ensure regular backups for critical infrastructure, validate DR plans, and participate in recovery testing exercises.
Automation & Scripting:
Contribute to automation of repetitive tasks using scripts and support incident recovery processes.
Documentation & Knowledge Sharing:
Create and maintain operational runbooks, SOPs, and knowledge base articles for common AWS issues.
Collaboration:
Work effectively across teams, shift ownership as required, and communicate with stakeholders during incidents.

You'd Describe Yourself As:

An experienced professional with 6 to 9 years of relevant experience in SRE, DevOps, or Cloud Infrastructure Support, with strong hands-on expertise in AWS services.
Proficient in monitoring tools such as Prometheus and Datadog, and familiar with cloud platforms (AWS, Azure, GCP).
Knowledgeable in Linux/Unix operating systems, with basic scripting skills (e.g., Python, GitLab actions).
Familiar with container orchestration (Kubernetes, Docker, Helm charts), CI/CD pipelines, and GitOps workflows (e.g., ArgoCD for automated deployments).
Strong analytical skills to resolve production incidents and a basic understanding of networking concepts (DNS, Load Balancers, Firewalls).
Experienced with alerting systems (e.g., PagerDuty) and incident tracking tools (e.g., JIRA, ServiceNow), and able to handle high-pressure environments.
A proactive problem-solver with a strong sense of urgency and excellent organizational skills to prioritize tasks effectively.
Able to work as a teammate, collaborating across teams and owning tasks as needed.

Preferred Certifications:

AWS Certified SysOps Administrator Associate
AWS Certified Solutions Architect Associate
AWS Certified DevOps Engineer Professional
