Principal Site Reliability Engineer

Amgen Inc

8 - 10 years

13 - 17 Lacs

Hyderabad

Posted: 4 days ago

Skills Required

Continuous Integration, Cloud Services, SRE, CI/CD, Ops, Kubernetes, Python, IaC, Site Reliability Engineering, Docker, Ansible, Databricks, Containerization, AWS Cloud, Incident Management, Jenkins, Terraform, GitLab, AWS, IT Infrastructure

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will drive operational excellence through automation, incident response, and proactive performance tuning, while also reducing infrastructure costs. You will work closely with cross-functional teams to establish best practices for service availability, efficiency, and cost control.

Roles & Responsibilities:

System Reliability, Performance Optimization & Cost Reduction: Ensure the reliability, scalability, and performance of Amgen's infrastructure, platforms, and applications. Proactively identify and resolve performance bottlenecks and implement long-term fixes. Continuously evaluate system design and usage to identify opportunities for cost optimization, ensuring infrastructure efficiency without compromising reliability.
Automation & Infrastructure as Code (IaC): Drive the adoption of automation and Infrastructure as Code (IaC) across the organization to streamline operations, minimize manual interventions, and enhance scalability. Implement tools and frameworks (such as Terraform, Ansible, or Kubernetes) that increase efficiency and reduce infrastructure costs through optimized resource utilization.
Standardization of Processes & Tools: Establish standardized operational processes, tools, and frameworks across Amgen's technology stack to ensure consistency, maintainability, and best-in-class reliability practices. Champion the use of industry standards to optimize performance and increase operational efficiency.
Monitoring, Incident Management & Continuous Improvement: Implement and maintain comprehensive monitoring, alerting, and logging systems to detect issues early and ensure rapid incident response. Lead the incident management process to minimize downtime, conduct root cause analysis, and implement preventive measures to avoid future occurrences. Foster a culture of continuous improvement by leveraging data from incidents and performance monitoring.
Collaboration & Cross-Functional Leadership: Partner with software engineering and IT teams to integrate reliability, performance optimization, and cost-saving strategies throughout the development lifecycle. Act as an SME for SRE principles and advocate for best practices on assigned projects.
Capacity Planning & Disaster Recovery: Execute capacity planning processes to support future growth, performance, and cost management. Maintain disaster recovery strategies to ensure system reliability and minimize downtime in the event of failures.

Basic Qualifications:

Master's degree and 8 to 10 years of experience in IT infrastructure, Site Reliability Engineering, or a related field OR
Bachelor's degree and 10 to 14 years of experience in IT infrastructure, Site Reliability Engineering, or a related field OR
Diploma and 14 to 18 years of experience in IT infrastructure, Site Reliability Engineering, or a related field.

Must-Have Skills:

Extensive experience with AWS cloud services.
Proficient in CI/CD (Jenkins/GitLab), observability, IaC, GitOps, etc.
Experience with containerization (Docker) and orchestration tools (Kubernetes) to optimize resource usage and improve scalability.
Ability to identify and specify SRE tasks.
Strong hands-on experience with SRE tasks and with automating them using Python or another scripting language.
Well versed in FinOps, Infra-Ops, and platform operations.
Ability to learn new technologies quickly. Strong problem-solving and analytical skills. Excellent communication and teamwork skills.
Leadership skills are mandatory: the role leads a team of 4 to 5 and guides it through technical blockers.

Good-to-Have Skills:

Knowledge of cloud-native technologies and strategies for cost optimization in multi-cloud environments.
Familiarity with distributed systems, databases, and large-scale system architectures.
Bachelor's degree in computer science and engineering preferred; other engineering fields are considered.
Databricks knowledge or exposure is good to have (upskilling will be needed if hired).

Soft Skills:

Ability to foster a collaborative and innovative work environment.
Strong problem-solving abilities and attention to detail.
High degree of initiative and self-motivation.

We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.

Amgen Inc

Biotechnology

Thousand Oaks
