Site Reliability Engineer III

3 - 7 years

8 - 12 Lacs

Hyderabad

Posted: 1 week ago | Platform: Naukri


Skills Required

continuous integration, ci/cd, containerization, troubleshooting, it infrastructure, kubernetes, c, reliability, sre, microsoft azure, iac, disaster recovery, site reliability engineering, docker, ansible, incident management, jenkins, terraform, gitlab, aws

Work Mode

Work from Office

Job Type

Full Time

Job Description

ABOUT AMGEN

Amgen harnesses the best of biology and technology to fight the world’s toughest diseases, and make people’s lives easier, fuller and longer. We discover, develop, manufacture and deliver innovative medicines to help millions of patients. Amgen helped establish the biotechnology industry more than 40 years ago and remains on the cutting edge of innovation, using technology and human genetic data to push beyond what’s known today.

ABOUT THE ROLE

Let’s do this. Let’s change the world.

We are looking for a Site Reliability Engineering Manager (SRE 3) to work on the performance optimization, standardization, and automation of Amgen’s critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence through automation, incident response, and proactive performance tuning, while also reducing infrastructure costs. You will work closely with cross-functional teams to establish best practices for service availability, efficiency, and cost control.

Roles & Responsibilities

Talent Management & Team Leadership

  • Lead, mentor, motivate and manage a high-performing engineering team to deliver exceptional results, fostering a culture of innovation and best practices.

System Reliability, Performance Optimization & Cost Reduction

  • Ensure the reliability, scalability, and performance of Amgen’s infrastructure, platforms, and applications.
  • Proactively identify and resolve performance bottlenecks and implement long-term fixes.
  • Continuously evaluate system design and usage to identify opportunities for cost optimization, ensuring infrastructure efficiency without compromising reliability.

Automation & Infrastructure as Code (IaC)

  • Drive the adoption of automation and Infrastructure as Code (IaC) across the organization to streamline operations, minimize manual interventions, and enhance scalability.
  • Implement tools and frameworks (such as Terraform, Ansible, or Kubernetes) that increase efficiency and reduce infrastructure costs through optimized resource utilization.

Standardization of Processes & Tools

  • Establish standardized operational processes, tools, and frameworks across Amgen’s technology stack to ensure consistency, maintainability, and best-in-class reliability practices.
  • Champion the use of industry standards to optimize performance and increase operational efficiency.

Monitoring, Incident Management & Continuous Improvement

  • Implement and maintain comprehensive monitoring, alerting, and logging systems to detect issues early and ensure rapid incident response.
  • Lead the incident management process to minimize downtime, conduct root cause analysis, and implement preventive measures to avoid future occurrences.
  • Foster a culture of continuous improvement by leveraging data from incidents and performance monitoring.

Collaboration & Cross-Functional Leadership

  • Partner with software engineering and IT teams to integrate reliability, performance optimization, and cost-saving strategies throughout the development lifecycle.
  • Act as an SME for SRE principles and advocate best practices on assigned projects.

Capacity Planning & Disaster Recovery

  • Execute capacity planning processes to support future growth, performance, and cost management.
  • Maintain disaster recovery strategies to ensure system reliability and minimize downtime in the event of failures.

Must-Have Skills:

  • Experience with AWS / Azure cloud services.
  • Proficiency in CI/CD (Jenkins/GitLab), observability, IaC, GitOps, containerization (Docker) and orchestration tools (Kubernetes).
  • Ability to learn new technologies quickly.
  • Strong problem-solving and analytical skills.
  • Excellent communication and teamwork skills.
  • Hands-on experience with the technologies listed above; the role is primarily hands-on coding.
  • Ability to lead and mentor other team members.

Good-to-Have Skills:

  • Knowledge of cloud-native technologies and strategies for cost optimization in multi-cloud environments.
  • Familiarity with distributed systems, databases, and large-scale system architectures.
  • Databricks knowledge or exposure (will need to upskill if hired).

Soft Skills:

  • Excellent analytical and troubleshooting skills.
  • Strong verbal and written communication skills.
  • Ability to work with global, virtual teams and manage multiple priorities successfully.

Basic Qualifications

  • Bachelor’s / Master’s degree in Computer Science, Engineering, or a related field.
  • 9 - 13 years of experience in IT infrastructure, with at least 5 years in Site Reliability Engineering or related fields.

EQUAL OPPORTUNITY STATEMENT

Amgen is an Equal Opportunity employer and will consider you without regard to your race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability status. We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.

Amgen Inc

Biotechnology

Thousand Oaks

22,000 Employees

868 Jobs

Key People

  • Robert A. Bradway, Chairman & CEO
  • Murray Aitken, Senior Vice President
