Cloud Site Reliability Engineer

3 years

0 Lacs

Posted: 15 hours ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

How will you make an impact?


  • Act as part of a team of SREs who serve as the ‘gatekeepers’ of production, actively manage the work backlog, and develop reliability improvements.
  • Lead investigations into the root causes of outages, performance problems, and cost issues.
  • Lead initiatives to automate low-value tasks, balanced against project delivery demands.
  • Provide technical leadership to the wider Cloud Operations and Support teams, along with oversight of the products and services they support.
  • Develop and configure monitoring dashboards and alerts in tools like Grafana and Azure Monitor.
  • Install and configure the observability platform, including tools like Grafana, Prometheus, Azure Monitor, and OpenTelemetry.
  • Develop Bicep modules for monitoring infrastructure and deploy them.
  • Develop and configure CI/CD pipelines in Azure DevOps for deploying monitoring infrastructure and monitoring objects.


Have you got what it takes?


  • Must have 3+ years of experience in Site Reliability Engineering
  • Excellent technical, analytical and troubleshooting skills
  • Experience and in-depth knowledge of databases and data handling (MS-SQL, Elasticsearch, YAML, JSON, XML)
  • Significant experience in programming or advanced scripting (C#, PowerShell etc.)
  • Experience with infrastructure/configuration as code and version control (ARM, Bicep, Git)
  • Experience managing monitoring, alerting and dashboarding platforms (Azure Monitor, Prometheus, Grafana, Elasticsearch)
  • Demonstrable experience of supporting live cloud services and platforms
  • Production experience with Kubernetes and containerization
  • Implementation and support of service level objectives (SLOs)
  • Exposure to commercial cloud providers (ideally Azure, others considered)
  • Exposure to Azure DevOps pipelines is desirable (CI/CD)
  • Exposure to test frameworks is desirable (NUnit, Jasmine, Selenium)
  • Efficient, effective, and respectful communication skills, both with customers and within internal departments, including:
  • Good listener, able to identify and validate assumptions.
  • Able to use effective questioning to confirm understanding of a customer problem and then provide help to solve it.
  • Methodical troubleshooting, technical skill and attention to detail used in diagnosing problems and reproducing issues in a local environment.
  • Multitasking and time management to prioritize and switch between varied tasks.
