SRE Analytics Analyst

4 - 9 years

12 - 22 Lacs

Posted: 1 day ago | Platform: Naukri


Work Mode

Work from Office

Job Type

Full Time

Job Description

Role & Responsibilities

Performance & Capacity Planning

  • Analyze system performance, resource utilization, and capacity trends to prevent bottlenecks.
  • Provide recommendations for scaling infrastructure based on analytics.

Automation & Process Improvements

  • Contribute to alert tuning, reducing false positives and improving signal-to-noise ratio.
  • Assist in developing scripts and automation (Python, Shell, PowerShell, or equivalent) for repetitive operational tasks.

Monitoring & Observability

  • Manage, configure, and optimize monitoring dashboards, alerts, and metrics (LogicMonitor, Datadog, Prometheus, Grafana, Splunk, etc.).
  • Ensure proactive detection of system anomalies through trend analysis and predictive monitoring.
  • Maintain SLOs, SLIs, and SLAs to ensure system reliability and performance.

Incident Management & Analytics

  • Act as L2 support for production incidents, ensuring timely resolution or escalation to L3/SRE engineers.
  • Perform root cause analysis (RCA) using logs, metrics, and traces, and provide detailed incident reports.
  • Identify recurring issues through data analytics and recommend automation or process improvements.

Collaboration & Reporting

  • Partner with SRE, DevOps, Application, and Infrastructure teams to resolve systemic issues.
  • Generate regular reliability and performance reports for stakeholders.
  • Share insights on service reliability, incidents, and optimizations with leadership and engineering teams.

Required Skills

  • Strong experience with observability tools (LogicMonitor, Datadog, Grafana, Prometheus, ELK Stack, Splunk, etc.).
  • Experience with time-series analysis (Prometheus, InfluxDB).
  • Proficiency in scripting for monitoring/incident automation (Python, Shell, PowerShell, or similar).
  • Experience in load testing tools for trend analysis.
  • Knowledge of trend and anomaly detection for proactive alerting.
  • Strong skills in data analysis with visualization tools (Grafana, Kibana, Power BI, Tableau).
  • Proficiency in log analysis, metrics, and tracing frameworks.
  • Working knowledge of cloud platforms (AWS, Azure, or GCP).
  • Familiarity with container technologies (Docker, Kubernetes) and microservices monitoring.

Movate Technologies

Information Technology and Services

Fayetteville
