Site Reliability Engineer III

3 - 8 years

5 - 10 Lacs

Posted: 6 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Reliability Support Specialists at 6sense are key members of our Reliability team. They work with Engineering teams to diagnose and fix issues so that our services and infrastructure stay fast, stable, and scalable. The Reliability team focuses on the automation, integration, operation, and overall improvement of our monitoring, logging, and alerting services so that we can deliver product quickly, safely, and reliably.

Responsibilities:

  • Own the monitoring, logging, and alerting tools used by the overall Software Engineering team to ensure we meet our reliability requirements.
  • Learn and adopt technologies that may aid in solving our challenges.
  • Support the overall Software Engineering team in monitoring and alerting on any issues they may encounter.
  • Help respond to service issues and determine how to automatically alert the responsible parties with enough context to make the service owner a self-sufficient first responder.
  • Act as first responder to issues with shared infrastructure and escalate to other team members as necessary.
  • Work with other teams to put automatic resolutions in place and reduce the need for human response.
  • Participate in on-call rotations to monitor platform and infrastructure issues.

Minimum Required Qualifications:

  • 3+ years in a reliability or technical support-related role
  • Proficient with ANSI SQL (reading and writing queries)
  • Strong analytical problem-solving skills and the ability to self-manage
  • Experience with monitoring REST APIs and web services
  • Experience with high-availability systems
  • Experience with leveraging and configuring observability systems such as Datadog, Grafana, Grafana Loki, Prometheus, and Sumo Logic
  • Experience with monitoring relational databases such as MySQL, Aurora/RDS MySQL, and PostgreSQL

Bonus Requirements:

  • 3+ years of experience with Linux/Unix system administration
  • Experience with monitoring Hadoop ecosystems (e.g., Hadoop, Hive, Presto)
  • Experience monitoring and analyzing services/applications in a service-oriented architecture at the network/server level as well as in containerized environments (such as Kubernetes and Docker)
 

Slintel

Information Technology and Services

New York
