Site Reliability Engineer

4 - 8 years

5 - 6 Lacs

Posted: 16 hours ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Role and Responsibilities:

  • Participate in an on-call rotation for incident response and implement proactive measures to prevent incidents.
  • Develop monitoring alerts and incident response processes to ensure high availability and reliability.
  • Document actions taken during incidents and create automated solutions to improve incident response.
  • Collaborate with the engineering team as an expert in reliability, performance, and efficiency to support ongoing projects.
  • Consistently deliver high-quality managed services, ensuring optimal uptime and scalability of infrastructure, applications, and cloud services.
  • Automate the detection and resolution of recurring issues to enhance system stability.
  • Build tools and automation frameworks to eliminate repetitive tasks and prevent incident occurrence.
  • Continuously improve engineering and operational processes and team practices to enhance efficiency and productivity.
  • Demonstrate strong programming skills and a deep understanding of systems to support the reliability and scalability of services.
  • Foster a culture of continuous improvement by promoting process changes and best practices.
  • Engage in continuous learning to expand skills through experimentation or training.

Soft Skills:

  • Ability to work asynchronously and independently.
  • Strong collaboration skills and willingness to work as part of a team.
  • Excellent problem-solving skills with the ability to think clearly under pressure.
  • Strong analytical and management skills.
  • Effective communication and documentation skills.

Qualifications:

  • Bachelor's or graduate degree in Computer Engineering, Computer Science, Engineering, Information Systems Management, or equivalent experience.
  • Experience with Monitoring/Observability/Log tools such as AWS CloudWatch, Datadog, Prometheus/Grafana, and ELK.
  • Proficiency with public cloud platforms, Linux/UNIX environments, and programming languages such as Java, Python, or Go.
  • Familiarity with Agile methodologies, SaaS environments, RDBMS, NoSQL databases, Cloud Architecture, and Frontend/Backend Systems and tools.
  • Comfortable with scripting and debugging production systems and services.
  • Strong collaboration skills with a mindset for continuous improvement.
  • Expertise in scalability and root cause analysis exercises.

Netradyne

Software Development

San Diego California

Bengaluru, Karnataka, India