Tekonika - Service Reliability Engineer - Production Systems

9 - 12 years

0 Lacs

Posted: 3 weeks ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title: Service Reliability Engineer
Location: Bangalore (Hybrid)
Experience: 9-12 Years
Mode of Working: Hybrid (Office-Based)

About The Role

We are looking for a highly skilled and experienced Lead Service Reliability Engineer (SRE) to join our growing team. In this role, you will be responsible for ensuring the reliability, performance, and scalability of our production systems. You'll play a key part in incident response, infrastructure automation, and driving operational excellence across the organization.

Key Responsibilities

  • Handle and lead the response to production incidents with calm and clarity.
  • Communicate effectively with internal teams and clients during outages.
  • Draft detailed Root Cause Analysis (RCA) documents post-incident.
  • Monitor and improve the performance, stability, and health of production systems.
  • Proactively identify and resolve system issues by analyzing metrics and logs.
  • Scale infrastructure to meet business objectives while adhering to SLA/SLO targets.
  • Perform upgrades and maintenance on EKS clusters.
  • Administer Kubernetes clusters and ensure optimal configuration and performance.
  • Automate infrastructure using Terraform and Terragrunt (IaC).
  • Integrate observability and security checks into CI/CD pipelines.

Required Skills & Qualifications

  • Proven experience in managing production environments and incident handling.
  • Hands-on experience with incident management tools (e.g., PagerDuty, ServiceNow).
  • Strong expertise in observability tools (e.g., Datadog).
  • Proficient in scripting/programming using Python or similar languages.
  • Solid understanding of Kubernetes and experience administering it.
  • Expertise in Infrastructure as Code (IaC) using Terraform and Terragrunt.
  • In-depth experience with AWS, including:
      • IAM (with cross-account role experience preferred)
      • EC2, VPC, S3
      • Networking (VPC, Transit Gateway, NACLs, Security Groups)
  • Experience with EKS for cluster management and upgrades.
  • Familiarity with CI/CD pipelines and DevOps best practices.

Preferred/Bonus Skills

  • Exposure to infrastructure security and best practices: IAM least privilege, encryption, secrets management, etc.
  • Experience working in Agile/Scrum environments.

What We Offer

  • Opportunity to work on high-impact, production-critical systems.
  • Collaborative and inclusive work culture.
  • Competitive compensation and benefits.
  • Learning and growth opportunities in cloud-native technologies and DevOps practices.

Join us and lead the charge in building scalable, reliable, and secure systems that power our mission.
(ref:hirist.tech)
