Senior Site Reliability Engineer - Tooling & Platform (Go & Terraform)

4 - 9 years

17 - 20 Lacs

Posted: 3 days ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Position Overview:
We are seeking an experienced and technically influential Senior Site Reliability Engineer to join our Cloud Tooling and Pipelines team. This pivotal team drives the strategy, development, and operation of our core Continuous Delivery (CD) platform (leveraging Spinnaker and custom tooling), Infrastructure as Code (IaC) executions (primarily Terraform), and a suite of supporting microservices. These systems are critical for managing our extensive resource footprint across AWS ECS and EKS.
As a Senior Site Reliability Engineer, you will be a key technical leader, setting the architectural vision and driving the implementation of scalable and reliable solutions for automating infrastructure and application deployments. Your deep expertise in both software engineering and cloud operations will be essential in building and maintaining our critical tooling, enhancing our capabilities in infrastructure provisioning, vulnerability management, and IaC deployments. You will also play a vital role in mentoring other engineers and influencing the team's technical roadmap. If you have a strong passion for both building software and managing infrastructure at scale and are driven to solve complex operational challenges through automation, we encourage you to apply.
Key Responsibilities:
  • Maintain Platform Strategy: Be a core technical leader in shaping the strategic direction and future evolution of Okta's CD platform (including Spinnaker, Terraform and custom tools) and related infrastructure automation.
  • Architect End-to-End Automation: Lead the design and architecture of robust CD pipelines, Terraform-based IaC workflows, and application deployment processes, ensuring scalability, reliability, and security.
  • Design, Build, and Maintain Critical Tooling: Architect, build, maintain, and deploy sophisticated tools and microservices that empower Okta's engineering teams to provision infrastructure, execute production changes, and deploy code with high reliability and efficiency.
  • Develop High-Quality Automation Software: Design and build scalable and reliable microservices (potentially in Java, Python, or Go) with a strong focus on automation, operational excellence, and self-service capabilities.
  • Drive Cross-Functional Collaboration: Partner closely with Software Engineering, SREs, and Product teams to proactively identify operational bottlenecks and manual processes, leading the design and implementation of scalable and reliable automation solutions.
  • Champion DevOps Best Practices: Research and advocate for the adoption of industry best practices in infrastructure automation, continuous delivery, and orchestration to drive innovation and continuous improvement.
  • Integrate Security: Apply and promote security best practices throughout the development lifecycle of our tooling and infrastructure automation to ensure a secure and compliant operational environment.
  • Deliver Self-Service Capabilities: Proactively identify opportunities to create self-service automation for infrastructure provisioning, application deployments, and other operational tasks, reducing manual effort and improving developer velocity and onboarding.
  • Provide Technical Guidance and Mentorship: Serve as a technical mentor and role model for other engineers on the team, fostering a culture of collaboration, innovation, and technical excellence.

Required Qualifications:
  • 4+ years of combined experience in Software Engineering and Site Reliability Engineering roles.
  • 3+ years of software development experience in Go, or similar backend languages, with a focus on building scalable and reliable applications.
  • 4+ years of hands-on experience automating and managing large-scale production infrastructure and services in AWS, GCP, or similar cloud environments.
  • Deep understanding and practical experience with containerization and orchestration technologies such as Kubernetes and ECS.
  • Strong working knowledge of Continuous Integration/Continuous Delivery (CI/CD) platforms, with experience in Spinnaker and a strong interest in exploring other industry-standard tools.
  • Solid understanding of Infrastructure-as-Code (IaC) principles and experience with tools such as Terraform.
  • Proficient in using Docker and supporting infrastructure, with strong Linux and networking fundamentals.
  • Experience with database technologies (MySQL, MongoDB, etc.) in the context of application development and operational management.
  • A strong passion for automation and solving complex operational challenges through software solutions.
  • Excellent communication, collaboration, and leadership skills.
  • Bachelor's degree in Computer Science or a related field, or equivalent professional experience.