Site Reliability Engineer II - HashiCorp Platform DR

3 - 8 years

13 - 17 Lacs

Posted: 2 hours ago | Platform: Naukri


Work Mode

Work from Office

Job Type

Full Time

Job Description

Your role and responsibilities
At HashiCorp, we build the Infrastructure Cloud to help enterprises take a unified approach to reliability, disaster recovery, and operational resilience across cloud and enterprise environments. Our team ensures HashiCorp's products meet the highest standards of availability, performance, and fault tolerance, enabling organizations to operate at scale with confidence.

As an Engineer on the HashiCorp Disaster Recovery team, you will help build and own solutions that strengthen disaster recovery (DR) governance and reliability across our cloud products. Your work will focus on system resilience, operational readiness, and high availability, making a real impact on the reliability of our platform.

We deliver the Infrastructure Cloud through our enterprise-grade SaaS platform, HCP, as well as through self-managed, on-premises solutions. Across our platform engineering teams, we're looking for great engineers to help us build the future of reliable, scalable infrastructure!

What you'll do (responsibilities)

  • Design, implement, and optimize disaster recovery (DR) solutions to enhance system resilience, ensuring high availability and fault tolerance across cloud products.
  • Develop and execute comprehensive DR testing strategies, identifying bottlenecks and failure points that impact Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).
  • Drive compliance and reliability initiatives, integrating DR best practices into system architecture and leveraging Chaos Engineering to validate failure scenarios.
  • Build scalable automation frameworks for testing, incident simulation, and recovery orchestration, reducing manual effort and improving operational efficiency.
  • Collaborate cross-functionally with engineering, product, and infrastructure teams to embed operational readiness into development lifecycles.
  • Lead incident/DR response drills and chaos experiments, analyzing test results, documenting findings, and implementing proactive improvements.
  • Monitor system performance and availability, developing dashboards and observability tools to provide actionable insights for reliability improvements.
  • Mentor engineers and foster a culture of resilience, promoting best practices in system design, testing, and disaster recovery preparedness.

Required education
Bachelor's Degree

Preferred education
Master's Degree

Required technical and professional expertise

  • 3+ years of experience in software development, reliability engineering, systems engineering, or non-functional testing, with a focus on disaster recovery, backup, and cloud resilience.
  • Proficiency in Golang and hands-on experience with version control tools such as Git and platforms such as GitLab, with an emphasis on maintainable and scalable code.
  • Strong understanding of microservices architecture and best practices for designing resilient, distributed systems in cloud environments.
  • Experience with CI/CD pipelines, ensuring automation, quality, and reliability in software delivery.
  • Exposure to cloud platforms (AWS, Azure, or GCP) and container orchestration technologies like Nomad or Kubernetes.
  • Strong collaboration and communication skills, with the ability to work cross-functionally and articulate technical concepts to diverse teams.
  • Commitment to continuous learning in reliability engineering, with an interest in enhancing disaster recovery strategies and system resilience.
  • Customer-centric and systems-thinking mindset, focused on delivering high-quality, scalable, and fault-tolerant solutions.

Preferred technical and professional experience

  • You have experience using HashiCorp products (Terraform, Packer, Waypoint, Nomad, Vault, Boundary, Consul).
  • Exposure to the disaster recovery domain, or experience testing any product for DR, is a plus.

IBM

Information Technology

Armonk