Site Reliability Engineer (SRE) - Guidewire Cloud Platform Tenancy

3 - 7 years

5 - 9 Lacs

Posted: 3 hours ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

What You'll Do

  • Collaborate with engineering teams to provide feedback and contribute code where needed, enhancing product functionality and resilience.
  • Participate in on-call rotations to ensure 24x7 availability of services.
  • Design and develop tools to support 24x7 follow-the-sun operations for critical production systems.
  • Automate deployment tasks for core products and infrastructure, maintaining a robust automation framework.
  • Monitor and optimize the performance of applications on the Guidewire Cloud Platform, ensuring reliability and efficiency.
  • Develop and maintain observability tools, metrics, and dashboards, including self-healing mechanisms for increased reliability.
  • Foster a culture of reliability by promoting blameless postmortems, SLO tracking, and continuous learning from incidents.
  • Proactively identify and address infrastructure issues to minimize business impact.
  • Develop system documentation and training materials to empower and educate team members.

Who You Are

  • Skilled in programming with Python or Go for building internal tools, CLIs, and APIs; familiarity with Java and Spring Boot is a plus.
  • Exceptional troubleshooting skills, with a proactive, critical approach to solving complex issues.
  • Proficient in containerization technologies, with hands-on expertise in Docker, Helm, Kubernetes (EKS), CNI, and Ingress networking.
  • Strong knowledge of Kubernetes concepts (Pods, Deployments, Services, StatefulSets, Ingress, etc.) and the Operator pattern.
  • Experienced with Terraform, including developing and testing complex modules.
  • Advanced experience with AWS, including custom tool development using the AWS SDK.
  • Solid understanding of Single Sign-On (SSO), SAML, and OAuth protocols; experience with Okta is a bonus.
  • Skilled in using observability tools such as Prometheus, OpenTelemetry, or Datadog for proactive monitoring.
  • Background in supporting production systems at scale in a heavily microservice-based environment.
  • Familiar with agile methodologies, including Scrum and Kanban, to enhance software development processes.
  • Excellent communication skills, with the ability to explain complex technical concepts to diverse audiences.

Other Requirements

  • Bachelor's degree in Computer Science or a related field.
  • Ability to read, write, and speak English.
  • We provide 24x7 support to our customers, so you will take turns with your teammates being on call for weekend production emergencies or providing rotating weekend operational support.
  • Travel: expect occasional travel (less than 5%) to other Guidewire offices for training and team meetings.

Bonus Points

  • Kubernetes or AWS certifications
  • Contributions to open source projects
  • Familiarity with KubeVela (OAM) or Crossplane for Kubernetes-native infrastructure management
  • Experience managing large-scale Aurora PostgreSQL clusters and Aurora Serverless
  • Experience with TeamCity CI or GitHub Actions

Guidewire Software

Insurance Technology

Walnut Creek
