Site Reliability Engineer

2 years

8 - 16 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

The Software Engineer – SRE will be responsible for building and maintaining highly reliable, scalable, and secure infrastructure that powers the Albert platform. This role focuses on automation, observability, and operational excellence to ensure seamless deployment, performance, and reliability of core platform services.

Key Responsibilities

  • Act as a passionate representative of the Albert product and brand.
  • Collaborate with Product Engineering and other stakeholders to plan and deliver core platform capabilities that enable scalability, reliability, and developer productivity.
  • Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.
  • Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of all microservices.
  • Design and deliver the mission-critical stack, focusing on security, resiliency, scale, and performance.
  • Take ownership of end-to-end performance and operability.
  • Apply strong knowledge of automation and orchestration principles.
  • Serve as the ultimate escalation point for complex or critical issues not yet documented as Standard Operating Procedures (SOPs).
  • Troubleshoot and define mitigations using a deep understanding of service topology and dependencies.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • 2+ years of software engineering experience, with at least 1 year in an SRE role focused on automation.
  • Strong experience in Infrastructure as Code (IaC), preferably using Terraform.
  • Proficiency in Python or Node.js, with experience designing RESTful APIs and working in microservices architecture.
  • Solid expertise in AWS cloud infrastructure and platform technologies including APIs, distributed systems, and microservices.
  • Hands-on experience with observability stacks, including centralized log management, metrics, and tracing.
  • Familiarity with CI/CD tools (e.g., CircleCI) and performance testing tools such as k6.
  • Passion for bringing automation and standardization to engineering operations.
  • Ability to build high-performance APIs with low latency (
  • Ability to work in a fast-paced environment, learning from peers and leaders.
  • Demonstrated ability to mentor other engineers and contribute to team growth, including participation in recruiting activities.

Good to Have

  • Experience with Kubernetes and container orchestration.
  • Familiarity with observability tools such as Prometheus, Grafana, OpenTelemetry, or Datadog.
  • Experience building Internal Developer Platforms (IDPs) or reusable frameworks for engineering teams.
  • Exposure to ML infrastructure or data engineering workflows.
  • Experience working in compliance-heavy environments (e.g., SOC2, HIPAA).

Skills: Automation, Terraform, Python, Node.js, and Amazon Web Services (AWS)
