Senior Site Reliability Engineer

8 years

0 Lacs

Posted: 14 hours ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Site Reliability Engineer (SRE)

Key Responsibilities:

  • Design, build, and maintain scalable, resilient, and secure infrastructure for high-volume payment platforms.
  • Ensure system uptime, reliability, and performance through effective monitoring, alerting, and incident response strategies.
  • Collaborate with software engineering and DevOps teams to implement CI/CD pipelines and improve deployment efficiency.
  • Automate infrastructure management tasks using Infrastructure-as-Code (IaC) tools (Terraform, Ansible, etc.).
  • Proactively identify and mitigate system bottlenecks, failures, and potential points of failure.
  • Manage disaster recovery strategies, failover planning, and performance testing for critical payment services.
  • Work with development teams to ensure services are designed for reliability, scalability, and observability from the ground up.
  • Participate in root cause analysis and post-incident reviews to prevent future outages.

Required Skills & Experience:

  • 8+ years of overall experience in infrastructure engineering or SRE roles, with at least 3 years in the payments/fintech domain.
  • Strong understanding of payment protocols (UPI, IMPS, RTGS, NEFT, SWIFT, etc.) and transaction processing systems.
  • Proven expertise in Linux systems administration, cloud platforms (AWS, GCP, or Azure), and container orchestration (Kubernetes).
  • Solid experience with monitoring/logging tools like Prometheus, Grafana, ELK Stack, Splunk, etc.
  • Proficiency in one or more scripting languages (Python, Shell, Go, etc.) for automation.
  • Experience with incident management, SLAs, and system troubleshooting in high-pressure environments.
  • Familiarity with security and compliance practices in the financial sector (e.g., PCI-DSS, ISO 27001).

Preferred Qualifications:

  • Previous experience supporting mission-critical applications in banking or financial services.
  • Exposure to Kafka, Redis, or other real-time streaming and caching technologies.
  • Experience with Site Reliability Engineering principles and implementing SLOs/SLIs.
  • Understanding of the error budget concept and how it ties into availability and release decisions.
  • Experience with any performance testing tool, such as k6, JMeter, or LoadRunner.
  • Familiarity with mocking tools like Mockito, WireMock, Microcks.


Impronics Technologies

IT Services and IT Consulting

Sunnyvale, California
