Site Reliability Engineer (SRE)

8 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a seasoned Site Reliability Engineer (SRE) with a solid background in payment systems and high-availability architectures. The ideal candidate will have hands-on experience managing large-scale, distributed systems in production, with a deep understanding of reliability, scalability, and performance tuning in the financial services or payments industry.

Key Responsibilities

  • Design, build, and maintain scalable, resilient, and secure infrastructure for high-volume payment platforms.
  • Ensure system uptime, reliability, and performance through effective monitoring, alerting, and incident response strategies.
  • Collaborate with software engineering and DevOps teams to implement CI/CD pipelines and improve deployment efficiency.
  • Automate infrastructure management tasks using Infrastructure-as-Code (IaC) tools (Terraform, Ansible, etc.).
  • Proactively identify and mitigate system bottlenecks and potential points of failure.
  • Manage disaster recovery strategies, failover planning, and performance testing for critical payment services.
  • Work with development teams to ensure services are designed for reliability, scalability, and observability from the ground up.
  • Participate in root cause analysis and post-incident reviews to prevent future outages.

Required Skills & Experience

  • 8+ years of overall experience in infrastructure engineering or SRE roles, with at least 3 years in the payments/fintech domain.
  • Strong understanding of payment protocols (UPI, IMPS, RTGS, NEFT, SWIFT, etc.) and transaction processing systems.
  • Proven expertise in Linux systems administration, cloud platforms (AWS, GCP, or Azure), and container orchestration (Kubernetes).
  • Solid experience with monitoring/logging tools like Prometheus, Grafana, ELK Stack, Splunk, etc.
  • Proficiency in one or more scripting languages (Python, Shell, Go, etc.) for automation.
  • Experience with incident management, SLAs, and system troubleshooting in high-pressure environments.
  • Familiarity with security and compliance practices in the financial sector (e.g., PCI-DSS, ISO 27001).

Preferred Qualifications

  • Previous experience supporting mission-critical applications in banking or financial services.
  • Exposure to Kafka, Redis, or other real-time streaming and caching technologies.
  • Experience with Site Reliability Engineering principles and implementing SLOs/SLIs.
  • Understanding of the error budget concept and how it ties into availability and release decisions.
  • Experience with a performance testing tool such as k6, JMeter, or LoadRunner.
  • Familiarity with mocking tools like Mockito, WireMock, Microcks.

Impronics Technologies

IT Services and IT Consulting

Sunnyvale California
