Site Reliability Engineer - Database

PhonePe

4 - 8 years

0 Lacs

Karnataka

Posted: 1 month ago | Platform: Shine

Skills Required

MySQL database administration, Bash, Python, NoSQL, communication skills, Linux systems administration, InnoDB storage engine, Galera clusters, Infrastructure-as-Code tools, observability tools

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview:

PhonePe Limited is looking for a skilled Site Reliability Engineer - Database with 4 to 8 years of experience to join its team. As a Site Reliability Engineer, you will be responsible for the design, provisioning, and lifecycle management of large-scale MySQL/Galera multi-master clusters across multiple geographic locations. You will ensure the resilience, scalability, and performance of a distributed, high-volume database infrastructure while driving strategic improvements to it.

Key Responsibilities:

- Lead the design, provisioning, and lifecycle management of large-scale MySQL/Galera multi-master clusters across multiple geographic locations.
- Develop and implement database reliability strategies, including automated failure recovery and disaster recovery solutions.
- Investigate and resolve database-related issues, including performance problems, connectivity issues, and data corruption.
- Own and continuously improve performance tuning (query optimization, indexing, and resource management), security hardening, and high availability of database systems.
- Standardize and automate database operational tasks such as upgrades, backups, schema changes, and replication management.
- Drive capacity planning, monitoring, and incident response across the infrastructure.
- Proactively identify, diagnose, and resolve complex production issues in collaboration with the engineering team.
- Participate in and enhance on-call rotations, implementing tools to reduce alert fatigue and human error.
- Develop and maintain observability tooling for database systems.
- Mentor and guide junior SREs and DBAs, fostering knowledge sharing and skill development within the team.

Qualifications Required:

- Expertise in Linux systems administration, scripting (Bash/Python), file systems, disk management, and debugging system-level performance issues.
- 4+ years of hands-on experience in MySQL database administration in large-scale, high-availability environments.
- Deep understanding of MySQL internals, the InnoDB storage engine, replication mechanisms (async, semi-sync, Galera), and tuning parameters.
- Proven experience managing 100+ production clusters and databases larger than 1 TB.
- Hands-on experience with Galera clusters is a strong plus.
- Familiarity with Infrastructure-as-Code tools such as Ansible, Terraform, or similar.
- Experience with observability tools such as Prometheus, Grafana, or Percona Monitoring & Management.
- Exposure to NoSQL databases (e.g., Aerospike) is a plus.
- Experience working in on-premise environments is highly desirable.
