T25 Senior Platform Reliability Engineer

5 - 7 years

0 Lacs

Posted: 1 day ago | Platform: Foundit


Work Mode

On-site

Job Type

Full Time

Job Description

At eBay, we're more than a global ecommerce leader: we're changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We're committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.

Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work every day. We're in this together, sustaining the future of our customers, our company, and our planet.

Join a team of passionate thinkers, innovators, and dreamers and help us connect people and build communities to create economic opportunity for all.

Team Overview

Join the Marketing Technologies Platform Team, which powers billions of communications sent to customers around the world every day. This team plays a pivotal role in delivering personalized and timely customer engagement experiences across eBay's global user base.

Role Overview

We are seeking a highly motivated and experienced Senior Platform Reliability Engineer (PRE) to join our growing team. In this critical role, you will be responsible for ensuring the reliability, scalability, and performance of our core platform and services. You will apply Site Reliability Engineering (SRE) principles to automate operations, improve system resilience, and drive a culture of continuous improvement across our engineering organization.

Responsibilities

  • Reliability & Performance: Design, implement, and maintain systems and processes to ensure the high availability, performance, and scalability of our production platform.
  • Automation: Develop and implement automation for infrastructure provisioning, deployment, monitoring, and incident response, reducing manual toil and improving operational efficiency.
  • Observability: Implement and enhance comprehensive monitoring, logging, and alerting solutions to provide deep insights into system health and performance.
  • Incident Management: Lead incident response efforts, conduct root cause analyses, and implement preventative measures to minimize future occurrences.
  • Capacity Planning: Collaborate with development teams to forecast resource needs and ensure the platform can handle anticipated growth and traffic spikes.
  • System Design & Architecture: Provide input on system architecture and design, advocating for reliability, scalability, and operational best practices from the outset.
  • Tooling & Infrastructure: Evaluate, select, and implement new tools and technologies to improve our platform's reliability, security, and operational capabilities.
  • Collaboration & Mentorship: Work closely with development, QA, and security teams to embed reliability practices throughout the software development lifecycle. Mentor junior engineers on SRE principles and best practices.
  • Documentation: Create and maintain clear, concise documentation of systems and processes, including troubleshooting guides.

Qualifications

  • Experience: 5+ years of experience in a DevOps, SRE, or similar role focused on platform reliability and operations.
  • Cloud Platforms: Strong hands-on experience with at least one major cloud provider (e.g., AWS, Azure, GCP).
  • Containerization & Orchestration: Expertise with Docker and Kubernetes for deploying and managing microservices.
  • Infrastructure as Code: Proficiency with IaC tools such as Terraform, CloudFormation, or Ansible.
  • Scripting & Programming: Strong scripting skills (e.g., Python, Bash) and experience with at least one general-purpose language or runtime (e.g., Go, Java, Node.js) for automation and tool development.
  • Monitoring & Alerting: Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog, New Relic) and logging systems (e.g., ELK Stack, Splunk).
  • CI/CD: Solid understanding of, and experience with, CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions).
  • AI Code Generation: Familiarity with foundational AI concepts and practical experience applying AI-powered code generation tools (e.g., OpenAI Codex, GitHub Copilot, Anthropic Claude, Cursor, Windsurf), or an understanding of transformer-based code generation, will be a significant asset.
  • Networking: Fundamental understanding of networking concepts (TCP/IP, DNS, Load Balancing, Firewalls).
  • Databases: Familiarity with database operations, performance tuning, and backup/recovery strategies (SQL and NoSQL).
  • Problem-Solving: Exceptional analytical and troubleshooting skills, with a methodical approach to identifying and resolving complex system issues.
  • Communication: Excellent verbal and written communication skills, capable of effectively communicating technical concepts to diverse audiences.
  • Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Please see the Talent Privacy Notice for information regarding how eBay handles your personal data collected when you use the eBay Careers website or apply for a job with eBay.
