4 - 8 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview:

As a Software Engineer II on the Site Reliability Engineering (SRE) team at Electronic Arts, you will contribute to the design, automation, and operation of large-scale, cloud-based systems that power EA's global gaming platform. You will work closely with senior engineers to enhance service reliability, scalability, and performance across multiple game studios and services.

Key Responsibilities:

- Build and Operate Scalable Systems: Support the development, deployment, and maintenance of distributed, cloud-based infrastructure leveraging modern open-source technologies (AWS/GCP/Azure, Kubernetes, Terraform, Docker, etc.).
- Platform Operations and Automation: Develop automation scripts, tools, and workflows to reduce manual effort, improve system reliability, and optimize infrastructure operations (reducing MTTD and MTTR).
- Monitoring, Alerting & Incident Response: Create and maintain dashboards, alerts, and metrics to improve system visibility and proactively identify issues. Participate in on-call rotations and assist in incident response and root cause analysis.
- Continuous Integration / Continuous Deployment (CI/CD): Contribute to the design, implementation, and maintenance of CI/CD pipelines to ensure consistent, repeatable, and reliable deployments.
- Reliability and Performance Engineering: Collaborate with cross-functional teams to identify reliability bottlenecks, define SLIs/SLOs/SLAs, and implement improvements that enhance the stability and performance of production services.
- Post-Incident Reviews & Documentation: Participate in root cause analyses, document learnings, and contribute to preventive measures to avoid recurrence of production issues. Maintain detailed operational documentation and runbooks.
- Collaboration & Mentorship: Work closely with senior SREs and software engineers to gain exposure to large-scale systems, adopt best practices, and gradually take ownership of more complex systems and initiatives.
- Modernization & Continuous Improvement: Contribute to ongoing modernization efforts by identifying areas for improvement in automation, monitoring, and reliability.

Qualifications:

- Software Engineer II (Site Reliability Engineer) with 3-5 years of experience in Cloud Computing (AWS preferred), Virtualization, and Containerization using Kubernetes, Docker, or VMware.
- Extensive hands-on experience in container orchestration technologies, such as EKS, Kubernetes, Docker.
- Experience supporting production-grade, high-availability systems with defined SLIs/SLOs.
- Strong Linux/Unix administration and networking fundamentals (protocols, load balancing, DNS, firewalls).
- Hands-on experience with Infrastructure as Code and automation tools such as Terraform, Helm, Ansible, or Chef.
- Proficiency in Python, Golang, Bash, or Java for scripting and automation.
- Familiarity with monitoring and observability tools like Prometheus, Grafana, Loki, or Datadog.
- Exposure to distributed systems, SQL/NoSQL databases, and CI/CD pipelines.
- Strong problem-solving, troubleshooting, and collaboration skills in cross-functional environments.
