Posted: 3 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Site Reliability Engineer III at JPMorgan Chase within Corporate Technology - Capital Management, you play a crucial role in shaping the future of a globally recognized organization. Your impact is direct and significant in a sphere tailored for high achievers in site reliability. You will tackle complex and wide-ranging business challenges with simple and effective solutions through code and cloud infrastructure. Your responsibilities include configuring, maintaining, monitoring, and optimizing applications and associated infrastructure. You will independently break down and enhance existing solutions iteratively, making you a key contributor to your team.

Your primary responsibilities involve driving continuous enhancement of reliability, monitoring, and alerting for mission-critical microservices. You will automate tasks to reduce manual effort, creating reliable infrastructure and tools to expedite feature development. You will develop and implement metrics for microservices, define user journeys, SLOs, and error budgets, configure dashboards and alerts, and run blameless post-mortems aimed at permanent incident closure. Collaboration with development teams throughout the software lifecycle is essential to enhance reliability and scale, design self-healing patterns, and implement infrastructure, configuration, and network as code. You will work closely with software engineers to design and implement deployment approaches using automated CI/CD pipelines and promote site reliability engineering best practices. Your role involves demonstrating and advocating for a site reliability culture and practices, and leading initiatives to improve application and platform reliability and stability through data-driven analytics. Critical aspects of your work include collaborating with team members to identify service level indicators, establishing reasonable service level objectives, and proactively resolving issues before they affect customers. Additionally, you will act as the main point of contact during major incidents, using your technical expertise to swiftly identify and resolve issues while sharing knowledge within the organization.

To excel in this role, you are required to have formal training or certification in site reliability concepts along with at least 5 years of applied experience in public cloud platforms like AWS, Azure, or GCP. Proficiency in a programming language such as Python, Go, or Java/Spring Boot is necessary, with expertise in software design, coding, testing, and delivery. Experience with Kubernetes, cloud computing, and relational databases like Oracle or MySQL is preferred. You should possess excellent debugging and troubleshooting skills and familiarity with common SRE toolchains like Grafana, Prometheus, ELK Stack, Kibana, and Jaeger. Proficiency in continuous integration and continuous delivery tools such as Jenkins, GitLab, or Terraform, and observability tools like Dynatrace, Datadog, New Relic, CloudWatch, or Splunk is also important.

Moreover, your skills should include familiarity with ETL tools like Databricks, experience with container and container orchestration technologies such as ECS, Kubernetes, and Docker, and deep proficiency in reliability, scalability, performance, security, enterprise system architecture, and toil reduction. You should be able to identify and solve problems related to complex data structures and algorithms, troubleshoot common networking technologies and issues, and be driven to self-educate and evaluate new technologies. Further expectations of this role include teaching new programming languages to team members, contributing to large and collaborative teams, recognizing roadblocks proactively, and showing interest in learning technology that drives innovation.

Hyderabad, Telangana, India