Lead Platform Engineer, Site Reliability Engineering

5 - 10 years

22 - 27 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

About the Role:
Mastercard Site Reliability Engineering teams strive to ensure our customers have an experience that just works, every time, by ensuring all aspects of our infrastructure and technology ecosystem are maintained to the highest standards and compliant with stringent security requirements. SRE within Mastercard's Availability Zone Operations is focused on ensuring that the core infrastructure, networks and supporting services upon which our applications depend operate with excellence and enable the applications deployed within them to deliver a superb customer experience.

In this role, the selected candidate will join our Availability Zone Operations group, with responsibility for continuously assessing and enhancing the service quality of hypervisor and container orchestration infrastructure in order to advise key stakeholders on the most effective utilization of available resources, resiliency of deployment topologies, capacity forecasting and performance patterns.

Key Responsibilities:
Lead continuous assessment processes over an extensive suite of VMware hypervisors and Pivotal Cloud Foundry clusters servicing critical Mastercard applications for health, performance and capacity, and liaise with product/development teams to forecast growth requirements.
Seek out inconsistent or illogical system configurations relative to their intended business purpose and contribute to broader automation efforts to ensure robust and sustainable configuration management of the estate.
Conduct regular reviews of incident events with Business Operations teams to ensure root causes are always identified and, where patterns of failure or compatibility issues between software and infrastructure are detected, formulate strategies to remediate or mitigate them.
Assess vendor software roadmaps to advise on potential risks or opportunities for enhancement arising from upcoming releases.
Drive observability as a core principle for infrastructure services, assessing environments and technologies to identify gaps in monitoring/alerting practices and propose strategies to close them.
Develop testing and validation plans for new environment builds, disaster recovery exercises and post-maintenance activities to certify environment readiness before customer traffic is routed to it.

All about you:
5-10 years of experience as a platform engineer responsible for virtualization and serverless technologies and practices, VMware/Cloud Foundry preferred.
Intermediate knowledge of core operating system dependencies and functions: LDAP, DNS, NTP, SELinux, iptables.
Intermediate knowledge of TCP/IP Local Area Network configuration and troubleshooting practices for IPv4 and IPv6 networks.
Experience with automation technologies such as Chef, Ansible and Terraform is highly regarded.
Experience with Application Performance Monitoring tools (Dynatrace, Datadog, etc.) and familiarity with the OpenTelemetry framework are highly regarded.
Familiarity with container orchestration systems (Docker/Kubernetes) is also desirable.
Excellent communication skills and experience coordinating complex troubleshooting efforts or product research with vendors.

Mastercard

IT Services and IT Consulting

Purchase NY
