Hybrid
Full Time
Primary Responsibilities:
- Define and set up industry best practices for alerting and monitoring across the line of business, and design/architect efficient monitoring dashboards on Splunk/Dynatrace/Grafana that are common to all applications/products across the line of business
- Participate in the 5-9 program and other peak season readiness initiatives, and collaborate with application teams to evaluate applications from a resiliency, availability, and reliability perspective
- Act as a gatekeeper for changes rolling into production
- Embrace continuous learning of engineering practices to ensure adoption of industry best practices and technology, including DevOps, Cloud, and Agile thinking
- Drive tech debt reduction and tech transformation, including open source/inner source adoption, Cloud adoption, and HCP assessment and adoption
- Improve processes/runbooks and lead automation of manual support tasks to cut down manual toil
- Participate in the on-call rotation
- Improve operational tooling and frameworks, and perform chaos engineering activities
- Respond to platform emergencies, alerts, and escalations from Customer Support
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary, or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so

Qualifications - External

Required Qualifications:
- Bachelor's degree in a related field
- 10+ years of experience in the IT industry across the entire SDLC
- 5+ years of experience integrating monitoring and alerting into cloud software solutions
- 3+ years of coding experience with one or more of the following languages: Java, C#, C/C++, Go, Python, Perl, PowerShell, or JavaScript, with a willingness and ability to learn new ones
- 3+ years of experience with Splunk, Dynatrace, DataDog, Grafana, Telemetry, or similar monitoring tools
- 2+ years of experience building and programmatically consuming REST APIs
- Work experience as a Site Reliability Engineer or in a similar role
- Experience with programmatic interaction with a relational database (SQL Server/MySQL/PostgreSQL)
- Experience planning and supporting 99.999% availability for critical applications in production
- Experience with any database
- Experience in operations support for any application
- ServiceNow experience
- Solid understanding of engineering fundamentals: unit testing, performance testing, code reviews, telemetry, Agile, and DevOps
- Solid understanding of continuous integration/continuous delivery tools, serverless architecture, containerization, public/private cloud, application observability, and/or messaging/stream architecture
- Technical writing skills (creating flow diagrams, end user documentation, etc.)
- Knowledge of any scripting or programming language
- Proven ability to communicate effectively with both technical and non-technical, globally distributed audiences