Job Description
About the Role:
Job Title: Site Reliability Engineer, AVP
Location: Bangalore, India
Corporate Title: AVP
Role Description
Technology/Service is responsible for delivering the business vision and strategy at a global level, focusing on achieving consistent operational excellence and client/user satisfaction through industrialisation, price/value optionality, increased automation, and the use of technology. Work includes:
- Creating a digital vision and strategy for the bank, and ensuring its integration with the organization's overall strategic plans
- Identifying opportunities for differentiating the bank's digital portfolio, including capabilities and solutions
- Acting as a change agent in leading the organizational changes that are required to create and maintain the necessary digital portfolio
- Applying extensive knowledge and understanding of the evolving digital market, and acting as a thought leader on emerging digital trends related to technology and business
What we'll offer you
- 100% reimbursement under childcare assistance benefit (gender neutral)
- Sponsorship for industry-relevant certifications and education
- Accident and term life insurance
Your key responsibilities
As a Senior Site Reliability Engineer, you:
- Orchestrate and contribute to SRE activities across API Platforms and Integration services
- Introduce engineering disciplines that combine software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems
- Implement the core of DevOps with specific principles and practices, focusing on what to improve and how to improve reliability
- Establish and support capacity planning procedures and keep a close eye on SLIs and SLOs, both for production readiness and in the live environment
- Coordinate with the rest of the division and the teams working on different layers of the application and infrastructure, with a full commitment to collaborative problem solving

For Infrastructure & Service Management, you:
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
- Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity
- Develop and enforce policies, standards, and guidelines for site reliability
- Automate application and infrastructure deployment activities to production environments

For Incident & Problem Management, you:
- Perform troubleshooting and emergency response
- Investigate root causes and suggest solutions
- Increase productivity by leading blameless post-mortems

For Application Maintenance, you:
- Work collaboratively with Product Owners and Engineers to run reliable services
- Configure and maintain applications and monitoring
- Identify business objects for monitoring
- Track system performance and capacity, and use your experience to create effective strategies for maintaining and improving system performance and availability

For Operational Continuous Improvement, you:
- Identify issues and optimization potential and introduce related user stories
- Contribute automation know-how to reduce the risk of bad changes
- Identify, design, develop, and deploy tools and processes to monitor, maintain, and report site performance and availability

For Service Onboarding, you:
- Support your Squad and your Chapter population in onboarding and promotions
Your skills and experiences
- Expert hands-on experience with on-premises
- Expert hands-on experience with cloud ecosystems run on Google Cloud
- Expert hands-on experience with Docker / Kubernetes operations with GKE or similar technology
- Expert experience with automated infrastructure provisioning based on Terraform/Terragrunt, Terraform Enterprise, Ansible
- Advanced hands-on experience with Continuous Integration / Continuous Deployment (GitHub) and patterns for CI/CD pipelines
- Advanced hands-on experience with monitoring tools like Prometheus, Grafana, Kibana and alerting tools like OpsGenie, New Relic, Datadog, Splunk, Google Operations Suite (Stackdriver)
- Very good knowledge of security capabilities (TLS, OAuth2, KMS, Vault, Admission Controllers, Let's Encrypt or similar technologies)
- Very good understanding of microservice architectures and experience with API management with Apigee or WSO2
- Experience in software development in at least one language (Java, JavaScript, Python, Go)
- Good knowledge of Software Development Life Cycle processes based on related tools such as:
  - TeamCity, Bitbucket, Artifactory
  - SonarQube, Veracode, Crucible
  - JIRA, Confluence, ServiceNow
How we'll support you
About us and our teams
Please visit our company website for further information: https://www.db.com/company/company.htm
We at DWS are committed to creating a diverse and inclusive workplace, one that embraces dialogue and diverse views, and treats everyone fairly to drive a high-performance culture. The value we create for our clients and investors is based on our ability to bring together various perspectives from all over the world and from different backgrounds. It is our experience that teams perform better and deliver improved outcomes when they are able to incorporate a wide range of perspectives. We call this #ConnectingTheDots.