Posted: 2 weeks ago
On-site
Full Time
Role Description

Who we are:
At UST, we help the world’s best organizations grow and succeed through transformation. Bringing together the right talent, tools, and ideas, we work with our clients to co-create lasting change. Together, with over 30,000 employees in 25 countries, we build for boundless impact, touching billions of lives in the process. Visit us at .

You Are:
Lead II - Cloud Infrastructure Services

We are seeking a Principal App & Infra Monitoring Lead to join our Engineering team. She/He will focus on enabling monitoring strategies across multiple groups. A key part of the role is to champion and lead the enterprise monitoring center of excellence. The lead will leverage the full power of on-prem and cloud platforms to configure highly resilient and scalable applications that support zero downtime. She/He will interface with business colleagues, IT team members, and external resources to gather requirements and implement the next generation of cloud capabilities, including enabling SaaS offerings to customers.

We believe that hiring and developing the best talent will lead to a diversity of perspectives, ideas, and cultures to create better products and services as well as a welcoming and dynamic workplace. We look to our engineers to be versatile, display leadership qualities, and be enthusiastic about tackling new problems across the full stack as we continue pushing technology forward.

A successful candidate will have an extensive understanding of and passion for cloud technologies, with proven experience in the delivery and support of ‘SaaS’ applications. A self-starter attitude, excellent communication, and dedication to innovative technologies are critical to this role.

The Opportunity:
Principal App & Infra Monitoring Lead

Provision, configure, release, and maintain AWS cloud infrastructure as code using tools such as Terraform and CloudFormation.
Design and implement comprehensive monitoring frameworks for applications and infrastructure to ensure continuous performance and availability.
Lead and mentor a team of monitoring specialists, fostering a culture of proactive issue identification and resolution.
Analyze monitoring data to identify trends, bottlenecks, and potential system failures, implementing corrective actions as necessary.
Oversee the incident response process, ensuring timely resolution of issues and minimizing impact on business operations.
Work closely with application development, IT operations, and security teams to align monitoring practices with organizational goals and compliance requirements.
Define and deploy systems for metrics, logging, and monitoring on the AWS platform.
Develop and present regular reports on system performance, incidents, and improvements to stakeholders, providing insights for decision-making.
Use Atlassian tools such as Jira and Confluence to manage work priorities.
Apply experience with incident management, change management, and access management.
Stay updated with the latest monitoring tools and technologies, integrating them into existing systems to enhance monitoring capabilities.
Apply networking knowledge to debug upstream issues.
Actively and continually optimize AWS/cloud resources to ensure the lowest cost while ensuring security, scalability, and high availability.
Administer and troubleshoot Linux-based systems.
Evaluate new technology alternatives and vendor products.
Provide recommendations for architecture and process improvements.
Perform off-hours maintenance and support of the platform.

Qualifications:
7+ years in IT infrastructure, application support, or monitoring roles.
3+ years in a leadership or senior monitoring role, managing teams or large-scale monitoring initiatives.
Experience in incident response, troubleshooting, root cause analysis, and performance tuning.
Monitoring Tools Expertise: Experience with tools like Datadog, Splunk, Dynatrace, AppDynamics, New Relic, Nagios, Prometheus, Grafana, etc.
Infrastructure Monitoring: Familiarity with server, network, and cloud monitoring solutions across AWS, Azure, GCP, or on-premise data centers.
Application Performance Monitoring (APM): Understanding of application logs, metrics, tracing, and alerting mechanisms.
Scripting & Automation: Proficiency in Python, Shell, PowerShell, or Ansible for automating monitoring tasks.
Cloud & DevOps Knowledge: Experience working with Kubernetes, Docker, Terraform, CI/CD pipelines, and observability tools.
Database Monitoring: Knowledge of SQL and NoSQL databases and monitoring their performance.
Strong analytical and problem-solving skills for detecting and resolving performance issues.
Excellent communication and collaboration skills to work across teams (IT Ops, DevOps, Security, Application teams).
Ability to lead and mentor a team of monitoring engineers.
Experience in reporting and presenting monitoring insights to stakeholders and executives.
Solid foundation in networking and Linux administration.
Ability to learn and use various open-source technologies and tools.
Experience with Atlassian tooling such as Jira and Confluence preferred.
Other cloud experience in Azure or GCP would be a plus.
Certifications would be a plus: ITIL, DevOps, AWS, Azure, GCP, Agile, PMP.

What Are We Looking For:
6+ years of experience in monitoring.
Monitoring Tools Expertise: Experience with tools like Datadog, Splunk, Dynatrace, AppDynamics, New Relic, Nagios, Prometheus, Grafana, etc.
Application Performance Monitoring (APM): Understanding of application logs, metrics, tracing, and alerting mechanisms.
Infrastructure Monitoring.

What We Believe:
We’re proud to embrace the same values that have shaped UST since the beginning. Since day one, we’ve been building enduring relationships and a culture of integrity.
And today, it's those same values that are inspiring us to encourage innovation from everyone, to champion diversity and inclusion, and to place people at the centre of everything we do.

Humility: We will listen, learn, be empathetic and help selflessly in our interactions with everyone.
Humanity: Through business, we will better the lives of those less fortunate than ourselves.
Integrity: We honour our commitments and act with responsibility in all our relationships.

Equal Employment Opportunity Statement:
UST is an Equal Opportunity Employer. We believe that no one should be discriminated against because of their differences, such as age, disability, ethnicity, gender, gender identity and expression, religion, or sexual orientation. All employment decisions shall be made without regard to age, race, creed, colour, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law.

UST reserves the right to periodically redefine your roles and responsibilities based on the requirements of the organization and/or your performance.
To support and promote the values of UST.
Comply with all Company policies and procedures.

Skills: Monitoring, Framework, Cloud
UST
Bengaluru, Karnataka, India
Salary: Not disclosed