Posted: 22 hours ago
On-site
Full Time
Role Overview:
The Support Analyst will play a pivotal role in ensuring the reliability, availability, and performance of our microservices-based systems. This role requires a deep understanding of Site Reliability Engineering (SRE) principles, hands-on experience with cloud platforms (preferably AWS), automation tooling, and monitoring frameworks. The Support Analyst will also lead a team of SREs (Support Reliability Engineers) from partner organizations and work in close collaboration with Development, DevOps, and other cross-functional teams to ensure a resilient and scalable platform.

Key Responsibilities:

Reliability & Performance
Lead initiatives to enhance the reliability, availability, and scalability of microservices-based systems.
Drive continuous improvements through performance analysis and architectural refinement.

Incident Management
Design and enforce incident management protocols to reduce service disruptions.
Coordinate with relevant teams during major incidents for rapid resolution and root cause analysis.

Monitoring & Alerting
Develop and implement comprehensive monitoring and alerting solutions using tools like AppDynamics, ELK, and AWS CloudWatch.
Ensure proactive detection and resolution of system anomalies.

Automation
Identify opportunities for automation in operational workflows, deployment, and system recovery.
Develop scripts and tools to automate repetitive tasks.

Capacity Planning & Optimization
Collaborate on capacity planning based on system usage patterns and projected growth.
Identify system bottlenecks and implement performance tuning solutions.

Team Leadership & Collaboration
Lead and mentor a team of partner SREs and infrastructure engineers, and collaborate closely with development teams.
Foster a culture of ownership, continuous learning, and operational excellence.
Technical Skills:
Architecture: Strong knowledge of microservices-based architectures.
Cloud: Proficient in the AWS ecosystem.
Monitoring: Expertise in tools such as AppDynamics, the ELK stack, and CloudWatch.
Automation: Scripting (Python, Shell, etc.) and CI/CD tools.
DevOps Collaboration: Familiarity with Infrastructure-as-Code, containerization (Docker/Kubernetes), and GitOps.

Behavioural Competencies:
Leadership Ability: Capable of guiding cross-functional teams toward operational excellence.
Problem Solving: Analytical mindset for diagnosing and resolving complex system issues.
Customer Empathy: Keen understanding of customer impact and user experience.
Communication: Ability to communicate effectively across technical and non-technical teams.

Indicative Activities:
Lead reliability improvement projects across the microservices landscape.
Establish and govern incident response processes.
Create and maintain dashboards, alerts, and metrics.
Perform root cause analysis and implement long-term fixes.
Guide partner teams in best practices and tools for SRE.
Engage in post-incident reviews and drive blameless retrospectives.
Maruti Suzuki
1.0 - 5.0 Lacs P.A.
Gurgaon, Haryana, India