Observability Admin

4 - 6 years

2 - 5 Lacs

Posted: 17 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

SN | Required Information | Details
1 | Role | Observability Admin
2 | Required Technical Skill Set (Skill Name) | Observability Admin
3 | No. of Requirements | 2
4 | Desired Experience Range | 4-6 Years
5 | Location of Requirement | Pune / Indore / Kochi
6 | Keywords | Observability Admin
7 | Technical SME details for skill clarification | Emp ID: 686536; Email:
Desired Competencies (Technical/Behavioral Competency)
Must-Have
Minimum 5 mandatory details, each described in two to three lines
  1. The Observability SME should have a strong background in distributed systems and cloud technologies, and proficiency in tools like Dynatrace, Prometheus, Grafana, ELK stack, or similar.
  2. In-depth knowledge of distributed systems, microservices architecture, and cloud platforms.
  3. Exceptional communication skills for conveying complex technical concepts to both technical and non-technical stakeholders. Overall, the role is pivotal in enhancing the reliability, scalability, and performance of systems through advanced observability solutions.
  4. Expertise in scripting and programming languages (e.g., Python, Go, Java).
  5. Experience with containerization and orchestration technologies (Docker, Kubernetes) is a plus.
  6. Proficiency in monitoring tools, incident management, and other relevant technologies.
  7. Strong communication skills to collaborate effectively with diverse teams and convey technical information to non-technical stakeholders.
  8. A problem-solving mindset with the ability to make sound decisions under pressure.
Good-to-Have
Minimum 2 mandatory details, each described in two to three lines
  1. Training and Development: Foster a culture of continuous learning within the team. Provide mentorship and training to team members to enhance their technical skills and knowledge.
SN | Role descriptions / Expectations from the Role
1 | Collaboration with all application tech families: Collaborate with cross-functional teams to understand their value chain and to evaluate and adopt emerging technologies and best practices to enhance system observability.
2 | Monitoring and Metrics: Define and implement comprehensive monitoring and alerting strategies for complex distributed systems. Work closely with tools teams and application teams to establish and enforce best practices for logging, tracing, and monitoring.
3 | Tooling and Technology: Assess, select, and implement observability tools and technologies that align with the organization's goals and requirements. Stay abreast of industry trends and advancements in observability to ensure our systems are leveraging the latest innovations.
4 | Performance Optimization: Identify and address performance bottlenecks and inefficiencies in collaboration with development and operations teams. Conduct regular performance reviews and implement optimizations to enhance system reliability and responsiveness.
5 | Incident Response and Troubleshooting: Collaborate with incident response teams to quickly diagnose and resolve production issues related to observability. Develop and maintain incident response playbooks to streamline troubleshooting processes.
Type | Details of the Role (For Candidate Briefing)
Reporting To Which Role | Azure DevOps Client Solution Architect
Size of the Team, if any Reporting to this Role | 6
On-site Opportunity | Yes, in future
Unique Selling Proposition (USP) of The Role | The client works on niche technologies, offering scope to grow
Details of The Project (a short briefing on the project; it may be shared with external stakeholders such as job agencies) | The customer needs an admin and expert to design a DevOps strategy, recommend a migration and consolidation strategy for DevOps tools, design and implement an Agile work management approach, manage code quality and security policies, implement a build strategy, and implement continuous delivery.
Technical Questions & Answers:
  1. Observability Strategy & Implementation
Can you explain the key pillars of observability and how you would design an end-to-end observability solution for a distributed microservices-based system?
Expected Response: Understanding of logs, metrics, and traces, and their integration using tools like Dynatrace, Prometheus, Grafana, ELK stack, OpenTelemetry, etc.
  2. Monitoring & Performance Optimization
How would you set up proactive monitoring and alerting for a cloud-native application running on Kubernetes?
Expected Response: Experience in Kubernetes monitoring with Prometheus, Grafana dashboards, Dynatrace auto-discovery, log aggregation using ELK, and alerting strategies with thresholds and anomaly detection.
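As an illustration of the two alerting strategies named in the expected response, the following is a minimal sketch in plain Python, not tied to Prometheus, Dynatrace, or any other tool. The function names, the trailing-window size, and the z-score limit are assumptions chosen for the example, not part of the role description.

```python
from statistics import mean, stdev


def threshold_alert(value, limit):
    """Static threshold: fire when a single reading exceeds a fixed limit."""
    return value > limit


def zscore_anomalies(samples, window=5, z_limit=3.0):
    """Simple anomaly detection: flag readings that deviate from the
    trailing-window mean by more than z_limit standard deviations.

    Returns the indices of the flagged readings. The window size and
    z_limit defaults are illustrative assumptions.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > z_limit:
            anomalies.append(i)
    return anomalies


# Hypothetical latency readings in milliseconds, with one spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 400, 101]
spike_indices = zscore_anomalies(latencies_ms)
```

A static threshold is easy to reason about but needs tuning per service; the trailing-window check adapts to each service's baseline at the cost of being blind during the first `window` readings.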
  3. Troubleshooting & Incident Management
A critical microservice is experiencing intermittent high latency. How would you identify and resolve the issue using observability tools?
Expected Response: Ability to analyze traces, correlate logs and metrics, use AIOps for root cause analysis, and leverage distributed tracing with OpenTelemetry or Dynatrace.
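The correlation step in the expected response can be sketched as follows. This is a simplified illustration in plain Python rather than the OpenTelemetry or Dynatrace API: the span and log dictionaries, their field names (`trace_id`, `service`, `duration_ms`, `parent`, `message`), and the latency budget are all hypothetical.

```python
from collections import defaultdict


def find_suspects(spans, logs, budget_ms):
    """For every trace whose root span exceeds budget_ms, report the slowest
    child span together with the log messages that share its trace ID."""
    spans_by_trace = defaultdict(list)
    for span in spans:
        spans_by_trace[span["trace_id"]].append(span)

    logs_by_trace = defaultdict(list)
    for entry in logs:
        logs_by_trace[entry["trace_id"]].append(entry["message"])

    suspects = {}
    for trace_id, trace_spans in spans_by_trace.items():
        # The root span (no parent) carries the end-to-end latency.
        root = next((s for s in trace_spans if s["parent"] is None), None)
        if root is None or root["duration_ms"] <= budget_ms:
            continue
        children = [s for s in trace_spans if s["parent"] is not None]
        if not children:
            continue
        slowest = max(children, key=lambda s: s["duration_ms"])
        suspects[trace_id] = {
            "service": slowest["service"],
            "duration_ms": slowest["duration_ms"],
            "logs": logs_by_trace.get(trace_id, []),
        }
    return suspects


# Hypothetical sample data: one slow trace and one healthy trace.
spans = [
    {"trace_id": "t1", "service": "gateway", "duration_ms": 900, "parent": None},
    {"trace_id": "t1", "service": "orders", "duration_ms": 120, "parent": "gateway"},
    {"trace_id": "t1", "service": "payments", "duration_ms": 750, "parent": "gateway"},
    {"trace_id": "t2", "service": "gateway", "duration_ms": 80, "parent": None},
    {"trace_id": "t2", "service": "orders", "duration_ms": 60, "parent": "gateway"},
]
logs = [
    {"trace_id": "t1", "message": "payments: connection pool exhausted"},
    {"trace_id": "t2", "message": "orders: ok"},
]
suspects = find_suspects(spans, logs, budget_ms=500)
```

Joining on a shared trace ID is what lets logs and traces be read together; in practice the tracing backend does this join, and the sketch only shows the idea.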
  4. Scripting & Automation for Observability
How would you automate the deployment and configuration of observability tools across multiple environments?
Expected Response: Proficiency in Python, Go, or Java for writing custom monitoring scripts, automating observability setups with Terraform/Ansible, and using APIs for data ingestion and analysis.
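One common pattern behind this kind of automation is to keep a single base configuration and apply small per-environment overrides. The sketch below shows that pattern in plain Python; the setting names (`scrape_interval_s`, `latency_p95_ms`, `error_rate_pct`) and environment names are hypothetical and do not correspond to the schema of any specific tool, Terraform module, or Ansible role.

```python
from copy import deepcopy

# Hypothetical base settings shared by every environment.
BASE_CONFIG = {
    "scrape_interval_s": 30,
    "alerts": {"latency_p95_ms": 500, "error_rate_pct": 1.0},
}

# Hypothetical per-environment overrides; only the differences are listed.
ENV_OVERRIDES = {
    "dev": {"scrape_interval_s": 60, "alerts": {"latency_p95_ms": 1000}},
    "prod": {"alerts": {"error_rate_pct": 0.5}},
}


def deep_merge(base, override):
    """Return a new dict with override applied on top of base,
    merging nested dicts instead of replacing them wholesale."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def render_configs(base, overrides):
    """Build one complete configuration per environment."""
    return {env: deep_merge(base, ov) for env, ov in overrides.items()}


configs = render_configs(BASE_CONFIG, ENV_OVERRIDES)
```

The rendered dictionaries could then be serialized and handed to whatever deployment tooling is in use; keeping overrides small makes drift between environments easy to review.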
  5. Communication & Stakeholder Engagement
How would you explain an observability-driven performance improvement strategy to a non-technical business stakeholder?
Expected Response: Ability to translate technical insights into business impact, focusing on uptime, cost savings, performance optimization, and risk reduction, using clear visual dashboards and reports.

Han Digital Solution

Information Technology

Metro City
