SRE (Application and Infrastructure Monitoring)

Han Digital Solution

4 - 6 years

12 - 16 Lacs

Hubli, Mangaluru, Mysuru, Bengaluru, Belgaum

Posted: 1 month ago

Skills Required

Unix, solution architecture, basic automation, event management, Windows, data visualization, SQL, Python, capacity planning

Work Mode

Work from Office

Job Type

Full Time

Job Description

Required Information / Details

Role: SRE (Application and Infrastructure Monitoring)
Required Technical Skill Set: Knowledge on Infrastructure
No. of Requirements:
Desired Experience Range: 4-6 years
Location of Requirement: Pune, Indore, Kochi

Desired Competencies (Technical/Behavioral Competency)

Must-Have

Detect incidents and proactively escalate them using the built-in dashboards.
- Hands-on experience using Dynatrace dashboards and creating customized dashboards.
- Hands-on experience using ServiceNow for analytics, reporting, knowledge management, CMDB, and ITOM modules (Event Management, Operator Workspace, etc.).
- Basic knowledge of other monitoring tools is an advantage (SolarWinds, Nimsoft, SCOM, Redgate, etc.).
- Basic understanding of application architecture and its infrastructure components, so that user impact can be understood.
Troubleshooting/communication skills:
- Hands-on experience troubleshooting basic issues in Windows and Unix.
- Able to write and understand basic command lines and scripts.
- Able to communicate effectively during incident and problem management calls.
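As a rough illustration of the "basic scripting" level referred to above, here is a minimal Python sketch of a disk-space check that runs on both Windows and Unix. The function names and the 80% warning threshold are assumptions made for this example and are not part of the role description.

```python
import shutil


def disk_usage_percent(path: str = ".") -> float:
    """Return the percentage of disk space used on the volume holding `path`."""
    usage = shutil.disk_usage(path)  # works on both Windows and Unix
    return 100.0 * usage.used / usage.total


def check_disk(path: str = ".", warn_at: float = 80.0) -> tuple[str, float]:
    """Return ("WARN" or "OK", used percentage); `warn_at` is an assumed threshold."""
    pct = disk_usage_percent(path)
    status = "WARN" if pct >= warn_at else "OK"
    return status, pct


if __name__ == "__main__":
    status, pct = check_disk(".")
    print(f"{status}: {pct:.1f}% used")
```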

Good-to-Have

Dynatrace admin skills
Knowledge of integration platforms such as MQ, APIC, etc.
ServiceNow Fundamentals, Admin knowledge, or Developer certification
Knowledge of data visualization tools (e.g. Power BI)
Knowledge of IaC (Ansible or Terraform)
Scripting: Python or PowerShell

Basic understanding of AutoSys batches and their components
Experience working with batch management (monitoring and incident resolution)
Basic knowledge of AWS/Azure to support access management tasks

Database

Understanding of a relational database (any one)
Able to write and understand basic queries using SQL
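
To give a sense of what a "basic query" might look like, the sketch below counts open incidents per severity using Python's built-in sqlite3 module. The `incidents` table, its columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3


def open_incidents_by_severity(conn: sqlite3.Connection) -> list[tuple[int, int]]:
    """Return (severity, count) pairs for incidents that are still open."""
    cur = conn.execute(
        "SELECT severity, COUNT(*) "
        "FROM incidents "
        "WHERE status = 'open' "
        "GROUP BY severity "
        "ORDER BY severity"
    )
    return cur.fetchall()


# Hypothetical in-memory table, used only to make the example runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents ("
    "id INTEGER PRIMARY KEY, service TEXT, severity INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO incidents (service, severity, status) VALUES (?, ?, ?)",
    [
        ("payments", 1, "open"),
        ("payments", 2, "closed"),
        ("login", 2, "open"),
        ("login", 2, "open"),
    ],
)
rows = open_incidents_by_severity(conn)
```

With the sample rows above, `rows` holds one open severity-1 incident and two open severity-2 incidents.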

ServiceNow Fundamentals, Admin knowledge, or certification

Understanding of DevOps practices and tools

Type

Details of The Role (For Candidate Briefing)

Reporting To Which Role

SRE Engineer

Size of the Team, if any Reporting to this Role

6-8 Years

On-site Opportunity

Unique Selling Proposition (USP) of The Role

Banking Domain

Details of the Project (a short briefing on the project may be attached to this document for candidate briefing). It may be shared with external stakeholders such as job agencies.

We are looking for a ServiceNow Developer to work closely with Product Owners, Solution Architects, and Analysts to refine epics and user stories, translate solution architecture into technical design and working software, and lead technical teams by setting high development standards and applying industry best practices.

Experience:

As mentioned above

Sample Questions:

Monitoring & Observability Strategy

How would you design a comprehensive monitoring strategy for a large-scale distributed application?
Expected Response: Knowledge of key monitoring pillars (metrics, logs, traces), selecting the right tools (Prometheus, Grafana, Datadog, New Relic, ELK, Splunk), and defining SLIs, SLOs, and SLAs.
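
A candidate might illustrate the SLI/SLO part of this answer with a small calculation. The sketch below computes an availability SLI and the remaining error budget from request counts; the function names and the example numbers are assumptions for illustration, not part of the posting.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def error_budget_remaining(good_requests: int, total_requests: int,
                           slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    allowed_bad = (1.0 - slo_target) * total_requests
    actual_bad = total_requests - good_requests
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


# Assumed numbers: 1,000,000 requests, 500 failures, 99.9% SLO target.
sli = availability_sli(999_500, 1_000_000)
budget = error_budget_remaining(999_500, 1_000_000, slo_target=0.999)
```

In this example the SLI is 99.95% and roughly half of the error budget remains, since 500 of the approximately 1,000 permitted failures have been used.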

Incident Response & Root Cause Analysis

An application is experiencing intermittent latency issues. How would you troubleshoot and identify the root cause?
Expected Response: Use of APM tools, log correlation, distributed tracing, dependency mapping, anomaly detection, and defining runbooks for incident response.
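
Intermittent latency tends to hide behind averages, so one common first step is to compare the median with a high percentile. The following is a minimal sketch using the nearest-rank percentile method; the sample data and the ratio threshold of 5 are assumptions chosen for the example.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of `samples` (pct in the range 0-100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def has_tail_latency(samples: list[float], ratio_threshold: float = 5.0) -> bool:
    """Flag cases where p99 is far above p50; the threshold is an assumed value."""
    p50 = percentile(samples, 50)
    p99 = percentile(samples, 99)
    return p50 > 0 and p99 / p50 >= ratio_threshold


# Assumed sample: 98 fast requests (100 ms) and 2 slow ones (2000 ms).
latencies_ms = [100.0] * 98 + [2000.0] * 2
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

Here the median stays at 100 ms while p99 jumps to 2000 ms, which is the kind of gap that would prompt a closer look at traces and dependencies for the slow requests.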

Automation & Self-Healing Systems

How can you leverage automation to improve monitoring efficiency and reduce MTTR (Mean Time to Recovery)?
Expected Response: Experience with auto-remediation via Terraform/Ansible, self-healing scripts, predictive monitoring (AI/ML-based alerts), and implementing auto-scaling mechanisms.
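
The self-healing idea can be sketched as a bounded check-and-remediate loop. In the example below, `is_healthy` and `remediate` are hypothetical callables supplied by the caller (for instance a health probe and a service restart); the retry count and wait time are likewise assumptions.

```python
import time
from typing import Callable


def self_heal(is_healthy: Callable[[], bool],
              remediate: Callable[[], None],
              max_attempts: int = 3,
              wait_seconds: float = 0.0) -> bool:
    """Run `remediate` until `is_healthy` passes or attempts run out.

    Returns True if the service ends up healthy, False if it should be
    escalated to a human.
    """
    for _ in range(max_attempts):
        if is_healthy():
            return True
        remediate()
        time.sleep(wait_seconds)  # give the service time to come back
    return is_healthy()


# Hypothetical service that recovers after two restarts.
state = {"restarts": 0}


def fake_health_check() -> bool:
    return state["restarts"] >= 2


def fake_restart() -> None:
    state["restarts"] += 1


recovered = self_heal(fake_health_check, fake_restart)
```

Capping the number of attempts matters here: a remediation that keeps failing should end in an escalation rather than an endless restart loop.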

Infrastructure & Cloud Monitoring

How do you monitor and optimize cloud infrastructure in an AWS/Azure/GCP environment?
Expected Response: Expertise in CloudWatch, Azure Monitor, GCP Operations Suite, cost monitoring, setting up synthetic monitoring, and best practices for alerting thresholds.
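
One widely used practice for alerting thresholds is to fire only after several consecutive breaches, which reduces noise from short spikes. The sketch below shows that logic in plain Python; it does not call any cloud provider API, and the CPU values and the threshold of 90 are assumed for the example.

```python
def should_alert(values: list[float], threshold: float, consecutive: int = 3) -> bool:
    """Return True if `values` exceeds `threshold` for `consecutive` points in a row."""
    streak = 0
    for value in values:
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False


# Assumed CPU utilisation samples (percent), one per evaluation period.
spiky = [50, 95, 60, 92, 70]      # isolated spikes only
sustained = [50, 91, 93, 97, 60]  # three breaches in a row

alert_on_spiky = should_alert(spiky, threshold=90)
alert_on_sustained = should_alert(sustained, threshold=90)
```

With these inputs the isolated spikes do not trigger an alert, while the sustained breach does.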

Performance Optimization & Capacity Planning

How do you ensure high availability and optimal performance in a large-scale production environment?
Expected Response: Understanding of load balancing, caching strategies, capacity planning using historical trends, chaos engineering for resilience, and observability-driven scaling.
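
Capacity planning from historical trends can be illustrated with a simple least-squares line fitted to past usage. This is a minimal sketch under the assumption that growth is roughly linear; the function names, the usage figures, and the capacity limit are all hypothetical.

```python
from typing import Optional


def linear_fit(ys: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_full(ys: list[float], capacity: float) -> Optional[float]:
    """Estimate periods left until the fitted trend reaches `capacity`.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None
    current = intercept + slope * (len(ys) - 1)
    return max(0.0, (capacity - current) / slope)


# Assumed monthly storage usage in GB and a 200 GB limit.
usage_gb = [100, 110, 120, 130]
months_left = periods_until_full(usage_gb, capacity=200)
```

For this data the trend grows by 10 GB per month from a current 130 GB, which leaves about seven months before the 200 GB limit is reached.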

DevOps - Site Reliability Engineer (SRE) - Cloud Infrastructure & Data

Kubernetes (EKS) clusters
Terraform AWS infrastructure Kubernetes administration, data pipelines

Han Digital Solution

Information Technology

Metro City
