Provide 24x7x365 first-line operational monitoring and incident triage for payment and digital banking platforms. The L1 Engineer is responsible for early detection, logging, basic
troubleshooting via SOPs, communication, and timely escalation to L2 within SLA. The role is critical to maintaining platform availability, transaction success rates, and client confidence.
A. Monitoring & Event Detection (Primary)
1. Monitor real-time health dashboards for all products:
   - Transaction success/failure trends, TPS, latency, timeouts, retries
   - Channel availability (NPCI/UIDAI/bank/biller endpoints)
   - Queue depth and consumer lag (application queues)
   - Application service heartbeat / process health
2. Perform proactive detection of anomalies:
   - Sudden spikes in declines or reversals
   - Drop in TPS or throughput vs baseline
   - Scheduled job failures impacting settlements/billing/auth
3. Maintain shift-wise monitoring checklist and signoffs.
B. Incident Logging & First Response
1. Create, classify, and assign tickets in the ITSM tool: product, severity, client impacted, timeline, symptom, evidence.
2. Execute standard SOP-based actions permitted for L1:
   - Service restarts through approved consoles post analysis
   - Basic queue purge/reroute actions where SOP allows
   - Process validation checks (CPU/memory from app dashboard only)
   - Configuration verification against runbooks (read-only)
3. Capture all logs/details required for L2 diagnosis: API error codes, timestamps, correlation IDs, switch response codes.
C. Client & Internal Communication
1. Provide immediate acknowledgments per SLA.
2. Draft and send incident notifications using approved templates: initial alert, periodic update, restoration note.
3. Coordinate bridges and ensure stakeholders are updated.
D. Escalation Management
1. Escalate to L2 within predefined thresholds:
   - P1: immediate escalation (5-10 minutes)
   - P2/P3: as per SOP and SLA
2. Ensure no ticket is unresolved or unassigned during handover.
3. Support L2 with factual data and shift timeline.
E. Shift Handover & Reporting
1. Maintain shift log with:
   - Open tickets, actions taken, pending client responses
   - Unusual monitoring events
2. Perform clean handover to next shift with written summary.
3. Participate in daily/weekly ops reviews as required.
Required Skills & Competencies
Domain / Functional
- Strong understanding of real-time transaction systems and basic payment flow concepts.
- Familiarity with at least one of UPI/IMPS/cards/BBPS/AEPS, or willingness to train quickly.
- Ability to follow SOPs precisely without deviation.
Technical (L1 level)
- Log reading and basic diagnosis for:
  - .NET services
  - IIS hosted APIs
  - RabbitMQ dashboards/metrics (consumer lag, queue depth)
- Basic SQL query capability for read-only validation of transaction status.
- Understanding of severity/priority matrix and SLA impact.
- Use of monitoring tools (APM/NMS dashboards, custom switch dashboards).
Behavioral
- High alertness; calm under pressure in night shifts.
- Clear written communication, especially incident drafting.
- Discipline in documentation and follow-through.
Education & Experience
- Bachelor's degree in Engineering/IT/CS or equivalent.
- 1-3 years of NOC/helpdesk or transaction monitoring experience preferred.
- A fresher with strong aptitude may be considered with extended training.
KPIs / Performance Indicators
- Mean Time to Detect (MTTD)
- SLA adherence for acknowledgment and escalation
- Ticket quality score (completeness, evidence captured)
- False positive / missed alert rate
- Shift handover quality & audit compliance
- Client communication timeliness
Work Conditions
- 24x7 rotating shifts, including weekends/holidays.
- On-site/remote per policy; must be reachable throughout shift.
Growth Path
L1 Engineer → Senior L1 / Shift Lead → L2 Application Support → Product Champion / Ops Specialist