
Mainframe-SRE-Z/os

Cognizant

10 - 17 years

22 - 30 Lacs

Hyderabad, Pune

Posted: 5 months ago

Skills Required

Site Reliability Engineering, Mainframes, z/OS

Work Mode

Hybrid

Job Type

Full Time

Job Description

Role & Responsibilities

Job Title: Mainframe Site Reliability Engineer (SRE)
Location: Pune/Hyderabad
Employment Type: Full-Time

---

About the Role

We are seeking a visionary Mainframe Site Reliability Engineer (SRE) to redefine the reliability, automation, and efficiency of our mission-critical z/OS systems. This role combines deep mainframe expertise with cutting-edge SRE practices, focusing on innovations in observability, AI-driven operations, and DevOps integration to transform legacy workflows into modern, self-healing systems. You will drive initiatives to eliminate manual toil, optimize performance, and ensure the platform's resilience aligns with business-critical service level objectives (SLOs).

---

Key Responsibilities

1. SRE-Centric Innovation & Automation
- Automation Engineering:
  - Design and deploy Infrastructure-as-Code (IaC) solutions using Ansible, Zowe CLI, and z/OSMF workflows to automate system provisioning, configuration management, and recovery processes.
  - Develop self-healing workflows for critical subsystems (CICS, Db2, IMS) to auto-resolve incidents like JVM failures or transaction bottlenecks.
  - Convert legacy operational scripts (REXX, NCL) into modern, version-controlled pipelines integrated with Git and CI/CD tools like Jenkins.
- AI-Driven Observability:
  - Implement predictive analytics tools (e.g., IBM Watson AIOps, Splunk ITSI) to detect anomalies in system metrics, logs, and message queues.
  - Build dashboards using Grafana or Prometheus to visualize the Four Golden Signals (latency, traffic, errors, saturation) across mainframe workloads.
  - Centralize alert management to reduce noise and prioritize actionable alerts using AI-driven correlation.

2. DevOps Integration & Modernization
- CI/CD for Mainframe:
  - Streamline software delivery pipelines for COBOL/PL/I applications using IBM Dependency-Based Build (DBB) and UrbanCode Deploy (UCD).
  - Integrate mainframe SDLC processes with enterprise Git repositories (GitHub, GitLab) to enable collaborative development and audit trails.
  - Enable automated testing and phased rollouts for z/OS middleware updates.
- Performance & Capacity Engineering:
  - Optimize CPU/MIPS utilization through runtime tuning (e.g., CICS Threadsafe, AT-TLS offloading) to reduce software licensing costs.
  - Forecast capacity demands using historical SMF/RMF data and propose dynamic hardware scaling strategies.
  - Conduct load testing for batch and OLTP workloads to validate system limits and error budgets.

3. Incident Management & Reliability
- Lead blameless postmortems for critical incidents, focusing on root cause analysis (RCA) and preventive actions (e.g., monitoring gaps, automation fixes).
- Reduce MTTR by implementing automated incident response playbooks (e.g., auto-restart failed subsystems, reroute traffic).
- Maintain 24/7 operational readiness through on-call rotations and cross-training in z/OS, CICS, Db2, and storage management.

4. Platform Hardening & Knowledge Sharing
- Enforce security best practices (RACF, TLS) and vulnerability remediation for z/OS and middleware.
- Develop reusable workbooks and runbooks to document system configurations, troubleshooting steps, and automation workflows.
- Mentor teams on SRE principles, fostering a T-shaped skill model (deep mainframe + DevOps/Agile practices).

5. Batch Optimization & Resource Management
- Design dynamic resource allocation strategies (e.g., WLM policies, enclaves) to prioritize critical batch jobs and minimize contention for CPU, memory, and I/O resources.
- Implement parallel processing (e.g., multi-task JCL, SYSAFF routing) to reduce runtime and avoid bottlenecks in long-running batch cycles.
- Streamline job dependencies using graph-based scheduling tools (e.g., IWS, CA7, Control-M) to eliminate idle wait times between interdependent jobs.

6. Proactive Batch Health Monitoring
- Develop automated checks for batch job SLAs, including real-time alerts for delays, resource starvation, or dataset contention.
- Integrate predictive analytics (e.g., historical SMF data analysis) to forecast and mitigate delays caused by seasonal peaks or data volume spikes.

---

Required Skills

- Technical Expertise:
  - xx+ years in z/OS system programming, performance tuning, or infrastructure support.
  - Proficiency in JCL, REXX, Python, and mainframe automation tools (IBM Z System Automation, Broadcom OPS/MVS).
  - Hands-on experience with Zowe, Ansible, Git, and CI/CD pipelines.
  - Mastery of SRE tenets: SLOs/SLIs, error budgets, and Infrastructure-as-Code (IaC).
- Innovation Focus:
  - Proven track record in implementing AI/ML-driven monitoring or auto-remediation for mainframe environments.
  - Experience modernizing legacy workflows (e.g., replacing CA Endevor with Git-based SDLC).
- Soft Skills:
  - Ability to lead cross-functional teams during high-severity incidents.
  - Strong communication to align technical execution with business objectives.
- Education:
  - Bachelor's degree in Computer Science, Engineering, or related field.

---

Preferred Qualifications

- Experience with AI-driven automation platforms (e.g., AMELIA AIOps) to standardize and migrate legacy workflows, integrate with event management systems (e.g., BigPanda), and orchestrate ITIL processes (Incident, Change) via ServiceNow.
- Certifications: IBM z/OS System Programming, Broadcom Mainframe SRE, or HashiCorp Terraform.
- Familiarity with Zowe Desktop for modern IDE-driven development or Dynatrace APM for CICS/Db2 monitoring.
- Knowledge of mainframe open-source ecosystems (Zowe, Feilong) or hybrid-cloud integrations.
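
For candidates less familiar with the error-budget concept named under the SRE tenets, the arithmetic can be sketched roughly as follows. This is an illustrative Python sketch only, not part of the employer's tooling; the function names, the 30-day window, and the 99.9% target are assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% availability target over an assumed 30-day window.
    print(round(error_budget_minutes(0.999), 1))
    # Share of the budget left after 10.8 minutes of downtime.
    print(round(budget_remaining(0.999, 10.8), 2))
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43.2 minutes of allowable downtime, and 10.8 minutes of outage would consume about a quarter of that budget.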

More Jobs at Cognizant

Workday: Walk-in drive 10th May Pune with CTS(Full time)

Pune, Mumbai (All Areas)

3 - 8 yrs

INR 7 - 17 Lacs

Java Full Stack Developer

Chennai

5 - 10 yrs

INR 10 - 20 Lacs

Walkin For Freshers @hyderabad

Hyderabad

Experience: Not specified

INR 0 - 2 Lacs

React JS Developer

Hyderabad

2 - 7 yrs

INR 9 - 14 Lacs

Java Developer

Bengaluru

5 - 10 yrs

INR 15 - 30 Lacs

Cognizant

IT Services and IT Consulting

Teaneck, New Jersey
