Service Reliability Engineer

16 years

3 - 4 Lacs

Hyderābād

Posted: 2 days ago

Skills Required

service reliability, monitoring, server, network, storage, management, checks, linux, analysis, reporting, tuning, configuration, backup, documentation, reports, escalation, communication, ubuntu, troubleshooting, tracking, networking, apache, nginx, ftp, aws, docker, support, development, diversity

Work Mode

On-site

Job Type

Full Time

Job Description

Job Requirements

Service Reliability Engineer I

  • Monitoring team with L1 resources for all domains, covering the 24x7 IT environment (server, network, application, storage and database)
  • Manage alerts raised by infrastructure elements
  • Manage alerts raised by application services
  • Perform daily health checks (network, servers and datacenter)
  • Knowledge of Windows, Linux and network infrastructure
  • Perform operations based on the documented procedures
  • Assist in the analysis of the reports and alerts raised by various infrastructure devices
  • Fine-tune configuration to maintain the performance and functionality of the monitoring solutions in place
  • Manage incidents

Roles & Responsibilities:

  • 24x7 proactive monitoring of server, storage, backup and network environment alerts via the monitoring tool and email
  • Escalate and follow up with the IT System Admin team, as well as the specific application team, on pending high-priority trouble tickets
  • Prepare and maintain documentation and reports, and provide follow-up status on identified tasks
  • Escalate and report alerts on time, according to the Incident Management process
  • Prepare daily and weekly reports in the agreed format and send them to the pre-assigned set of recipients
  • Send the reports at the specified time and day, and inform the concerned recipients of any delays caused by dependencies
  • Escalate incidents based on the standard procedure, with run-down follow-up reporting per team and area
  • Escalate incidents until closure
  • Maintain, update and implement the standard escalation procedures, complete with notification matrix and escalation standards

Work Experience

  • Good communication skills
  • Strong Linux administration skills in various flavors (CentOS, Ubuntu and Red Hat)
  • Troubleshooting skills for booting problems
  • Understanding of the incident management process (ITIL)
  • Good skills in incident tracking from logs
  • Good skills in shell scripting
  • Networking skills
  • Knowledge of web servers (Apache, Nginx, etc.)
  • Knowledge of file servers such as FTP, NFS and Samba
  • Additional advantage: AWS and Azure cloud knowledge, Docker, Jenkins

Education: B.Tech or 16+ years of full-time education.

Benefits

We want you to be your best self and to pursue your passions!

  • Health and wellness benefits/programs to support holistic employee health
  • Flexible hours and working schedules, as well as parental leave for new parents
  • Growing organization with career pathing and development opportunities
  • Tons of perks and extras in every location for all Phenoms!

Diversity, Equity, & Inclusion: Our commitment to diversity runs deep! Diversity is essential to building phenomenal teams, products, and customer experiences. Phenom is proud to be an equal opportunity employer taking collective action to build a more inclusive environment where every candidate and employee feels welcomed. We recognize there is more to be done. Our teams are committed to continuous improvement until these powerful ideas are ingrained in our culture for Phenom and employers everywhere.

Phenom People

Software Development

Ambler PA

1001-5000 Employees

59 Jobs

Key People

  • Mahe Bayireddi, Co-Founder & CEO
  • Amit Saini, Co-Founder & CTO
