Work from Office
Full Time
About the Role
We’re hiring a Cloud Operations Engineer to join our growing infrastructure team. In this role, you’ll be responsible for monitoring, maintaining, and responding to incidents in our production cloud environment. You will play a key role in ensuring uptime, performance, and reliability of cloud-based systems across compute, networking, and storage. This is an ideal opportunity for candidates interested in the operational side of cloud infrastructure, incident response, and systems reliability, especially those with a passion for Linux, monitoring tools, and automation.
Job Responsibilities
• Monitor health and performance of cloud infrastructure using tools like Prometheus, Grafana, ELK, and Zabbix.
• Perform L1–L2 troubleshooting of compute, network, and storage issues.
• Respond to infrastructure alerts and incidents with a sense of urgency and ownership.
• Execute standard operating procedures (SOPs) for issue mitigation and escalation.
• Contribute to writing and improving incident response playbooks and runbooks.
• Participate in root cause analysis (RCA) and post-incident reviews.
• Automate routine operations using scripting and Infrastructure-as-Code (IaC) tools.
Technical Skills
Nice to Have (Not All Required)
We don’t expect you to have experience in every area. If you’re eager to learn and have a solid foundation in Linux or cloud, you’re encouraged to apply, even if you’re still gaining experience in some areas below:
• Operating Systems: Linux (Debian / Ubuntu / CentOS / Rocky Linux)
• Monitoring & Logging: Prometheus, Grafana, ELK, Zabbix, Nagios
• Infrastructure Troubleshooting Tools: top, htop, netstat, iostat, tcpdump
• Networking: DNS, NAT, VPN, Load Balancers
• Cloud Services: VM provisioning, disk management, firewall rules
• Automation & Scripting: Bash, Python, Git
• IaC Tools: Ansible, Terraform (good to have)
• Incident Response & RCA: Familiarity with escalation procedures and documentation best practices
You Should Be Someone Who:
• Pays strong attention to detail and can respond under pressure
• Has solid analytical and troubleshooting skills
• Is comfortable working in shifts and taking ownership of incidents
• Communicates clearly and collaborates well with cross-functional teams
• Is eager to learn cloud automation, reliability, and monitoring practices
What You’ll Gain
• Hands-on experience in live cloud infrastructure operations
• Expertise in monitoring tools, alert handling, and system troubleshooting
• Real-world experience with DevOps practices, SOPs, and RCA processes
• Exposure to automation and Infrastructure-as-Code workflows
E2E Networks