Cloud Reliability Engineer - Ops & Automation

3 - 7 years

0 Lacs

Posted: 15 hours ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Cloud Reliability Engineer at our company, you will be responsible for maintaining the reliability and availability of our cloud-based infrastructure. The role involves working in shifts and handling on-call duties, and it covers cloud operations, incident management, change requests, and automation tasks that improve efficiency.

**Key Responsibilities:**

- Monitor and manage cloud-based infrastructure to ensure high availability, performance, and security.
- Respond to alerts and incidents swiftly, troubleshooting and resolving issues to minimize downtime.
- Perform root cause analysis and post-incident reviews to enhance system reliability.
- Handle change requests within SLAs for seamless updates to the production environment.
- Participate in a shift-based schedule and on-call rotation to support critical infrastructure.
- Collaborate with Engineering and Field teams to resolve service requests promptly.
- Automate routine operational tasks to reduce manual intervention and operational toil.
- Identify opportunities for further automation in cloud operations and implement solutions.
- Assist in optimizing and maintaining monitoring and alerting systems for cloud environments.

**Qualifications Required:**

- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- 3-5 years of experience in cloud operations, system administration, or related fields.
- Familiarity with cloud platforms like AWS, GCP, or Azure.
- Experience in automating operational tasks using scripting languages (e.g., Python, Bash).
- Strong problem-solving skills, especially in managing incidents under pressure.
- Understanding of ITIL processes, including incident and change management.
- Familiarity with monitoring tools and incident management platforms.
- Proactive mindset towards improving operational processes through automation.

**Additional Details about the Company:**

ThoughtSpot is the experience layer of the modern data stack, leading the industry with AI-powered analytics and natural language search. The company promotes a diverse and inclusive work culture, emphasizing a balance-for-the-better philosophy and a culture of Selfless Excellence. ThoughtSpot encourages continuous improvement and celebrates diverse communities to empower every employee to bring their authentic self to work. If you are excited to work with bright minds, contribute to innovative solutions, and be part of a respectful and inclusive company culture, we invite you to explore more about our mission and apply for the role that suits you best.
