Senior Site Reliability Engineer (Observability)

5 - 9 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Senior Site Reliability Engineer (SRE) on the SRE Observability team at Cvent, you will play a crucial role in helping the organization achieve its reliability goals by combining your development and operations knowledge and skills. Your responsibilities will span a wide array of technical and process-related challenges in a collaborative, distributed team environment. You will collaborate with product development teams, Cloud Infrastructure, and other SRE teams to ensure effective observability and improved reliability of Cvent products, SDLC, and infrastructure. Your expertise in cloud operations, the software development lifecycle, and observability tools will be instrumental in addressing complex multi-disciplinary problems. Implementing SRE principles such as blameless postmortems and automation will be key to continuously enhancing knowledge and maintaining a high quality of work life.

**Key Responsibilities:**

- Enlighten, Enable, and Empower multi-disciplinary teams across different applications and locations.
- Address intricate development, automation, and business process challenges while advocating Cvent standards and best practices.
- Ensure the scalability, performance, and resilience of Cvent products and processes.
- Collaborate with product development teams, Cloud Automation, and other SRE teams to identify and resolve observability gaps effectively.
- Recognize recurring issues and anti-patterns in development, operational, and security processes and assist in building observability solutions.
- Develop automation for build, test, and deployment targeting multiple on-premises and AWS regions.
- Actively contribute to open-source projects.

**Qualifications Required:**

**Must have skills:**

- Strong communication skills with a proven track record of working in distributed teams.
- Passion for enhancing the work environment for colleagues.
- Experience in managing AWS services and operational knowledge of applications in AWS through automation.
- Proficiency in scripting languages like TypeScript, JavaScript, Python, Ruby, and Bash.
- Familiarity with SDLC methodologies, especially Agile.
- Expertise in observability (logging, metrics, tracing) and SLI/SLO.
- Experience with APM, monitoring, and logging tools such as Datadog, New Relic, and Splunk.
- Understanding of containerization concepts and platforms like Docker, ECS, EKS, and Kubernetes.
- Self-motivated with the ability to work independently.
- Proficient in troubleshooting incidents and setting standards to prevent future issues.

**Good to have skills:**

- Experience with Infrastructure as Code (IaC) tools like CloudFormation, CDK (preferred), and Terraform.
- Proficiency in managing 3-tier application stacks.
- Knowledge of basic networking concepts.
- Experience in server configuration using tools like Chef, Puppet, Ansible, etc.
- Working experience with NoSQL databases such as MongoDB and Couchbase, as well as Postgres.
- Ability to use APM data for troubleshooting and identifying performance bottlenecks.
