Site Reliability Engineer

0 years

0 Lacs

Posted: 2 months ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

About Builder.ai

We're on a mission to make building software so easy that everyone can do it, regardless of their background, tech knowledge or budget. We've already helped thousands of entrepreneurs, small businesses and even global brands like the BBC, Makro and Pepsi achieve their software goals, and we've only just started. With a truly global footprint encompassing offices across EMEA, APAC and the Americas, Builder.ai is driving innovation on a worldwide scale. Having secured over $450 million in funding to date, supported by prominent investors including QIA and Microsoft, the opportunity to join Builder.ai has never been more exciting.

Life at Builder.ai

At Builder.ai we encourage you to experiment! Each role at Builder has unlimited opportunities to learn, progress and challenge the status quo. We want you to help us become even better at supporting our customers and take building software to new heights. Our global team is diverse, collaborative and exceptionally talented. We hire people for their differences, but we all unite in our shared belief in Builder's mission to unlock human potential through the power of software.

In return for your skills and commitment, we offer a range of great perks, from private healthcare and a discretionary variable pay or commission scheme to employee stock options, generous paid leave and trips abroad. #WhatWillYouBuild

About The Role

We are looking for an engineer to help us achieve and maintain maximum potential in areas of scale, security, fiscal efficiency and fault tolerance. This includes everything from authentication to authorisation services to programmatic cross-cloud resource initiation, orchestration and continuity.

Why You Should Join

This is a challenging and diverse role that requires you to grow and build the capability of an existing team.
Only join this role if you are craving rapid growth, able to create a path in uncharted territory, and comfortable with trying, failing and course-correcting fast.

You'll be responsible for

Service Reliability & Availability: Take ownership of the reliability, availability, and performance of key services and infrastructure components. Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs).

Incident Management: Lead and participate in incident response, root cause analysis (RCA), and post-mortem processes to minimize downtime and prevent recurrence. Drive the implementation of corrective and preventative actions.

Problem Management: Identify recurring incidents and underlying problems, and drive proactive solutions to improve system stability and prevent future issues.

Automation & Tooling: Design, develop, and implement automation tools and scripts to streamline operational tasks, improve efficiency, and reduce manual intervention. Champion Infrastructure as Code (IaC) and Configuration as Code (CaC) practices.

Performance Optimization: Identify performance bottlenecks, conduct capacity planning, and implement optimizations to ensure systems can handle current and future demands.

Monitoring & Observability: Design, implement, and maintain comprehensive monitoring, logging, and alerting systems to provide real-time visibility into system health and performance.

Capacity Planning: Collaborate with development and product teams to forecast capacity needs and plan infrastructure upgrades and scaling activities.

Security & Compliance: Integrate security best practices into system design and operations. Ensure compliance with relevant industry standards and regulations.

On-Call Responsibilities: Participate in an on-call rotation to provide timely support for critical incidents.

Documentation: Create and maintain comprehensive documentation for systems, processes, and procedures.
Requirements

Cloud Knowledge: You have experience with cloud platforms such as AWS, GCP, or Azure.

Coding Skills: You are proficient in at least one programming language such as Python, Go, or similar, and have experience with automation tools.

Strong understanding of Linux/Unix operating systems and networking principles.

Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Helm).

Solid understanding of containerization technologies (e.g., Docker, Kubernetes) and orchestration platforms.

Solid experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).

Benefits

Attractive quarterly OKR bonus plan or commission scheme, dependent on your role

Stock options in a $450 million funded Series D scale-up company

24 days annual leave + public holidays

2 x Builder family days each year

Time off between Christmas and New Year

Generous Referral Bonus scheme

Fully funded Private Medical Insurance

Free lunch at our state-of-the-art working environment in Gurugram
