Lead, Site Reliability Engineer

8 years

0 Lacs

Posted: 1 week ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

About Toyota Connected: If you want to change the way the world works, transform the automotive industry, and positively impact others on a global scale, then Toyota Connected is the right place for you! Within our collaborative, fast-paced environment, we focus on continual improvement and work in a highly iterative way to deliver exceptional value in the form of connected products and services that wow and delight our customers and the world around us. Come help us re-imagine what mobility can be today and for years to come!

About the Team: Toyota Connected India is looking for a Lead Site Reliability Engineer. This team is focused on creating infotainment solutions on embedded and cloud platforms. Team members are expected to solve problems creatively, be excited to work in new technology areas, and be ready to wear multiple hats to get things done. This is a highly energized, fast-paced, innovative, and collaborative startup environment, so it is essential that both skillset and personality fit such an environment.

Responsibilities:

· Assist in the design and implementation of reliable and scalable systems using Kubernetes, Docker, and Istio.
· Proactively identify performance improvements in areas such as responsiveness, availability, and scalability.
· Monitor system performance and respond to incidents as they arise, utilizing Datadog for observability.
· Help develop automation scripts for deployment and monitoring.
· Leverage GitOps to ensure that software can reliably and smoothly be shipped to production.
· Collaborate with development teams to identify and resolve reliability issues.
· Conduct load testing to verify that systems can handle expected loads for new products and updates to existing products.
· Implement A/B deployments, canary deployments, and traffic mirroring strategies to ensure critical updates go smoothly and can be rolled back easily if necessary.
· Utilize Helm charts for application deployment and management.
· Understand AWS systems, including AWS Load Balancers, EKS and routing, to support systems handling millions of requests per hour.
· Ensure that solutions are cost-effective while providing a high-quality customer experience and maintaining very high availability.
· Participate in on-call rotations and support production systems, collaborating with SREs in other parts of the world.
· Contribute to documentation and knowledge sharing within the team.
· Assist in the implementation of best practices for system reliability.

You are a successful candidate if you have

· 8+ years of experience in Site Reliability Engineering, DevOps, or a related field.
· Expertise with AWS.
· Expertise with Kubernetes, Docker, and Istio.
· Knowledge of monitoring and alerting tools such as Datadog, AppDynamics, ELK, Grafana, or Prometheus.
· Experience implementing and tuning Horizontal Pod Autoscalers (HPAs) to optimize resource utilization.
· Understanding of Argo CD for GitOps practices.
· Familiarity with A/B, canary, and blue/green deployments, and with traffic mirroring techniques.
· Understanding of scripting and orchestration tools such as Terraform, Ansible, or equivalent.
· Awareness of cost management in cloud environments and the ability to balance cost with performance and reliability.
· Advanced problem-solving, troubleshooting, and decision-making skills.
· Ability to work independently and take ownership of tasks/assignments while driving them to completion.
· Excellent verbal and written communication skills.
· Expertise in Golang or Rust.

What’s in it for you?

  • Top-of-the-line compensation!
  • You'll be treated like the professional we know you are and left to manage your own time and workload.
  • Yearly gym membership reimbursement and free catered lunches.
  • No dress codes! We trust you are responsible enough to choose what’s appropriate to wear for the day.
  • Opportunity to build products that improve the safety and convenience of millions of customers.
  • Cool new office space and other awesome benefits!

Our Core Values: EPIC

  • Empathetic: We begin making decisions by looking at the world from the perspective of our customers, teammates, and partners.
  • Passionate: We are here to build something great, not just for the money. We are always looking to improve the experience of our millions of customers.
  • Innovative: We experiment with ideas to get to the best solution. Any constraint is a challenge, and we love looking for creative ways to solve it.
  • Collaborative: When it comes to people, we think the whole is greater than its parts and that everyone has a role to play in our success!

To learn more about us, check out our Glassdoor page: https://www.glassdoor.co.in/Reviews/TOYOTA-Connect
