Site Reliability Engineer, Platform Engineering

0 years

0 Lacs

Posted: 1 day ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Position Description

Tesla's Platform Engineering team is looking for a Site Reliability Engineer. As a member of the team, you will build and maintain Kubernetes clusters using infrastructure-as-code tools such as Ansible, Terraform, ArgoCD and Helm, and help application teams succeed on our platform. The underlying infrastructure is a mix of on-premise VMs, bare metal hosts and public clouds such as AWS, located all around the globe, which presents unique challenges and the opportunity to work with different types of infrastructure technologies. A successful candidate will be expected to possess expert knowledge of Linux fundamentals, architecture and performance tuning, as well as software development skills to match. Experience running Kubernetes in production is a strong plus; we prefer Golang or Python for any automation or tools we have to build along the way.

We are the team that runs production-critical workloads for every aspect of the business at Tesla and sets the standards for other teams. We are a group of well-rounded generalists who not only solve the hardest problems in the industry but also push other engineering teams at large to be better. Join us for a chance to work with some of the best engineers in the industry at one of the most transformative companies in the history of both the automotive and energy industries.

Responsibilities

  • Working hands-on with developers to deploy their applications and provide support
  • Building new features to improve the platform in terms of stability & updates
  • Managing our Kubernetes clusters on-prem and in the cloud to support our growing workloads
  • Participating in the architecture design process and troubleshooting of live applications with the product teams
  • Participating in a 24x7 on-call rotation
  • Influencing architectural decisions with a focus on security, scalability and high performance
  • Setting up and maintaining monitoring, metrics & reporting systems for fine-grained observability and actionable alerting
  • Authoring technical documentation for workflows/processes/best practices

Requirements

  • Experience managing web-scale infrastructure in a production *nix environment
  • Ability to prioritize tasks and work independently, with an analytical mind and a bias for action
  • Advanced or expert-level Linux administration and performance tuning skills
  • Bachelor's Degree in Computer Science, Computer Engineering, or equivalent experience or evidence of exceptional ability
  • Advanced experience with configuration management systems such as Ansible, Terraform or Puppet
  • Demonstrable knowledge of the Linux operating system internals, networking stack, filesystems, resource scheduling and process management
  • Exposure to AWS, or other cloud infrastructure providers
  • Experience managing container-based workloads, using Kubernetes or other orchestration software in production (ArgoCD, Helm)
  • Proficiency in a high-level language like Python, Go, Ruby and/or Java