Sr Manager, Machine Learning Engineering

10 - 13 years

10 - 13 Lacs

Posted: 1 month ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking an experienced and visionary Sr Engineering Manager to lead our AI Platform development and SRE team in Noida. In this role, you will lead the development and deployment of robust cloud solutions, with a focus on ML/AI-powered products. You will guide a team of engineers, ensuring that our products are reliable, scalable, and high-performing, while fostering a collaborative and innovative environment.

Responsibilities

  • Team Leadership & Management:

  • Lead, mentor, and inspire a team of engineers, fostering professional growth and ensuring a high-performance culture.
  • Provide clear direction, set objectives, and manage project timelines to ensure timely and high-quality product releases.
  • Strategic Technical Direction:

  • Lead all aspects of the architecture, deployment, and maintenance of customised Kubernetes clusters in alignment with organisational standards.
  • Guide the team in designing and implementing the training platform control plane, including job scheduling, scaling, and in-cluster components.
  • Process & Infrastructure Management:

  • Ensure robust automation through the development and maintenance of scripts and tools for provisioning and configuring Kubernetes infrastructure based on infrastructure-as-code principles.
  • Establish and supervise key performance indicators (KPIs), implementing monitoring tools and alerting mechanisms to ensure system health and performance.
  • Collaboration & Partner Management:

  • Act as a bridge between cross-functional teams, facilitating communication and integration for application deployment and Kubernetes infrastructure.
  • Collaborate with senior leadership to align technical strategy with business goals and drive continuous improvement.
  • Risk Management & Compliance:

  • Lead all aspects of the implementation of security measures to protect Kubernetes clusters and ensure compliance with industry standards.
  • Lead initiatives to investigate and resolve technical issues, optimise performance bottlenecks, and plan for capacity requirements.
  • Innovation & Continuous Improvement:

  • Stay informed about the latest developments in Kubernetes and cloud technologies, integrating standard methodologies into team processes.
  • Champion initiatives for disaster recovery and business continuity planning.

Requirements

  • Bachelor's/Master's degree in Computer Science (B.Tech/M.Tech) from a premier institute.
  • 10+ years of experience in software development and operations, with at least 3 years in a managerial or leadership role.
  • Strong foundational knowledge of computer science, including architecture, design, and performance optimisation.
  • Extensive experience with Cloud Platforms (preferably AWS) and hands-on technical expertise in Kubernetes operations.
  • Proficiency in Python/Go and Shell scripting, with a strong record of writing reliable, maintainable code.
  • Excellent problem-solving skills, with the ability to work independently while effectively running a technical team.

Adobe

Software Development

San Jose CA
