ML Platform Engineer

5 years

0 Lacs

Posted: 5 hours ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Company Description

North Hires is a leading consulting firm specializing in Custom Software Development, Recruitment, and Executive Search services. With operations across the USA, UK, India, and EMEA, we connect top talent with distinguished organizations. Our experienced team offers tailored solutions to meet specific client needs, leveraging deep industry expertise and an extensive professional network. Committed to empowering businesses, North Hires supports growth, innovation, and success by sourcing the finest talent and providing exceptional consulting partnerships.


Role Description

We are seeking a full-time ML Platform Engineer to join our team at our Bengaluru office in an on-site capacity. In this role, you will design, develop, and optimize machine learning platforms and infrastructure, ensuring robust and scalable solutions. You will troubleshoot technical challenges, collaborate with cross-functional teams for seamless integration, and maintain database and software systems to support machine learning deployments.


Qualifications

  • Degree in Computer Science, Engineering, or a related technical field, or equivalent professional experience.
  • 5+ years of experience in ML operations, DevOps, platform support, or similar roles supporting distributed AI systems.
  • Experience providing L1/L2 support and on-call coverage for Ray-based and Kubernetes-based environments running ML workloads.
  • Deep understanding of Ray operations, including job orchestration, scaling behavior, and scheduling across CPU/GPU infrastructure.
  • Strong practical knowledge of Kubernetes internals: control plane, data plane, RBAC, namespaces, ingress, and resource segmentation.
  • Expertise in GPU scheduling and optimization, NVIDIA tooling, and related compute frameworks.
  • Proficiency in Python or Go for building automation tools and debugging distributed systems.
  • Familiarity with cloud-native environments on AWS, Azure, or GCP, as well as CI/CD practices.
  • Hands-on experience with metrics, tracing, and alerting solutions such as Prometheus, Grafana, and OpenTelemetry, plus incident-management platforms like PagerDuty or ServiceNow.
  • Understanding of ML frameworks (e.g., PyTorch, TensorFlow) and how they behave in Ray/Kubernetes-based distributed setups.
  • Strong problem-solving and communication skills, with the ability to work effectively across engineering and research groups.
  • Disciplined operational mindset focused on reliability, performance, and user experience.
