Senior Linux Administrator – AI/ML & Data Center Networking

7 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Full Time

Job Description

Location:

Experience:

Type:


Role Overview

Senior Linux Administrator with strong Data Center Networking expertise

Linux systems, Kubernetes, and modern data center networks


Key Responsibilities

  • Deploy, configure, and manage on-premises Linux servers supporting AI/ML and GPU-accelerated workloads.

  • Design, implement, and operate data center networking for AI/ML infrastructure, including:

    • High-speed Ethernet (25G/40G/100G/400G) or InfiniBand fabrics

    • Spine-leaf architectures and low-latency network designs

  • Configure and troubleshoot Kubernetes networking, including CNI plugins (Calico, Cilium, Flannel), service networking, ingress, and network policies.

  • Optimize network performance, latency, and throughput for distributed training, storage access, and HPC workloads.

  • Work closely with network teams to integrate switching, routing, VLAN/VXLAN, BGP, and load balancing into Kubernetes and AI platforms.

  • (Desirable) Automate infrastructure and network provisioning using Ansible, Terraform, and scripting (Bash/Python).

  • Administer and monitor data center components such as compute servers, network switches, storage systems, and virtualization platforms.

  • Troubleshoot end-to-end issues spanning the Linux OS, Kubernetes, and network layers.

  • Ensure security, segmentation, and compliance across compute and network environments.

  • Plan and implement scalable, highly available architectures for AI/ML platforms.

  • Maintain network diagrams, IP plans, topology maps, and runbooks.


Required Skills & Qualifications

    • 7+ years of experience in Linux system administration (RHEL, Ubuntu, CentOS).

    • Strong hands-on experience with data center networking, including:

      • L2/L3 networking fundamentals (VLANs, routing, BGP, VXLAN)

      • Spine-leaf architectures and modern DC network designs

      • High-bandwidth, low-latency networks for AI/HPC workloads

    • Proven experience managing Kubernetes clusters, with a solid understanding of Kubernetes networking concepts.

    • Experience integrating compute, storage, and networking for large-scale on-prem or hybrid data centers.

    • Working knowledge of network performance tuning, packet flow, and troubleshooting tools (tcpdump, iperf, ethtool, etc.).

    • Experience with automation tools such as Ansible, Terraform, and CI/CD pipelines.

    • Proficiency in Bash and Python scripting.

    • Strong understanding of system and network performance optimization.

    • Excellent problem-solving and cross-team collaboration skills.

Preferred / Good to Have

    • Experience with NVIDIA GPU networking, GPUDirect, RDMA, or InfiniBand environments.

    • Familiarity with HPC and distributed AI training frameworks.

    • Exposure to data center switches from vendors such as Cisco, Arista, Juniper, NVIDIA (Spectrum), etc.

    • Experience with monitoring and observability tools (Prometheus, Grafana).

    • Knowledge of hybrid cloud networking and on-prem to cloud connectivity.

    • CKA (Certified Kubernetes Administrator) or networking certifications (CCNA/CCNP or equivalent) are a plus.

