AD - DDIT IES Cloud Engineering

8 - 13 years

60 - 65 Lacs

Posted: 6 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role & responsibilities

Associate Director DDIT IES Cloud Engineering

Job Level: 5

Reports to: Director DDIT IES Cloud Engineering

ROLE PURPOSE

Responsible for designing, building, and managing a cutting-edge AI and Generative AI infrastructure based on an NVIDIA SuperPOD NV72 system, tailored for pharmaceutical business use cases. The platform will enable Biomedical Research Scientists and other business users to accelerate early molecule development and research activities by providing robust, scalable, and secure GPU computing resources.

MAJOR ACCOUNTABILITIES

  • Architect and Design: Lead the design and architecture of an NVIDIA SuperPOD-based AI infrastructure platform supporting Generative AI workloads and advanced analytics for pharma use cases such as BioNeMo, AlphaFold, ESMFold, OpenFold, ProtGPT2, and the NVIDIA Clara suite.
  • Platform Development: Implement MLOps solutions (Run:AI) on Kubernetes clusters optimized for NVIDIA GPUs.
  • Data Management: Design and implement high-performance data pipelines for large-scale genomics and chemical compound datasets.
  • Security and Compliance: Ensure robust security measures and compliance for HPC and multi-cloud environments.
  • Performance Optimization: Optimize GPU cluster performance, networking, and storage for cost-efficiency and scalability.
  • Innovation: Stay updated with NVIDIA AI infrastructure advancements and HPC trends.

TECHNICAL EXPERTISE

  • Expertise in deploying and managing GBX00 GPU-based clusters.
  • Understanding of advanced interconnect technologies for GB-series GPUs.
  • Performance tuning for multi-node GBX00 workloads using NCCL, CUDA, NVLink, NVSwitch, storage and in-band high-speed Ethernet fabric, RDMA tuning, QoS policies, and out-of-band management.
  • Redundant power and cooling systems for HPC reliability.
  • Cluster Management: NVIDIA Base Command Manager, Slurm, Kubernetes for GPU scheduling.
  • Firmware & Driver Management: CUDA, NCCL, InfiniBand drivers, GPU firmware updates.
  • EFA, NVLink and InfiniBand switches for ultra-low latency GPU cluster communication.
  • Separate Ethernet-based management network for orchestration and monitoring.
  • Parallel File Systems: Spectrum Scale (GPFS) or Lustre for high-performance distributed storage.
  • Multi-petabyte capacity with NVMe SSD tiers for scratch space and HDD tiers for archival.
  • Integration with object storage for AI datasets.
  • Monitoring & Troubleshooting: DCGM, Prometheus, Grafana for telemetry and health checks.
  • Security & Compliance: RBAC, encryption, secure multi-tenant configurations.
  • AI/ML workflow optimization, troubleshooting, and job scheduling.

QUALIFICATIONS

  • Bachelor’s degree in IT, Computer Science, or Engineering.
  • 8+ years of experience in GPU-based AI infrastructure and HPC systems.
  • Deep expertise in NVIDIA DGX systems and SuperPOD architecture.
  • Strong knowledge of containerization (Docker, Kubernetes) and DevOps practices.
  • Excellent collaboration and documentation skills.

KEY PERFORMANCE INDICATORS

  • On-time delivery of NVIDIA SuperPOD infrastructure.
  • SLA adherence for AI workloads.
  • Cost optimization and performance benchmarks.
  • Successful onboarding of pharma AI use cases.

NOVARTIS

Pharmaceutical Manufacturing

Basel, Basel-Stadt
