Posted: 6 hours ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Company Description

Concept Information Technologies (I) Pvt. Ltd., headquartered in Pune, is a leading IT solutions and system integration partner. We deliver scalable and cost-effective solutions in High-Performance Computing, Disaster Recovery, Enterprise Networking, Cybersecurity and Software Development. Backed by partnerships with HPE, Cisco, IBM and others, we combine industry expertise with advanced technology to drive measurable business value.


Location:


Role Description

We are seeking an experienced HPC Administrator.


Key Responsibilities:

  • Design, deploy, and maintain HPC clusters and supporting infrastructure.
  • Manage schedulers such as SLURM, PBS Pro, or LSF.
  • Manage job queues, partitions, and scheduling policies to ensure efficient workload distribution.
  • Support user issues related to job submissions, resource requests, and job scripts.
  • Proficiency with Docker, Kubernetes, and other container technologies.
  • Maintain containerized environments using Docker and Enroot.
  • Optimize cluster performance and resource utilization.
  • Develop and manage workflows for submitting container-based jobs to compute clusters using SLURM or similar job schedulers.
  • Demonstrated proficiency in supporting and troubleshooting third-party HPC software.
  • Compile and deploy open-source software.
  • Integrate GPU-based workloads with SLURM, PBS, or similar job scheduling systems.
  • Understanding of MPI, including Intel MPI.
  • Understanding of user authentication methods such as IPA/IdM, NIS, and LDAP.
  • Expert knowledge of parallel distributed file systems such as Lustre, IBM GPFS, and BeeGFS.
  • Implement backup, disaster recovery, and monitoring solutions.
  • Ability to deploy open-source and commercial HPC platforms.
  • Support application teams with MPI libraries, parallel processing, and GPU setups.
  • Automate repetitive tasks through scripting.
  • Create and maintain detailed technical documentation.
  • Mentor junior team members and collaborate on solution design.


Required Skills:

  • 7-8 years of experience in HPC administration or Linux systems engineering.
  • Strong expertise in cluster management, tuning, and performance optimization.
  • Experience with storage systems, networking (InfiniBand, Ethernet), and monitoring tools (Grafana, Prometheus).
  • Proficiency in shell scripting, Python, or automation tools.
  • Knowledge of MPI, CUDA, or GPU computing is an advantage.
  • Excellent troubleshooting and communication skills.

