Posted: 2 days ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Hi,


We have an immediate requirement for an HPC Team Lead in Hyderabad at our organization, SHI Locuz Enterprise Solutions Pvt Ltd.


Please find the job description below:


Experience - 6+ years

Work location - Hyderabad


ROLE SUMMARY

Technology Lead – HPC


PRIMARY ROLES & RESPONSIBILITIES

  • Experience architecting and maintaining HPC/AI systems, including:
      • Linux system administration
      • Cluster management
      • System and software configuration management
      • High-speed networking
      • Resource managers and schedulers
      • High-speed parallel storage
      • Monitoring and alerting
  • Strong understanding of HPC/AI architectures and concepts.
  • Experience supporting and managing a group of HPC/AI Clusters.
  • Excellent knowledge of prototyping and deploying HPC/AI clusters.
  • Extensive experience in troubleshooting Linux OS, filesystems, and cluster hardware.
  • Good command of Linux scripting languages such as Bash, Perl, and Python.
  • Experience implementing, maintaining, and verifying defined security policies.
  • Willingness to maintain a flexible work schedule.
  • A positive attitude and willingness to help lab users succeed.
  • Excellent guidance and teamwork skills.

TECHNICAL SKILLS

  • Red Hat, Ubuntu, and SUSE operating systems
  • Cluster tools (Bright Cluster Manager, xCAT, Warewulf, OpenHPC, Rocks, etc.)
  • InfiniBand
  • Lustre, BeeGFS, and GPFS architecture and maintenance
  • Configuration management software (Ansible, Puppet)
  • Slurm, PBS, LSF, and Grid Engine schedulers
  • Spack package manager
  • Experience deploying AI servers and software stacks.
  • Experience with container technologies and orchestration tools: Docker, Singularity, Apptainer, Kubernetes.
  • Hands-on experience with AI/ML tools: TensorFlow, PyTorch, Keras, ONNX, JAX.
  • Experience in benchmarking and performance optimization of large-scale HPC/AI systems.
  • Experience with Linux and/or Windows operating systems, including file management, scripting, editing, and security.
  • Log consolidation and monitoring (Ganglia, Grafana, etc.)
  • Lifecycle and patch management experience.


SOFT SKILLS

  • Good logical reasoning and analytical skills
  • Good communication skills


OTHER SKILLS

  • Collaborative, cooperative, and committed mindset.
  • Teamwork
  • Excellent analytical and problem-solving skills.
  • Ability to work independently and within cross-functional teams.
  • Detail-oriented with good documentation practices.
  • Excellent interpersonal, communication, customer-interaction, documentation, and decision-making skills.
