Senior High Performance Computing Engineer

4 - 8 years

6 - 8 Lacs

Posted: 1 month ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

High-Performance Computing (HPC) infrastructure

Roles & Responsibilities

  • Implement and manage cloud-based infrastructure that supports HPC environments for data science (e.g., AI/ML workflows, image analysis).
  • Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.
  • Ensure the security, scalability, and reliability of HPC systems in the cloud.
  • Optimize cloud resources for cost-effective and efficient use.
  • Stay current with the latest cloud services and industry-standard processes.
  • Provide technical leadership and guidance in cloud and HPC systems management.
  • Develop and maintain CI/CD pipelines for deploying resources to multi-cloud environments.
  • Monitor and troubleshoot cluster operations, applications, and cloud environments.
  • Document system design and operational procedures.

Must-Have Skills

  • Expertise in Linux/Unix system administration (RHEL, CentOS, Ubuntu, etc.).
  • Proficiency with job scheduling and resource management tools (SLURM, PBS, LSF, etc.).
  • Good understanding of parallel computing, MPI, OpenMP, and GPU acceleration (CUDA, ROCm).
  • Knowledge of storage architectures and distributed file systems (Lustre, GPFS, Ceph).
  • Experience with containerization technologies (Singularity, Docker) and cloud-based HPC solutions.
  • Expertise in scripting languages (Python, Bash) and containerization technologies (Docker, Kubernetes).
  • Familiarity with automation tools (Ansible, Puppet, Chef) for system provisioning and maintenance.
  • Understanding of networking protocols, high-speed interconnects, and security best practices.
  • Demonstrable experience in cloud computing (AWS, Azure, GCP) and cloud architecture.
  • Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation, and with Git.

What We Expect of You

Expert knowledge of large Linux environments, networking, storage, and cloud-related technologies.

Good-to-Have Skills

  • Experience with Kubernetes (EKS) and service mesh architectures.
  • Knowledge of AWS Lambda and event-driven architectures.
  • Familiarity with AWS CDK, Ansible, or Packer for cloud automation.
  • Exposure to multi-cloud environments (Azure, GCP).

Basic Qualifications

  • Bachelor's degree in computer science, IT, or a related field, with 6-8 years of hands-on experience in HPC administration or a related area.

Professional Certifications (Preferred)

  • Red Hat Certified Engineer (RHCE) or Linux Professional Institute Certification (LPIC)
  • AWS Certified Solutions Architect - Associate or Professional

Soft Skills

  • Strong analytical and problem-solving skills.
  • Ability to work effectively with global, virtual teams.
  • Effective communication and collaboration with cross-functional teams.
  • Ability to work in a fast-paced, cloud-first environment.

Hyderabad / Secunderabad, Telangana, India