Administrator 4, Systems Administration

0 - 8 years

2 - 10 Lacs

Posted: 4 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Sandisk's High-Performance Computing environments are key to bringing new storage solutions to market. As a Senior High-Performance Computing (HPC) engineer in the IT Infrastructure team, you will be at the heart of Sandisk's engineering and product development process, delivering the IT HPC infrastructure and services that empower engineering teams to develop new storage technologies and deliver high-quality products to market quickly.
As a member of the HPC-as-a-Service (HPCaaS) team, you will be responsible for establishing and executing strategic objectives focused on improving the effective utilization of compute resources while meeting or exceeding customer service level agreements for job prioritization, job concurrency, and job throughput in our EDA compute clusters. This includes leading architectural innovation and pathfinding efforts to create and implement Sandisk's next-generation grid computing environment. You will be expected not only to deliver on technical requirements and solutions but also to present your solutions to senior management. Responsibilities include, but are not limited to, working as an individual contributor, a team member, and a technical team lead to explore, define, and pilot new solutions with little supervision. You will also develop solutions, scripts, and processes to automate the management of services and tools as required. In this role, you will collaborate closely with EDA and hardware design team stakeholders to define and deliver workload efficiency improvements in Sandisk's EDA HPC infrastructure globally.

Role Overview:

Join our global engineering product development team to support and enhance multi-site, high-performance computing (HPC) infrastructure and services. You will design, implement, and maintain automation solutions while driving continuous improvements in performance and reliability.

Key Responsibilities:

  • Manage and support distributed HPC environments across multiple locations, focusing on ASIC and GPU computing clusters.
  • Design, deploy, and maintain Ansible automation for HPC and Unix systems.
  • Troubleshoot complex issues within HPC clusters and file systems, performing root cause analysis and driving corrective actions.
  • Develop and maintain comprehensive documentation for HPC infrastructure.
  • Identify opportunities to automate repetitive tasks and improve system reliability.
  • Recommend and implement performance enhancements for various workloads.
  • Support a broad Electronic Design Automation (EDA) ecosystem, including licensing and workflow management.

Technical Environment:

  • Workload managers: LSF, Slurm, NC
  • EDA tools such as Cadence, Synopsys, and their workflows
  • Automation of job submissions and workload management
  • Monitoring and observability using Splunk and Grafana
  • Infrastructure: RedHat/CentOS Linux, NFS storage, automounters
  • VDI: Exceed TurboX, VNC
  • Unix/Linux authentication integrated with Active Directory
  • Infrastructure automation through scripting and open-source tools

Qualifications:

  • Bachelor's degree in Computer Science or equivalent experience
  • 10+ years of Linux systems administration, with strong expertise in RedHat/CentOS production environments
  • Proven experience w

Western Digital

Computer Hardware Manufacturing

San Jose CA
