Principal Engineer, Data Engineering

8 - 13 years

10 - 15 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

  1. System Design and Deployment:

    • Designing and deploying high-performance computing clusters and systems based on organizational requirements and industry best practices.
    • Configuring hardware components, network and storage systems to optimize performance and reliability.
  2. System Maintenance and Monitoring:

    • Performing routine maintenance tasks such as software updates, patches, and system upgrades to ensure optimal performance and security.
    • Monitoring system performance, resource utilization, and capacity planning to proactively address potential issues and bottlenecks.
  3. User Support and Training:

    • Providing technical support and troubleshooting assistance to users of the HPC systems.
    • Developing and delivering training sessions to educate users on best practices, usage guidelines, and efficient utilization of HPC resources.
  4. Security and Compliance:

    • Implementing and maintaining security protocols, access controls, and data protection measures to safeguard HPC infrastructure and sensitive data.
    • Ensuring compliance with relevant regulatory requirements and organizational policies related to HPC operations.
  5. Documentation and Reporting:

    • Creating and maintaining comprehensive documentation including system configurations, operational procedures, and troubleshooting guides.
    • Generating regular reports on system performance, usage statistics, and operational metrics for management and stakeholders.

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
  • Proven experience (8+ years) as an HPC Administrator or in a similar role managing HPC systems in a production environment.
  • Proficiency in configuring and managing HPC cluster software such as Slurm, NC, LSF or Grid Engine.
  • Strong knowledge of Linux/Unix system administration and shell scripting.
  • Experience with NFS, storage systems (NetApp/Isilon), and backup management in HPC environments.
  • Familiarity with networking principles, including TCP/IP, VLANs, and InfiniBand.
  • Excellent analytical and problem-solving skills with the ability to troubleshoot complex issues independently.
  • Strong communication skills and the ability to collaborate effectively with cross-functional, cross-geography teams and end users.

Preferred Skills:

  • Bachelor's degree in Computer Science, Engineering, or a related discipline.
  • Experience with or certification in HPC technologies (e.g., HPC Systems Professional, Cray Certified System Administrator).
  • Knowledge of containerization technologies (e.g., Docker, Singularity) and workload orchestration frameworks (e.g., Kubernetes) is a plus.
  • Knowledge of scripting and automation tools commonly used in Unix administration, such as shell and Ansible, is a plus.
  • Knowledge of Dell/Cisco UCS servers in HPC environments.
  • Semiconductor domain experience is a must.

Western Digital

Computer Hardware Manufacturing

San Jose CA
