8 - 13 years
2 Lacs
Posted: 3 days ago
Work from Office
Full Time
PS- Global Competency Center
Hewlett Packard Enterprise
Job Title: Lead Solutions Architect, AI Infrastructure & Private Cloud
Job Description:
We are seeking an experienced Lead Solutions Architect with deep expertise in AI/ML infrastructure, High Performance Computing (HPC), and container platforms to join our dynamic team focused on delivering HPE Private Cloud AI and Enterprise AI Factory solutions. This role is instrumental in architecting, deploying, and optimizing private cloud environments that leverage HPE’s co-developed solutions with NVIDIA, as well as validated HPE reference architectures, to support enterprise-grade AI workloads at scale.
The ideal candidate will bring strong technical expertise in AI infrastructure, container orchestration platforms, and hybrid cloud environments, and will play a key role in delivering scalable, secure, and high-performance AI platform solutions powered by HPE GreenLake and NVIDIA AI Enterprise technologies.
Key Responsibilities:
6. Team Collaboration:
Collaborate with cross-functional teams, including data science teams and subject matter experts in infrastructure components such as HPE servers, storage, and networking, to ensure cohesive and integrated solution delivery.
Mentor technical consultants and contribute to internal knowledge sharing through tech talks and innovation forums.
Required Skills:
1. HPC & AI Infrastructure
Extensive knowledge of HPC technologies and workload schedulers such as Slurm and/or Altair PBS Pro.
Proficient in HPC cluster management tools, including HPE Performance Cluster Manager (HPCM) and/or NVIDIA Base Command Manager.
Good understanding of high-speed networking technologies and stacks (InfiniBand, Mellanox, Ethernet) and of performance tuning of HPC components.
2. Containerization & Orchestration
Extensive hands-on experience with containerization technologies such as Docker, Podman, and Singularity.
Proficiency with at least two container orchestration platforms: CNCF Kubernetes, Red Hat OpenShift, SUSE Rancher (RKE/K3s), or Canonical Charmed Kubernetes.
Strong understanding of GPU technologies, including the NVIDIA GPU Operator for Kubernetes-based environments and DCGM (Data Center GPU Manager) for GPU health and performance monitoring.
3. Operating Systems & Virtualization
Extensive experience in Linux system administration, including package management, boot process troubleshooting, performance tuning, and network configuration.
Proficient with multiple Linux distributions, with hands-on expertise in at least two of the following: RHEL, SLES, and Ubuntu.
Experience with virtualization technologies, including KVM and OpenShift Virtualization, for deploying and managing virtualized workloads in hybrid cloud environments.
4. Cloud, DevOps & MLOps
Solid understanding of hybrid cloud architectures and experience working with major cloud platforms in conjunction with on-premises infrastructure.
Familiarity with DevOps practices, including CI/CD pipelines, infrastructure as code (IaC), and microservices-based application delivery.
Experience integrating and operationalizing open-source AI/ML tools and frameworks, supporting the full model lifecycle from development to deployment.
Good understanding of cloud-native security, observability, and compliance frameworks, ensuring secure and reliable AI/ML operations at scale.
5. Networking & Protocols
Strong understanding of core networking principles, including DNS, TCP/IP, routing, and load balancing, essential for designing resilient and scalable infrastructure.
Working knowledge of key storage and file access protocols, such as S3, NFS, and SMB/CIFS, for data access, transfer, and integration across hybrid environments.
6. Programming & Automation
Proficiency in scripting or programming languages such as Python and Bash.
Experience automating infrastructure and AI workflows.
7. Soft Skills & Leadership
Excellent problem-solving, analytical thinking, and communication skills for engaging both technical and non-technical stakeholders.
Proven ability to lead complex technical projects from requirements gathering through architecture, design, and delivery.
Strong business acumen with the ability to align technical solutions with client challenges and objectives.
Qualifications:
Algoleap Technologies