AI Infra Architect

15 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

What You’ll Do

Architect the Future: Lead the end-to-end design and development of AI infrastructure, encompassing hardware, software, networking, and multi-cloud environments.

Innovate and Evaluate: Assess, select, and implement best-in-class technologies, tools, and frameworks (e.g., TensorFlow, PyTorch, Kubernetes, Docker) to build and maintain AI platforms.

Optimize for Performance: Engineer and implement scalable infrastructure that meets evolving AI/ML needs, continuously monitoring and optimizing for performance and cost-efficiency.

Champion Security and Compliance: Define and enforce infrastructure standards and best practices, ensuring compliance with security policies, data protection regulations, and ethical AI principles.


Build Data-Driven Pipelines: Collaborate on the architecture and implementation of efficient data pipelines for AI models, covering ingestion, storage, processing, and management.

Lead and Inspire: Provide technical leadership and mentorship to cross-functional teams, fostering a culture of excellence and best practices in AI infrastructure.

Solve Complex Challenges: Diagnose and resolve complex infrastructure issues to ensure high availability and reliability of AI systems.

Stay Ahead of the Curve: Keep up with advancements in AI, machine learning, and cloud computing to drive innovation within the organization.

Document for Success: Create and maintain comprehensive documentation for AI infrastructure designs, implementations, and operational procedures.

What You’ll Bring


Education

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.


Experience

15+ years of experience in infrastructure architecture.

At least 3–5 years dedicated to designing and building AI-specific infrastructure.

Proven success in deploying scalable and secure AI solutions in cloud environments.

Extensive hands-on experience with containerization and orchestration tools like Docker and Kubernetes.


Technical Skills

Proficiency with command-line operations and experience in both cloud-native and on-premises data center deployments.

Strong understanding of deep learning architectures and the latest advancements in Large Language Models (LLMs).

Expertise in NVIDIA hardware/software, including performance tuning and diagnostics.

Hands-on experience with GPU systems, including performance testing, tuning, and benchmarking.

Proficiency in programming languages such as Python.

In-depth knowledge of cloud service models (IaaS, PaaS, SaaS) and cloud-native architectures.

Strong background in networking, storage, and security best practices in a cloud context.

Experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.

Familiarity with DevOps and MLOps principles and practices.


Soft Skills

Exceptional problem-solving and analytical skills with a data-driven approach.

Excellent communication and interpersonal skills, capable of conveying complex technical concepts to diverse audiences.

Proven ability to lead, mentor, and collaborate effectively in team environments.

Strategic mindset with the ability to align technical solutions to business goals.

Proactive, adaptable, and committed to continuous learning in a fast-evolving technology landscape.
