AI Infrastructure Systems/Solutions Architect

10 - 15 years

20 - 27 Lacs

Posted: 5 days ago | Platform: Naukri


Work Mode

Work from Office

Job Type

Full Time

Job Description

We are looking for a Systems or Solutions Architect with deep expertise in networking, infrastructure-as-a-service (IaaS), and cloud-scale system design to help architect and optimize AI/ML infrastructure.

The ideal candidate combines strong fundamentals in cloud architecture (AWS or equivalent), networking, compute, and storage, with hands-on experience in Kubernetes, observability, and automation.

You'll design scalable systems that support large AI workloads, enabling efficient training, inference, and data pipelines across distributed environments.

Key Responsibilities

  • Architect and scale AI/ML infrastructure across public cloud (AWS / Azure / GCP) and hybrid environments.
  • Design and optimize compute, storage, and network topologies for distributed training and inference clusters.
  • Build and manage containerized environments using Kubernetes, Docker, and Helm.
  • Develop automation frameworks for provisioning, scaling, and monitoring infrastructure using Python, Go, and IaC (Terraform / CloudFormation).
  • Partner with data science and MLOps teams to align AI infrastructure requirements (GPU/CPU scaling, caching, throughput, latency).
  • Implement observability, logging, and tracing using Prometheus, Grafana, CloudWatch, or OpenTelemetry.
  • Drive networking automation (BGP, routing, load balancing, VPNs, service meshes) using software-defined networking (SDN) and modern APIs.
  • Lead performance, reliability, and cost-optimization efforts for AI training and inference pipelines.
  • Collaborate cross-functionally with product, platform, and operations teams to ensure secure, performant, and resilient infrastructure.

Required Qualifications

  • Knowledge of AI/ML infrastructure patterns, including distributed training, inference pipelines, and GPU orchestration.
  • Bachelor's or Master's degree in Computer Science, Information Technology, or related field.
  • 10+ years of experience in systems, infrastructure, or solutions architecture roles.
  • Deep understanding of:
    • Cloud architecture: AWS (preferred), Azure, or GCP
    • Networking: VPC, Transit Gateway, DNS, routing, peering, load balancing, VPN
    • Compute and storage: EC2, ECS/EKS, S3, EBS, EFS, FSx, caching systems
    • Core infrastructure: virtualization, containers, distributed systems, and OS-level tuning
  • Proficiency in Linux systems engineering and scripting with Python and Bash.
  • Experience with Kubernetes (EKS/GKE/AKS) for large-scale workload orchestration.
  • Experience with Go (Golang) for infrastructure or network automation.
  • Familiarity with Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation.
  • Experience implementing monitoring and observability systems (Prometheus, Grafana, ELK, Datadog, CloudWatch).

Preferred Qualifications

  • Experience with DevOps and MLOps ecosystems (SageMaker, Kubeflow, MLflow, Airflow).
  • AWS or cloud certifications such as Solutions Architect Professional or Advanced Networking Specialty.
  • Experience in performance benchmarking, security hardening, and cost optimization for compute-intensive workloads.
  • Strong collaboration skills and ability to communicate complex infrastructure concepts clearly.

Bayone Solutions

IT Services and IT Consulting

Pleasanton CA
