IT architecture/engineering - SA (Johannesburg)

8 - 15 years

70 - 100 Lacs

Posted: 4 days ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a highly skilled AI Infrastructure Architect with deep expertise in the Azure ecosystem and a strong background in infrastructure sizing and capacity planning. You will design, size, and optimize AI/ML infrastructure to deliver robust, scalable, and cost-efficient AI solutions. Your work will ensure that our AI workloads perform efficiently, reliably, and securely while aligning with business objectives and growth projections.

Key Responsibilities

Architecture & Design

Design and architect AI/ML infrastructure solutions on Azure, ensuring scalability, high availability, and robust performance.

Select and integrate Azure services (Compute, Storage, Networking, Azure Machine Learning, Databricks, Data Lake, Synapse, AKS) to meet current and future AI workload requirements.

Sizing & Capacity Planning

Lead sizing assessments and capacity planning for AI/ML infrastructure, including compute, storage, networking, and GPU resources.

Analyze workload characteristics, growth forecasts, and usage patterns to recommend right-sized and cost-optimized Azure solutions.

Develop and maintain tools, models, or guidelines for infrastructure sizing and scaling (horizontal and vertical).

Implementation & Optimization

Oversee deployment and configuration of AI platforms and supporting infrastructure using Infrastructure-as-Code (Terraform, ARM, Bicep).

Continuously monitor and optimize infrastructure usage, performance, and cost, adjusting sizing recommendations as needed.

Data Pipeline & Integration

Design and size end-to-end data pipelines for ingesting, processing, and storing large data volumes using Azure Data Factory, Data Lake, and Synapse Analytics.

Security & Compliance

Ensure security, compliance, and data governance in all infrastructure designs and deployments.

Collaboration & Documentation

Collaborate with data science, engineering, operations, and business teams to gather requirements, provide technical leadership, and translate business needs into scalable Azure solutions.

Develop and maintain architecture diagrams, sizing documentation, and operational runbooks.

Continuous Improvement

Stay current with Azure advancements, industry best practices, and new tools relevant to AI infrastructure and capacity management.

Required Skills & Qualifications

Bachelor's or Master's degree in Computer Science, Engineering, or related field.

8+ years of experience in IT architecture, with at least 3 years focused on AI/ML infrastructure.

Deep expertise in the Azure cloud platform, including Azure Machine Learning, Azure Databricks, Azure Kubernetes Service (AKS), Azure Synapse, and Data Lake.

Strong understanding of AI/ML model lifecycle (development, training, deployment, monitoring).

Proficiency in Infrastructure-as-Code (Terraform, ARM, Bicep) and DevOps tools (Azure DevOps, GitHub Actions, CI/CD pipelines).

Experience with containerization and orchestration (Docker, Kubernetes).

Solid understanding of cloud security, identity and access management, and compliance.

Understanding of GPU, CPU, and storage sizing for different ML workloads (training vs. inference, batch vs. real-time).

Excellent communication, documentation, and stakeholder management skills.

Azure certifications (Solutions Architect, AI Engineer, or related) preferred.

Preferred Qualifications

Experience with hybrid and multi-cloud AI deployments.

Knowledge of responsible AI, model governance, and MLOps best practices, including model deployment, monitoring, and cost management.

Prior experience in telecom, financial services (including fintech), regulated industries, or large enterprise environments.

Familiarity with open-source AI/ML frameworks (TensorFlow, PyTorch, MLflow).

Experience sizing AI infrastructure for large-scale, high-throughput, or mission-critical workloads.

To proceed further, kindly share your updated resume on

