AI/ML Lead / Associate Architect

4 - 8 years

0 Lacs

Posted: 1 day ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As an AI/ML Lead / Associate Architect, your main responsibilities will include:

- Designing, implementing, and optimizing end-to-end ML training workflows, from infrastructure setup to deployment and monitoring. This involves evaluating and integrating multi-cloud and single-cloud training options across platforms like AWS, as well as leading cluster configuration, orchestration design, environment customization, and scaling strategies.
- Comparing and recommending hardware options such as GPUs, TPUs, and accelerators based on factors like performance, cost, and availability.
- Having at least 4-5 years of experience in AI/ML infrastructure and large-scale training environments, being an expert in AWS cloud services, and having familiarity with Azure, GCP, and hybrid/multi-cloud setups.
- Demonstrating strong knowledge of AI/ML training frameworks like PyTorch, TensorFlow, Hugging Face, DeepSpeed, Megatron, and Ray, as well as experience with cluster orchestration tools like Kubernetes, Slurm, Ray, SageMaker, and Kubeflow.
- Possessing a deep understanding of hardware architectures for AI workloads, including NVIDIA, AMD, Intel Habana, and TPU.
- Utilizing expert knowledge of inference optimization techniques such as speculative decoding, KV cache optimization, dynamic batching, quantization methods, model parallelism strategies, and inference frameworks like vLLM, TensorRT-LLM, DeepSpeed-Inference, and TGI.
- Having hands-on experience with serving frameworks like Triton Inference Server, KServe, Ray Serve, and kernel optimization libraries like FlashAttention and xFormers.
- Demonstrating the ability to optimize inference metrics, resolve GPU memory bottlenecks, and implement hardware-specific optimizations for modern GPU architectures like A100 and H100.
- Leading the fine-tuning process of LLMs, which includes model selection, dataset preparation, tokenization, and evaluation with baseline metrics. This involves configuring and executing fine-tuning experiments on large-scale compute setups, documenting outcomes, and benchmarking against baseline models.

If you have only worked on POCs and not production-ready ML models that scale, please refrain from applying for this position.

Fission Labs

Software Development

Sunnyvale CA
