
13 Deepspeed Jobs

Set up a Job Alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

5.0 - 9.0 years

0 Lacs

Haryana

On-site

The role requires you to lead collaboration with ML Engineers and DevOps Engineers to formulate AI designs that can be built, tested, and deployed through the Route to Live and into Production using continuous integration/deployment.

You will be responsible for Model Development & Deployment, including model fine-tuning using open-source libraries such as DeepSpeed, Hugging Face Transformers, JAX, PyTorch, and TensorFlow to enhance model performance. You will also deploy and manage Large Language Models (LLMs) on cloud platforms, train and refine LLMs, and scale LLMs up and down while ensuring blue/green deployments and rolling back bad releases.

Your tasks will also involve Data Management & Pipeline Operations, such as curating and preparing training data, monitoring data quality, transforming and aggregating data, building vector databases, and making data visible and shareable across teams.

Monitoring & Evaluation will be a crucial part of the role: you will track LLM performance, identify errors, optimize models, and create model and data monitoring pipelines with alerts for model drift and malicious user behavior.

Infrastructure & DevOps tasks include continuous integration and delivery (CI/CD), managing infrastructure for distributed model training using tools like SageMaker, Ray, and Kubernetes, and deploying ML models using containerization technologies like Docker.

Required Technical Skills include proficiency in programming languages like Python; frameworks like PyTorch and TensorFlow; expertise in cloud platforms such as AWS, Azure, or GCP; experience with containerization technologies; and familiarity with LLM-specific technologies such as vector databases, prompt engineering, and fine-tuning techniques.

The position is located at DGS India in Pune, Baner, and falls under the brand Merkle. It is a full-time role with a permanent contract.
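As a concrete reference for the fine-tuning stack this posting describes, here is a minimal sketch of a Hugging Face Trainer run with a DeepSpeed config attached. This is not the employer's actual pipeline: the base model, the toy dataset, and the `ds_config.json` path are illustrative assumptions.

```python
# A minimal sketch, assuming a small causal-LM fine-tune where sharding and
# optimizer settings live in a separate DeepSpeed JSON config.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for whatever base model the team uses
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True,
                    padding="max_length", max_length=64)
    out["labels"] = out["input_ids"].copy()  # causal-LM objective
    return out

data = Dataset.from_dict({"text": ["example training sentence"] * 16})
data = data.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    deepspeed="ds_config.json",  # hypothetical ZeRO config supplied separately
)
Trainer(model=model, args=args, train_dataset=data).train()
```

In practice such a script is launched with the `deepspeed` CLI (or `torchrun`) so the config's ZeRO stage and batch sizes apply across GPUs.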

Posted 2 days ago

Apply

7.0 - 10.0 years

30 - 35 Lacs

Pune

Hybrid

Generative AI Practice Head
Location: Pune, Maharashtra (Hybrid)
Experience: 7-10 years

About the Role
Evonence is seeking a Generative AI Practice Head to spearhead its AI initiatives, driving innovation, shaping strategy, and delivering production-ready generative AI solutions. This leadership role requires a blend of deep research expertise, hands-on engineering skills, and team mentorship capabilities.

Key Responsibilities
- Lead the design, development, and deployment of state-of-the-art generative AI systems.
- Stay ahead of the curve by evaluating and adopting cutting-edge AI research (e.g., NeurIPS, CVPR, ICLR).
- Serve as the authority and visionary for the company's generative AI direction.
- Architect and optimize production-grade AI models using transformers, GANs, and diffusion models.
- Ensure performance, scalability, and security of deployed solutions, leveraging distributed platforms.
- Build and fine-tune large-scale multimodal AI models for NLP, computer vision, or cross-domain tasks.
- Implement Responsible AI principles, mitigating bias, enhancing explainability, and ensuring fairness.
- Act as a bridge between research and product teams, ensuring smooth integration of AI capabilities.
- Mentor junior engineers, foster cross-functional collaboration, and shape the organization's AI roadmap.

Required Skills & Qualifications
- Master's or PhD in AI/ML, Computer Science, or related fields.
- Proven expertise in transformer architectures, GANs, and diffusion models.
- Strong programming experience in Python, TensorFlow, JAX, PyTorch.
- Hands-on experience with distributed training and scalable deployment.
- Solid foundation in NLP, computer vision, or multimodal AI applications.

Good to Have
- Experience with Google Cloud Platform (GCP) and related AI services.
- Research publications or patents in top-tier AI conferences/journals.
- Contributions to open-source AI libraries (e.g., HuggingFace, TensorFlow, Diffusers).
- Familiarity with AI ethics frameworks and tools like Fairness Indicators and Explainable AI (XAI).

Posted 5 days ago

Apply

2.0 - 6.0 years

0 Lacs

Maharashtra

On-site

As a member of our team, your primary responsibility will be the development and training of foundational models across various modalities. You will be involved in the end-to-end lifecycle of foundational model development, from data curation to model deployment, collaborating closely with core team members. Your role will also entail conducting research to enhance model accuracy and efficiency, as well as implementing state-of-the-art AI techniques in Text/Speech and language processing.

Collaboration with cross-functional teams will be essential as you work towards building robust AI stacks and seamlessly integrating them into production pipelines. You will be expected to develop pipelines for debugging, CI/CD, and observability throughout the development process. Demonstrating your ability to lead projects and offer innovative solutions will be crucial, along with documenting technical processes, model architectures, and experimental results, while maintaining clear and organized code repositories.

Ideally, you should hold a Bachelor's or Master's degree in a related field and possess 2 to 5 years of industry experience in applied AI/ML. Proficiency in Python programming is a must, along with familiarity with a selection of tools such as TensorFlow, PyTorch, HF Transformers, NeMo, SLURM, Ray, PyTorch DDP, DeepSpeed, NCCL, Git, DVC, MLFlow, W&B, KubeFlow, Dask, Milvus, Apache Spark, NumPy, Whisper, Voicebox, VALL-E, HuBERT/UnitSpeech, LLMOps tools, Docker, DSPy, LangGraph, LangChain, and LlamaIndex.

If you are passionate about AI and eager to contribute to cutting-edge projects in the field, we welcome your application to join our dynamic team.
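Among the tools listed, PyTorch DDP is the most common entry point for distributed training. Below is a minimal, self-contained sketch, assuming a host with NCCL-capable GPUs and a `torchrun` launch; the tiny linear model and random data are placeholders.

```python
# A minimal PyTorch DDP sketch; run with: torchrun --nproc_per_node=2 ddp_sketch.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")           # torchrun sets rank/world size
    device = dist.get_rank() % torch.cuda.device_count()
    model = torch.nn.Linear(128, 10).to(device)
    ddp_model = DDP(model, device_ids=[device])
    opt = torch.optim.AdamW(ddp_model.parameters(), lr=1e-3)

    x = torch.randn(32, 128, device=device)   # stand-in batch per rank
    y = torch.randint(0, 10, (32,), device=device)
    loss = torch.nn.functional.cross_entropy(ddp_model(x), y)
    loss.backward()                            # gradients all-reduced across ranks
    opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```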

Posted 1 week ago

Apply

3.0 - 7.0 years

0 Lacs

Vadodara, Gujarat

On-site

Dharmakit Networks is a premium global IT solutions partner dedicated to innovation and success worldwide. Specializing in website development, SaaS, digital marketing, AI solutions, and more, we help brands turn their ideas into high-impact digital products. Known for blending global standards with deep Indian insight, we are now stepping into our most exciting chapter yet.

Project Ax1 is our next-generation Large Language Model (LLM), a powerful AI initiative designed to make intelligence accessible and impactful for Bharat and the world. Built by a team of AI experts, Dharmakit Networks is committed to developing cost-effective, high-performance AI tailored for India and beyond, enabling enterprises to unlock new opportunities and drive deeper connections. Join us in reshaping the future of AI, starting from India.

As a GPU Infrastructure Engineer, you will be at the core of building, optimizing, and scaling the GPU and AI compute infrastructure that powers Project Ax1.

**Key Responsibilities:**
- Design, deploy, and optimize GPU infrastructure for large-scale AI workloads.
- Manage GPU clusters across cloud (AWS, Azure, GCP) and on-prem setups.
- Set up and maintain model CI/CD pipelines for efficient training and deployment.
- Optimize LLM inference using TensorRT, ONNX, Nvidia NVCF, etc. (a minimal export sketch follows this posting).
- Manage offline/edge deployments of AI models (e.g., CUDA, Lambda, containerized AI).
- Build and tune data pipelines to support real-time and batch processing.
- Monitor model and infrastructure performance for availability, latency, and cost efficiency.
- Implement logging, monitoring, and alerting using Prometheus, Grafana, ELK, CloudWatch.
- Work closely with AI, ML, backend, and full-stack teams to ensure seamless model delivery.

**Must-Have Skills and Qualifications:**
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Hands-on experience with Nvidia GPUs, CUDA, and deep learning model deployment.
- Strong experience with AWS, Azure, or GCP GPU instance setup and scaling.
- Proficiency in model CI/CD and automated ML workflows.
- Experience with Terraform, Kubernetes, and Docker.
- Familiarity with offline/edge AI, including quantization and optimization.
- Logging and monitoring using tools like Prometheus, Grafana, CloudWatch.
- Experience with backend APIs, data processing workflows, and ML pipelines.
- Experience with Git and collaboration in agile, cross-functional teams.
- Strong analytical and debugging skills.
- Excellent communication, teamwork, and problem-solving abilities.

**Good To Have:**
- Experience with Nvidia NVCF, DeepSpeed, vLLM, Hugging Face Triton.
- Knowledge of FP16/INT8 quantization, pruning, and other optimization tricks.
- Exposure to serverless AI inference (Lambda, SageMaker, Azure ML).
- Contributions to open-source AI infrastructure projects or a strong GitHub portfolio showcasing ML model deployment expertise.
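A first step in the TensorRT/ONNX optimization work this posting describes is exporting a trained PyTorch module to ONNX. A minimal sketch, with a toy model standing in for a real LLM (which needs a more involved export path):

```python
# A minimal sketch: export a PyTorch module to ONNX as the entry point for
# downstream engines such as TensorRT. The tiny MLP is a placeholder.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 8),
)
model.eval()

dummy = torch.randn(1, 64)  # example input fixes shapes for the export trace
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size at runtime
)
```

The resulting `model.onnx` can then be fed to a TensorRT builder or served from ONNX Runtime, which is where the quantization and latency work begins.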

Posted 1 week ago

Apply

0.0 years

0 Lacs

India

On-site

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences: the building blocks for the data center, artificial intelligence, PCs, gaming, and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. AMD together we advance_

THE ROLE:
The AI Models team is looking for exceptional machine learning scientists and engineers to explore and innovate on training and inference techniques for large language models (LLMs), large multimodal models (LMMs), image/video generation, and other foundation models. You will be part of a world-class research and development team focusing on efficient and scalable pre-training, instruction tuning, alignment, and optimization. As an early member of the team, you can help shape the direction and strategy to fulfill this important charter.

THE PERSON:
This role is for you if you are passionate about reading the latest literature, coming up with novel ideas, and implementing them in high-quality code to push the boundaries of scale and performance. The ideal candidate will have both theoretical expertise and hands-on experience developing LLMs, LMMs, and/or diffusion models. We are looking for someone familiar with hyper-parameter tuning methods, data preprocessing and encoding techniques, and distributed training approaches for large models.

KEY RESPONSIBILITIES:
- Pre-train and post-train models over large GPU clusters while optimizing for various trade-offs.
- Improve upon the state of the art in generative AI model architectures, data, and training techniques.
- Accelerate training and inference speed across AMD accelerators.
- Build agentic frameworks to solve various kinds of problems.
- Publish your research at top-tier conferences and workshops and/or through technical blogs.
- Engage with academia and open-source ML communities.
- Drive continuous improvement of infrastructure and the development ecosystem.

PREFERRED EXPERIENCE:
- Strong development and debugging skills in Python.
- Experience with deep learning frameworks (like PyTorch or TensorFlow) and distributed training tools (like DeepSpeed or PyTorch Distributed).
- Experience with fine-tuning methods (like RLHF and DPO) as well as parameter-efficient techniques (like LoRA and DoRA; see the sketch after this posting).
- Solid understanding of various types of transformers and state space models.
- Strong publication record in top-tier conferences, workshops, or journals.
- Solid communication and problem-solving skills.
- Passion for learning new developments in this domain and innovating on top of them.

ACADEMIC CREDENTIALS:
An advanced degree (Master's or PhD) in machine learning, computer science, artificial intelligence, or a related field is expected. Exceptional Bachelor's degree candidates may also be considered.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.
AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
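The parameter-efficient techniques this posting names (LoRA, DoRA) are typically prototyped with the PEFT library. A minimal sketch, assuming GPT-2 as an illustrative base model (`c_attn` is its fused attention projection):

```python
# A minimal LoRA sketch with Hugging Face PEFT: freeze the base model and
# train only low-rank adapters injected into the attention projection.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
config = LoraConfig(
    r=8,                      # rank of the adapter matrices
    lora_alpha=16,            # scaling factor applied to the adapter output
    target_modules=["c_attn"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapters are trainable
```

From here the wrapped model drops into a standard Trainer or custom loop; the frozen base keeps memory and compute costs a small fraction of full fine-tuning.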

Posted 1 week ago

Apply

0.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

About the Role
We are seeking a highly motivated and technically skilled Data Scientist to join our AI/ML Innovation team. This role focuses on developing cutting-edge solutions using graph-based Retrieval-Augmented Generation (RAG) and fine-tuning Large Language Models (LLMs) to solve complex business problems and enhance decision-making capabilities. You will work closely with cross-functional teams, including data engineers, ML researchers, and product managers, to design, implement, and optimize intelligent systems that leverage structured knowledge graphs and advanced NLP techniques.

Key Responsibilities
- Design and implement Graph RAG pipelines for enterprise-scale knowledge retrieval and contextual generation (a minimal sketch follows this posting).
- Fine-tune LLMs (e.g., GPT, LLaMA, Mistral) using domain-specific datasets to improve performance on targeted tasks.
- Develop and maintain knowledge graphs and embeddings for semantic search and reasoning.
- Collaborate with engineering teams to deploy scalable AI solutions in production environments.
- Conduct experiments and evaluations to benchmark model performance and ensure robustness.
- Stay up to date with the latest research in NLP, graph neural networks, and LLM architectures.

Required Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
- Experience in applied machine learning or NLP.
- Hands-on experience with LLM fine-tuning using frameworks like Hugging Face Transformers, DeepSpeed, or LoRA.
- Strong understanding of graph databases (e.g., Neo4j, RDF, DGL) and knowledge graph construction.
- Proficiency in Python and ML libraries (PyTorch, TensorFlow, scikit-learn).
- Familiarity with cloud platforms (AWS, Azure, GCP) and MLOps practices.

About State Street
What we do. State Street is one of the largest custodian banks, asset managers, and asset intelligence companies in the world. From technology to product innovation, we're making our mark on the financial services industry. For more than two centuries, we've been helping our clients safeguard and steward the investments of millions of people. We provide investment servicing, data & analytics, investment research & trading, and investment management to institutional clients.

Work, Live and Grow. We make all efforts to create a great work environment. Our benefits packages are competitive and comprehensive. Details vary by location, but you may expect generous medical care, insurance and savings plans, among other perks. You'll have access to flexible Work Programs to help you match your needs. And our wealth of development programs and educational support will help you reach your full potential.

Inclusion, Diversity and Social Responsibility. We truly believe our employees' diverse backgrounds, experiences and perspectives are a powerful contributor to creating an inclusive environment where everyone can thrive and reach their maximum potential while adding value to both our organization and our clients. We warmly welcome candidates of diverse origin, background, ability, age, sexual orientation, gender identity and personality. Another fundamental value at State Street is active engagement with our communities around the world, both as a partner and a leader. You will have tools to help balance your professional and personal life, paid volunteer days, matching gift programs and access to employee networks that help you stay connected to what matters to you.

State Street is an equal opportunity and affirmative action employer.
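A minimal Graph RAG sketch under stated assumptions, not this team's actual pipeline: structured facts come from a Neo4j graph, passages from a vector index, and both are stitched into an LLM prompt. The Cypher query, connection details, and the two stub functions are illustrative placeholders.

```python
# Graph RAG sketch: combine one-hop graph facts with vector-retrieved passages.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def vector_search(question: str, k: int = 3) -> list:
    return ["placeholder passage"]  # stand-in for a real vector-index lookup

def generate(prompt: str) -> str:
    return "placeholder answer"     # stand-in for a real LLM client call

def graph_context(entity: str) -> str:
    # Pull one-hop neighbors of the entity as relation triples.
    with driver.session() as session:
        records = session.run(
            "MATCH (e {name: $name})-[r]->(n) "
            "RETURN type(r) AS rel, n.name AS neighbor LIMIT 5",
            name=entity,
        )
        return "; ".join(f"{entity} {r['rel']} {r['neighbor']}" for r in records)

def answer(question: str, entity: str) -> str:
    facts = graph_context(entity)                  # structured graph facts
    passages = "\n".join(vector_search(question))  # unstructured context
    prompt = f"Facts: {facts}\nPassages: {passages}\nQuestion: {question}"
    return generate(prompt)
```

The design point of Graph RAG over plain RAG is that the graph supplies explicit, auditable relations alongside fuzzy passage retrieval, which matters in regulated domains like this one.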

Posted 2 weeks ago

Apply

0.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Description
By applying to this position, your application will be considered for all locations we hire for in the United States.

Annapurna Labs designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.

Role
AWS Neuron is the complete software stack for AWS Trainium (Trn1/Trn2) and Inferentia (Inf1/Inf2), our cloud-scale machine learning accelerators. This role is for a Machine Learning Engineer on one of our AWS Neuron teams:

The ML Distributed Training team works side by side with chip architects, compiler engineers, and runtime engineers to create, build, and tune distributed training solutions with Trainium instances. Experience training these large models using Python is a must. FSDP (Fully Sharded Data Parallel), DeepSpeed, NeMo, and other distributed training libraries are central to this work, and extending all of them for the Neuron-based system is key (a minimal FSDP sketch follows this posting).

The ML Frameworks team partners with compiler, runtime, and research experts to make AWS Trainium and Inferentia feel native inside the tools builders already love: PyTorch, JAX, and the rapidly evolving vLLM ecosystem. By weaving the Neuron SDK deep into these frameworks, optimizing operators, and crafting targeted extensions, we unlock every teraflop of Annapurna's AI chips for both training and lightning-fast inference. Beyond kernels, we shape next-generation serving by upstreaming new features and driving scalable deployments with vLLM, Triton, and TensorRT, turning breakthrough ideas into production-ready AI for millions of customers.

The ML Inference team collaborates closely with hardware designers, software optimization experts, and systems engineers to develop and optimize high-performance inference solutions for Inferentia chips. Proficiency in deploying and optimizing ML models for inference using frameworks like TensorFlow, PyTorch, and ONNX is essential. The team focuses on techniques such as quantization, pruning, and model compression to enhance inference speed and efficiency. Adapting and extending popular inference libraries and tools for Neuron-based systems is a key aspect of their work.

Key job responsibilities
You'll join one of our core ML teams - Frameworks, Distributed Training, or Inference - to enhance machine learning capabilities on AWS's specialized AI hardware. Your responsibilities will include improving PyTorch and JAX for distributed training on Trainium chips, optimizing ML models for efficient inference on Inferentia processors, and collaborating with compiler and runtime teams to maximize hardware performance. You'll also develop and integrate new features in ML frameworks to support AWS AI services. We seek candidates with strong programming skills, eagerness to learn complex systems, and basic ML knowledge. This role offers growth opportunities in ML infrastructure, bridging the gap between frameworks, distributed systems, and hardware acceleration.

About The Team
Annapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, think of Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. Our products include AWS Nitro, ENA, EFA, Graviton and F1 EC2 instances, AWS Neuron, Inferentia and Trainium ML accelerators, and storage with scalable NVMe.

Basic Qualifications
- To qualify, applicants should have earned (or will earn) a Bachelor's or Master's degree between December 2022 and September 2025.
- Working knowledge of C++ and Python.
- Experience with ML frameworks, particularly PyTorch, JAX, and/or vLLM.
- Understanding of parallel computing concepts and CUDA programming.

Preferred Qualifications
- Experience in using analytical tools, such as Tableau, QlikView, QuickSight.
- Experience in building and driving adoption of new tools.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

Company - Annapurna Labs (U.S.) Inc.
Job ID: A3029797
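FSDP, which the Distributed Training team names as central, shards parameters, gradients, and optimizer state across ranks instead of replicating them. A minimal PyTorch sketch, assuming a multi-GPU host and a `torchrun` launch; the small Transformer is a placeholder:

```python
# A minimal FSDP sketch; run with: torchrun --nproc_per_node=2 fsdp_sketch.py
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Transformer(
    d_model=256, num_encoder_layers=2, num_decoder_layers=2
).cuda()
model = FSDP(model)  # parameters sharded across ranks, gathered per layer
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

src = torch.randn(10, 4, 256).cuda()  # (seq, batch, dim) stand-in inputs
tgt = torch.randn(10, 4, 256).cuda()
loss = model(src, tgt).sum()          # toy loss; real training uses a criterion
loss.backward()
opt.step()
dist.destroy_process_group()
```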

Posted 2 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

Maharashtra

On-site

We are looking for a skilled and enthusiastic Applied AI/ML Engineer to join our team. As an Applied AI/ML Engineer, you will lead the entire process of foundational model development, focusing on cutting-edge generative AI techniques. Your main objective will be to implement learning methods that are efficient in data and compute, specifically addressing challenges relevant to the Indian scenario.

Your tasks will involve optimizing model training and inference pipelines, deploying production-ready models, ensuring scalability through distributed systems, and fine-tuning models for domain adaptation. Collaboration with various teams will be essential as you work towards building strong AI stacks and seamlessly integrating them into production pipelines. Beyond conducting research and experiments, you will be crucial in converting advanced models into operational systems that generate tangible results. Your leadership will involve working closely with technical team members and subject matter experts, documenting technical processes, and maintaining well-structured codebases to encourage innovation and reproducibility. This position is perfect for proactive individuals who are passionate about spearheading significant advancements in generative AI and implementing scalable solutions for real-world impact.

Your responsibilities will include:
- Developing and training foundational models across different modalities
- Managing the end-to-end lifecycle of foundational model development, from data curation to model deployment, through collaboration with core team members
- Conducting research to enhance model accuracy and efficiency
- Applying state-of-the-art AI techniques in Text/Speech and language processing
- Collaborating with cross-functional teams to construct robust AI stacks and smoothly integrate them into production pipelines
- Creating pipelines for debugging, CI/CD, and observability of the development process
- Demonstrating project leadership and offering innovative solutions
- Documenting technical processes, model architectures, and experimental outcomes, while maintaining clear and organized code repositories

To be eligible for this role, you should hold a Bachelor's or Master's degree in a related field and possess 2 to 5 years of industry experience in applied AI/ML. Minimum requirements include proficiency in Python programming and familiarity with 3-4 tools from the list below (a small vLLM example follows this list):
- Foundational model libraries and frameworks (TensorFlow, PyTorch, HF Transformers, NeMo, etc.)
- Distributed training (SLURM, Ray, PyTorch DDP, DeepSpeed, NCCL, etc.)
- Inference servers (vLLM)
- Version control systems and observability (Git, DVC, MLFlow, W&B, KubeFlow)
- Data analysis and curation tools (Dask, Milvus, Apache Spark, NumPy)
- Text-to-Speech tools (Whisper, Voicebox, VALL-E (X), HuBERT/UnitSpeech)
- LLMOps tools, Docker, etc.
- AI application libraries and frameworks (DSPy, LangGraph, LangChain, LlamaIndex, etc.)
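The list names vLLM as the inference server; a minimal offline-inference sketch, with an illustrative model name:

```python
# A minimal vLLM offline-inference sketch; the model id is a small placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What is distributed training?"], params)
for out in outputs:
    print(out.outputs[0].text)  # first completion for each prompt
```

The same engine serves online traffic via its OpenAI-compatible HTTP server; continuous batching and paged KV-cache memory are what make it the default choice for LLM serving here.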

Posted 4 weeks ago

Apply

3.0 - 5.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Responsibilities
- Design and fine-tune LLMs (Large Language Models) for BFSI use cases: intelligent document processing, report generation, chatbots, advisory tools.
- Evaluate and apply prompt engineering, retrieval-augmented generation (RAG), and fine-tuning methods.
- Implement safeguards, red-teaming, and audit mechanisms for LLM usage in BFSI (a minimal guardrail sketch follows this posting).
- Work with data privacy, legal, and compliance teams to align GenAI outputs with industry regulations.
- Collaborate with enterprise architects to integrate GenAI into existing digital platforms.

Qualifications
- 3-5 years in AI/ML; 1-3 years hands-on in GenAI/LLM-based solutions.
- BFSI-specific experience in document processing, regulatory reporting, or virtual agents using GenAI is highly preferred.
- Exposure to prompt safety, model alignment, and RAG pipelines is critical.

Essential Skills / Tech Stack
- LLMs: GPT (OpenAI), Claude, LLaMA, Mistral, Falcon
- Tools: LangChain, LlamaIndex, Pinecone, Weaviate
- Frameworks: Transformers (Hugging Face), PEFT, DeepSpeed
- APIs: OpenAI, Cohere, Anthropic, Azure OpenAI
- Cloud: GCP GenAI Studio, GCP Vertex AI
- Others: Prompt engineering, RAG, vector databases, role-based guardrails
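As a toy illustration of the safeguard and audit mechanisms this posting asks for, here is a minimal sketch of wrapping LLM calls with pre/post filters and an audit trail. This is not a production control: `call_llm` is a stub for a real API client, and the PII patterns are illustrative assumptions.

```python
# A minimal guardrail sketch: redact PII-like strings before and after the
# LLM call, and keep an audit record of every exchange.
import re

PII_PATTERNS = [
    re.compile(r"\b\d{12}\b"),              # e.g., Aadhaar-like 12-digit numbers
    re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),  # e.g., PAN-like identifiers
]

def call_llm(prompt: str) -> str:
    return "stub answer"  # stand-in for a real LLM API client

def redact(text: str) -> str:
    for pat in PII_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text

def guarded_answer(question: str, audit_log: list) -> str:
    safe_question = redact(question)   # pre-filter inputs
    answer = call_llm(safe_question)
    safe_answer = redact(answer)       # post-filter outputs
    audit_log.append({"q": safe_question, "a": safe_answer})  # audit trail
    return safe_answer
```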

Posted 1 month ago

Apply

2.0 - 10.0 years

0 Lacs

Coimbatore, Tamil Nadu

On-site

You should have 3 to 10 years of experience in AI development and be located in Coimbatore. Immediate joiners are preferred. A minimum of 2 years of experience in core Gen AI is required.

As an AI Developer, your responsibilities will include designing, developing, and fine-tuning Large Language Models (LLMs) for various in-house applications. You will implement and optimize Retrieval-Augmented Generation (RAG) techniques to enhance AI response quality, and develop and deploy agentic AI systems capable of autonomous decision-making and task execution. Building and managing data pipelines for processing, transforming, and feeding structured and unstructured data into AI models will be part of your role, as will ensuring the scalability, performance, and security of AI-driven solutions in production environments. Collaboration with cross-functional teams, including data engineers, software developers, and product managers, is expected. You will conduct experiments and evaluations to improve AI system accuracy and efficiency while staying updated on the latest advancements in AI/ML research, open-source models, and industry best practices.

You should have strong experience in LLM fine-tuning using frameworks like Hugging Face, DeepSpeed, or LoRA/PEFT, and hands-on experience with RAG architectures, including vector databases such as Pinecone, ChromaDB, Weaviate, OpenSearch, and FAISS. Experience building AI agents with LangChain, LangGraph, CrewAI, AutoGPT, or similar frameworks is preferred. Proficiency in Python and deep learning frameworks like PyTorch or TensorFlow is necessary, as is experience with Python web frameworks such as FastAPI, Django, or Flask. You should also have experience designing and managing data pipelines using tools like Apache Airflow, Kafka, or Spark. Knowledge of cloud platforms (AWS/GCP/Azure) and containerization technologies (Docker, Kubernetes) is essential. Familiarity with LLM APIs (OpenAI, Anthropic, Mistral, Cohere, Llama, etc.) and their integration into applications is a plus, along with a strong understanding of vector search, embedding models, and hybrid retrieval techniques. Experience optimizing inference and serving AI models in real-time production systems is beneficial, as is experience with multi-modal AI (text, image, audio), familiarity with privacy-preserving AI techniques and responsible AI frameworks, and an understanding of MLOps best practices, including model versioning, monitoring, and deployment automation.

Skills: PyTorch, RAG architectures, OpenSearch, Weaviate, Docker, LLM fine-tuning, ChromaDB, Apache Airflow, LoRA, Python, hybrid retrieval techniques, Django, GCP, CrewAI, OpenAI, Hugging Face, Gen AI, Pinecone, FAISS, AWS, AutoGPT, embedding models, Flask, FastAPI, LLM APIs, DeepSpeed, vector search, PEFT, LangChain, Azure, Spark, Kubernetes, TensorFlow, real-time production systems, LangGraph, Kafka.
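Among the vector stores listed, FAISS is the simplest to sketch. A minimal example of the dense-retrieval step behind RAG, with random vectors standing in for real embedding-model outputs:

```python
# A minimal FAISS sketch: index document embeddings, then fetch nearest
# neighbors for a query embedding. Vectors here are random placeholders.
import faiss
import numpy as np

dim = 384  # typical sentence-embedding dimensionality
vectors = np.random.rand(1000, dim).astype("float32")  # pretend document embeddings

index = faiss.IndexFlatL2(dim)  # exact L2 search; fine at this scale
index.add(vectors)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)  # top-5 nearest documents
print(ids[0], distances[0])
```

In a real pipeline the returned ids map back to text chunks, which are then packed into the LLM prompt; hybrid retrieval adds a keyword-search score alongside the vector distance.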

Posted 1 month ago

Apply

10.0 - 14.0 years

4 - 8 Lacs

Bengaluru, Karnataka, India

On-site

The Role:
We are looking for a Machine Learning Engineer to join our Models and Applications team. If the challenge of distributed training of large models on large numbers of GPUs excites you, and you are passionate about improving training efficiency and enjoy innovating and coming up with new ideas, then this role is for you. You will be part of a world-class team focused on addressing the challenges of training generative AI.

The Person:
The ideal candidate has experience with distributed training pipelines, is knowledgeable about distributed training algorithms (data parallel, tensor parallel, pipeline parallel, ZeRO; a minimal ZeRO sketch follows this posting), and is familiar with training large models.

Key Responsibilities:
- Train large models to convergence on AMD GPUs.
- Improve end-to-end training pipeline performance.
- Optimize the distributed training pipeline and algorithms to scale out.
- Contribute your changes to open source.
- Stay up to date with the latest training algorithms.
- Influence the direction of the AMD AI platform.
- Collaborate across teams with various groups and stakeholders.

Preferred Experience:
- 10+ years of experience.
- Experience with ML frameworks such as PyTorch, JAX, or TensorFlow.
- Experience with distributed training and distributed training frameworks such as DeepSpeed.
- Experience with LLMs or vision models, especially large models, is a plus.
- Excellent Python programming skills, including debugging, profiling, and performance analysis.
- Experience with ML pipelines.
- Strong communication and problem-solving skills.

Academic Credentials:
A master's degree in computer science, artificial intelligence, machine learning, or a related field.
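ZeRO, named alongside data/tensor/pipeline parallelism, shards optimizer state and gradients across data-parallel ranks rather than replicating them. A minimal DeepSpeed sketch; the pared-down config and toy model are assumptions, and a real run launches the script with the `deepspeed` CLI across GPUs:

```python
# A minimal DeepSpeed ZeRO stage-2 sketch with a toy model.
import torch
import deepspeed

model = torch.nn.Linear(512, 512)
ds_config = {
    "train_batch_size": 8,
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(8, 512, device=engine.device, dtype=torch.half)
loss = engine(x).sum()   # toy loss for illustration
engine.backward(loss)    # DeepSpeed scales and all-reduces gradients
engine.step()
```

Stage 1 shards only optimizer state, stage 2 adds gradients, and stage 3 also shards the parameters themselves, trading communication for memory at each step.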

Posted 1 month ago

Apply

2.0 - 6.0 years

0 Lacs

Surat, Gujarat

On-site

The primary responsibility of this role is to design, develop, and implement cutting-edge image and video generation systems leveraging deep learning models. You will take the lead in exploring and prototyping diffusion, GAN, and transformer-based architectures for generative tasks. Your expertise will be instrumental in optimizing models for quality, speed, and scalability through accelerated compute technologies such as CUDA and TensorRT. Collaboration with cross-functional teams, including Product, Design, and Frontend, will be essential to seamlessly integrate AI pipelines into production applications and platforms. Additionally, you will contribute to system architecture, ensuring reproducibility, versioning, and model evaluation, while staying updated on the latest advancements in generative AI to facilitate the transition from research and development to production.

To excel in this role, you should possess a minimum of 2 years of hands-on experience in AI/ML with a strong emphasis on generative models. Your track record should include practical experience with video generation models like Sora, Gen-2 by Runway, Synthesia, or custom pipelines. A solid background in image generation using diffusion models (e.g., Stable Diffusion, DALL-E, Imagen) or GANs (e.g., StyleGAN2/3) is essential. Proficiency in Python and deep learning libraries such as PyTorch, TensorFlow, or JAX is required, along with experience training large-scale models using multi-GPU setups like DDP, DeepSpeed, or Hugging Face Accelerate. A sound understanding of computer vision, image processing, and neural rendering techniques is crucial, as are practical skills in model fine-tuning and related methodologies like LoRA/PEFT, ControlNet, and DreamBooth.

Preferred tools and frameworks for this role include Stable Diffusion, DALL-E, MidJourney, Sora, Gen-2, VQ-GAN, Pix2Pix, CycleGAN, AnimateDiff, ControlNet, T2I-Adapter, VideoCrafter, Pika Labs, ZeroScope, and ModelScope. Proficiency in FastAPI, Flask, or gRPC for model serving and Streamlit, Gradio, or React for rapid prototyping is advantageous. Experience with cloud platforms such as AWS, GCP, or Azure, particularly with GPU instances, and serving models using TorchServe, NVIDIA Triton, or Vertex AI will be beneficial in ensuring scalable model deployment.

This is a full-time position with a flexible schedule and a day shift from Monday to Friday. The ideal candidate will have a minimum of 2 years of experience in machine learning. The work location is in person, and the expected start date is 01/08/2025.
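For the diffusion side of this stack, the diffusers library is the usual starting point. A minimal text-to-image sketch, with an illustrative model id and prompt, assuming a CUDA GPU:

```python
# A minimal Stable Diffusion sketch via diffusers; model id and prompt are
# placeholders, and fp16 assumes a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe(
    "a watercolor painting of a mountain at dawn",
    num_inference_steps=30,  # fewer steps trade quality for speed
).images[0]
image.save("out.png")
```

Fine-tuning methods the posting names (LoRA, DreamBooth, ControlNet) all build on this same pipeline object, swapping or augmenting its UNet weights.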

Posted 1 month ago

Apply

12.0 - 14.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever.

Principal / Senior Systems Performance Engineer

Micron Data Center and Client Workload Engineering in Hyderabad, India, is seeking a senior/principal engineer to join our dynamic team. The successful candidate will primarily contribute to the ML development, ML DevOps, and HBM programs in the data center by analyzing how AI/ML workloads perform on the latest MU-HBM, Micron main memory, expansion memory, and near memory (HBM/LP) solutions; conducting competitive analysis; showcasing the benefits that workloads see with MU-HBM's capacity, bandwidth, and thermals; contributing to marketing collateral; and extracting AI/ML workload traces to help optimize future HBM designs.

Job Responsibilities:
The job responsibilities include but are not limited to the following:
- Design, implement, and maintain scalable and reliable ML infrastructure and pipelines.
- Collaborate with data scientists and ML engineers to deploy machine learning models into production environments.
- Automate and optimize ML workflows, including data preprocessing, model training, evaluation, and deployment.
- Monitor and manage the performance, reliability, and scalability of ML systems.
- Troubleshoot and resolve issues related to ML infrastructure and deployments.
- Implement and manage distributed training and inference solutions to enhance model performance and scalability.
- Utilize DeepSpeed, TensorRT, and vLLM for optimizing and accelerating AI inference and training processes.
- Understand key considerations for ML models such as transformer architectures, precision, quantization, distillation, attention span and KV cache, MoE, etc.
- Build workload memory access traces from AI models.
- Study system balance ratios for DRAM to HBM in terms of capacity and bandwidth to understand and model TCO.
- Study data movement between CPU, GPU, and the associated memory subsystems (DDR, HBM) in heterogeneous system architectures via connectivity such as PCIe/NVLink/Infinity Fabric to understand the bottlenecks in data movement for different workloads.
- Develop an automated testing framework through scripting.
- Engage customers and present at conferences to showcase findings and develop whitepapers.

Requirements:
- Strong programming skills in Python and familiarity with ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
- Experience in data preparation: cleaning, splitting, and transforming data for training, validation, and testing.
- Proficiency in model training and development: creating and training machine learning models.
- Expertise in model evaluation: testing models to assess their performance.
- Skills in model deployment: launching servers, live inference, batched inference.
- Experience with AI inference and distributed training techniques.
- Strong foundation in GPU and CPU processor architecture.
- Familiarity with and knowledge of server system memory (DRAM).
- Strong experience with benchmarking and performance analysis (a toy timing harness follows this posting).
- Strong software development skills using leading scripting and programming languages and technologies (Python, CUDA, C, C++).
- Familiarity with PCIe and NVLink connectivity.

Preferred Qualifications:
- Experience in quickly building AI workflows: building pipelines and model workflows to design, deploy, and manage consistent model delivery.
- Ability to easily deploy models anywhere: using managed endpoints to deploy models and workflows across accessible CPU and GPU machines.
- Understanding of MLOps: the overarching concept covering the core tools, processes, and best practices for end-to-end machine learning system development and operations in production.
- Knowledge of GenAIOps: extending MLOps to develop and operationalize generative AI solutions, including the management of and interaction with a foundation model.
- Familiarity with LLMOps: focused specifically on developing and productionizing LLM-based solutions.
- Experience with RAGOps: focusing on the delivery and operation of RAGs, considered the ultimate reference architecture for generative AI and LLMs.
- Data management: collect, ingest, store, process, and label data for training and evaluation; configure role-based access control; dataset search, browsing, and exploration; data provenance tracking, data logging, dataset versioning, metadata indexing, data quality validation, dataset cards, and dashboards for data visualization.
- Workflow and pipeline management: work with cloud resources or a local workstation; connect data preparation, model training, model evaluation, model optimization, and model deployment steps into an end-to-end automated and scalable workflow combining data and compute.
- Model management: train, evaluate, and optimize models for production; store and version models along with their model cards in a centralized model registry; assess model risks and ensure compliance with standards.
- Experiment management and observability: track and compare different machine learning model experiments, including changes in training data, models, and hyperparameters; automatically search the space of possible model architectures and hyperparameters for a given model architecture; analyze model performance during inference; monitor model inputs and outputs for concept drift.
- Synthetic data management: extend data management with a new native generative AI capability; generate synthetic training data through domain randomization to increase transfer learning capabilities; declaratively define and generate edge cases to evaluate, validate, and certify model accuracy and robustness.
- Embedding management: represent data samples of any modality as dense multi-dimensional embedding vectors; generate, store, and version embeddings in a vector database; visualize embeddings for improvised exploration; find relevant contextual information through vector similarity search for RAGs.

Education:
Bachelor's or higher (with 12+ years of experience) in Computer Science or a related field.
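A trivial example of the kind of benchmarking harness this role implies: timing batched generation to compute tokens per second. The model, batch size, and prompt are placeholders; real workload analysis would layer memory-trace capture and hardware counters on top.

```python
# A minimal throughput-measurement sketch for batched LLM inference.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompts = ["The data center of the future"] * 8  # stand-in batch
inputs = tok(prompts, return_tensors="pt", padding=True)

start = time.perf_counter()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32,
                         pad_token_id=tok.eos_token_id)
elapsed = time.perf_counter() - start

new_tokens = out.numel() - inputs["input_ids"].numel()
print(f"{new_tokens / elapsed:.1f} generated tokens/s for batch of {len(prompts)}")
```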

Posted 1 month ago

Apply