Get alerts for new jobs matching your selected skills, preferred locations, and experience range. Manage Job Alerts
2.0 - 10.0 years
0 Lacs
coimbatore, tamil nadu
On-site
You should have 3 to 10 years of experience in AI development and be located in Coimbatore. Immediate joiners are preferred. A minimum of 2 years of experience in core Gen AI is required. As an AI Developer, your responsibilities will include designing, developing, and fine-tuning Large Language Models (LLMs) for various in-house applications. You will implement and optimize Retrieval-Augmented Generation (RAG) techniques to enhance AI response quality. Additionally, you will develop and deploy Agentic AI systems capable of autonomous decision-making and task execution. Building and managing data pipelines for processing, transforming, and feeding structured/unstructured data into AI models will be part of your role. It is essential to ensure scalability, performance, and security of AI-driven solutions in production environments. Collaboration with cross-functional teams, including data engineers, software developers, and product managers, is expected. You will conduct experiments and evaluations to improve AI system accuracy and efficiency while staying updated with the latest advancements in AI/ML research, open-source models, and industry best practices. You should have strong experience in LLM fine-tuning using frameworks like Hugging Face, DeepSpeed, or LoRA/PEFT. Hands-on experience with RAG architectures, including vector databases such as Pinecone, ChromaDB, Weaviate, OpenSearch, and FAISS, is required. Experience in building AI agents using LangChain, LangGraph, CrewAI, AutoGPT, or similar frameworks is preferred. Proficiency in Python and deep learning frameworks like PyTorch or TensorFlow is necessary. Experience in Python web frameworks such as FastAPI, Django, or Flask is expected. You should also have experience in designing and managing data pipelines using tools like Apache Airflow, Kafka, or Spark. Knowledge of cloud platforms (AWS/GCP/Azure) and containerization technologies (Docker, Kubernetes) is essential. Familiarity with LLM APIs (OpenAI, Anthropic, Mistral, Cohere, Llama, etc.) and their integration in applications is a plus. A strong understanding of vector search, embedding models, and hybrid retrieval techniques is required. Experience with optimizing inference and serving AI models in real-time production systems is beneficial. Experience with multi-modal AI (text, image, audio) and familiarity with privacy-preserving AI techniques and responsible AI frameworks are desirable. Understanding of MLOps best practices, including model versioning, monitoring, and deployment automation, is a plus. Skills required for this role include PyTorch, RAG architectures, OpenSearch, Weaviate, Docker, LLM fine-tuning, ChromaDB, Apache Airflow, LoRA, Python, hybrid retrieval techniques, Django, GCP, CrewAI, OpenAI, Hugging Face, Gen AI, Pinecone, FAISS, AWS, AutoGPT, embedding models, Flask, FastAPI, LLM APIs, DeepSpeed, vector search, PEFT, LangChain, Azure, Spark, Kubernetes, AI Gen, TensorFlow, real-time production systems, LangGraph, and Kafka.,
Posted 14 hours ago
2.0 - 6.0 years
0 Lacs
surat, gujarat
On-site
The primary responsibility of this role is to design, develop, and implement cutting-edge image and video generation systems leveraging deep learning models. You will take the lead in exploring and prototyping diffusion, GAN, and transformer-based architectures for generative tasks. Your expertise will be instrumental in optimizing models for quality, speed, and scalability through accelerated compute technologies such as CUDA and TensorRT. Collaboration with cross-functional teams including Product, Design, and Frontend will be essential to seamlessly integrate AI pipelines into production applications and platforms. Additionally, you will play a key role in contributing to system architecture, ensuring reproducibility, versioning, and model evaluation, while also staying updated on the latest advancements in generative AI to facilitate the transition from research and development to production. To excel in this role, you should possess a minimum of 2 years of hands-on experience in the field of AI/ML with a strong emphasis on generative models. Your track record should include practical experience with video generation models like Sora, Gen-2 by Runway, Synthesia, or custom pipelines. A solid background in image generation using Diffusion Models (e.g., Stable Diffusion, DALLE, Imagen) or GANs (e.g., StyleGAN2/3) is essential. Proficiency in Python and deep learning libraries such as PyTorch, TensorFlow, or JAX is required, along with experience in training large-scale models using multi-GPU setups like DDP, DeepSpeed, or Hugging Face Accelerate. A sound understanding of computer vision, image processing, and neural rendering techniques is crucial, as well as practical skills in model fine-tuning and related methodologies like LoRA/PEFT, ControlNet, DreamBooth, and others. Preferred tools and frameworks for this role include Stable Diffusion, DALLE, MidJourney, Sora, Gen-2, VQ-GAN, Pix2Pix, CycleGAN, AnimateDiff, ControlNet, T2I-Adapter, VideoCrafter, Pika Labs, ZeroScope, and ModelScope. Proficiency in FastAPI, Flask, or gRPC for model serving and Streamlit, Gradio, or React for rapid prototyping is advantageous. Experience with cloud platforms such as AWS, GCP, or Azure, particularly with GPU instances, and serving models using TorchServe, NVIDIA Triton, or Vertex AI, will be beneficial in ensuring scalable model deployment. This is a full-time position with a flexible schedule and a day shift from Monday to Friday. The ideal candidate will have a minimum of 2 years of experience in machine learning. The work location is in person, and the expected start date is 01/08/2025.,
Posted 2 days ago
12.0 - 14.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
Our vision is to transform how the world uses information to enrich life for . Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever. Principal / Senior Systems Performance Engineer Micron Data Center and Client Workload Engineering in Hyderabad, India, is seeking a senior/principal engineer to join our dynamic team. The successful candidate will primarily contribute to the ML development, ML DevOps, HBM program in the data center by analyzing how AI/ML workloads perform on the latest MU-HBM, Micron main memory, expansion memory and near memory (HBM/LP) solutions, conduct competitive analysis, showcase the benefits that workloads see with MU-HBM's capacity / bandwidth / thermals, contribute to marketing collateral, and extract AI/ML workload traces to help optimize future HBM designs. Job Responsibilities: The Job Responsibilities include but are not limited to the following: Design, implement, and maintain scalable & reliable ML infrastructure and pipelines. Collaborate with data scientists and ML engineers to deploy machine learning models into production environments. Automate and optimize ML workflows, including data preprocessing, model training, evaluation, and deployment. Monitor and manage the performance, reliability, and scalability of ML systems. Troubleshoot and resolve issues related to ML infrastructure and deployments. Implement and manage distributed training and inference solutions to enhance model performance and scalability. Utilize DeepSpeed, TensorRT, vLLM for optimizing and accelerating AI inference and training processes. Understand key care abouts when it comes to ML models such as: transformer architectures, precision, quantization, distillation, attention span & KV cache, MoE, etc. Build workload memory access traces from AI models. Study system balance ratios for DRAM to HBM in terms of capacity and bandwidth to understand and model TCO. Study data movement between CPU, GPU and the associated memory subsystems (DDR, HBM) in heterogeneous system architectures via connectivity such as PCIe/NVLINK/Infinity Fabric to understand the bottlenecks in data movement for different workloads. Develop an automated testing framework through scripting. Customer engagements and conference presentations to showcase findings and develop whitepapers. Requirements: Strong programming skills in Python and familiarity with ML frameworks such as TensorFlow, PyTorch, or scikit-learn. Experience in data preparation: cleaning, splitting, and transforming data for training, validation, and testing. Proficiency in model training and development: creating and training machine learning models. Expertise in model evaluation: testing models to assess their performance. Skills in model deployment: launching server, live inference, batched inference Experience with AI inference and distributed training techniques. Strong foundation in GPU and CPU processor architecture Familiarity with and knowledge of server system memory (DRAM) Strong experience with benchmarking and performance analysis Strong software development skills using leading scripting, programming languages and technologies (Python, CUDA, C, C++) Familiarity with PCIe and NVLINK connectivity Preferred Qualifications: Experience in quickly building AI workflows: building pipelines and model workflows to design, deploy, and manage consistent model delivery. Ability to easily deploy models anywhere: using managed endpoints to deploy models and workflows across accessible CPU and GPU machines. Understanding of MLOps: the overarching concept covering the core tools, processes, and best practices for end-to-end machine learning system development and operations in production. Knowledge of GenAIOps: extending MLOps to develop and operationalize generative AI solutions, including the management of and interaction with a foundation model. Familiarity with LLMOps: focused specifically on developing and productionizing LLM-based solutions. Experience with RAGOps: focusing on the delivery and operation of RAGs, considered the ultimate reference architecture for generative AI and LLMs. Data management: collect, ingest, store, process, and label data for training and evaluation. Configure role-based access control dataset search, browsing, and exploration data provenance tracking, data logging, dataset versioning, metadata indexing, data quality validation, dataset cards, and dashboards for data visualization. Workflow and pipeline management: work with cloud resources or a local workstation connect data preparation, model training, model evaluation, model optimization, and model deployment steps into an end-to-end automated and scalable workflow combining data and compute. Model management: train, evaluate, and optimize models for production store and version models along with their model cards in a centralized model registry assess model risks, and ensure compliance with standards. Experiment management and observability: track and compare different machine learning model experiments, including changes in training data, models, and hyperparameters. Automatically search the space of possible model architectures and hyperparameters for a given model architecture analyze model performance during inference, monitor model inputs and outputs for concept drift. Synthetic data management: extend data management with a new native generative AI capability. Generate synthetic training data through domain randomization to increase transfer learning capabilities. Declaratively define and generate edge cases to evaluate, validate, and certify model accuracy and robustness. Embedding management: represent data samples of any modality as dense multi-dimensional embedding vectors generate, store, and version embeddings in a vector database. Visualize embeddings for improvised exploration. Find relevant contextual information through vector similarity search for RAGs. Education: Bachelor's or higher (with 12+ years of experience) in Computer Science or related field.
Posted 1 week ago
Upload Resume
Drag or click to upload
Your data is secure with us, protected by advanced encryption.
Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.
We have sent an OTP to your contact. Please enter it below to verify.
Accenture
31458 Jobs | Dublin
Wipro
16542 Jobs | Bengaluru
EY
10788 Jobs | London
Accenture in India
10711 Jobs | Dublin 2
Amazon
8660 Jobs | Seattle,WA
Uplers
8559 Jobs | Ahmedabad
IBM
7988 Jobs | Armonk
Oracle
7535 Jobs | Redwood City
Muthoot FinCorp (MFL)
6170 Jobs | New Delhi
Capgemini
6091 Jobs | Paris,France