5.0 - 7.0 years
0 Lacs
Noida, Uttar Pradesh, India
On-site
About the Role: We are seeking an experienced MLOps Engineer to lead the deployment, scaling, and performance optimization of open-source Generative AI models on cloud infrastructure. You'll work at the intersection of machine learning, DevOps, and cloud engineering to help productize and operationalize large-scale LLM and diffusion models.

Key Responsibilities:
- Design and implement scalable deployment pipelines for open-source Gen AI models (LLMs, diffusion models, etc.).
- Fine-tune and optimize models using techniques like LoRA, quantization, and distillation.
- Manage inference workloads, latency optimization, and GPU utilization.
- Build CI/CD pipelines for model training, validation, and deployment.
- Integrate observability, logging, and alerting for model and infrastructure monitoring.
- Automate resource provisioning using Terraform, Helm, or similar tools on GCP/AWS/Azure.
- Ensure model versioning, reproducibility, and rollback using tools like MLflow, DVC, or Weights & Biases.
- Collaborate with data scientists, backend engineers, and DevOps teams to ensure smooth production rollouts.

Required Skills & Qualifications:
- 5+ years of total experience in software engineering or cloud infrastructure.
- 3+ years in MLOps with direct experience deploying large Gen AI models.
- Hands-on experience with open-source models (e.g., LLaMA, Mistral, Stable Diffusion, Falcon).
- Strong knowledge of Docker, Kubernetes, and cloud compute orchestration.
- Proficiency in Python and familiarity with model-serving frameworks (e.g., FastAPI, Triton Inference Server, Hugging Face Accelerate, vLLM).
- Experience with cloud platforms (GCP preferred; AWS or Azure acceptable).
- Familiarity with distributed training, checkpointing, and model parallelism.

Good to Have:
- Experience with low-latency inference systems and token streaming architectures.
- Familiarity with cost optimization and scaling strategies for GPU-based workloads.
- Exposure to LLMOps tools (LangChain, BentoML, Ray Serve, etc.).
Why Join Us:
- Opportunity to work on cutting-edge Gen AI applications across industries.
- Collaborative team with deep expertise in AI, cloud, and enterprise software.
- Flexible work environment with a focus on innovation and impact.
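The optimization techniques this role names (LoRA, quantization, distillation) each cut serving cost in a different way. As an illustration of the simplest, here is a minimal, framework-free sketch of symmetric absmax int8 post-training quantization of a weight tensor; all names and values are illustrative, not any library's API:

```python
def quantize_int8(weights):
    """Symmetric absmax int8 quantization: map floats to [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [qi * scale for qi in q]

weights = [0.02, -1.27, 0.635, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Per-weight rounding error is bounded by scale / 2.
max_err = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

The same absmax idea underlies int8 weight quantization in production stacks; real implementations quantize per-channel and handle outliers separately.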
Posted 2 days ago
4.0 - 8.0 years
5 - 9 Lacs
Bengaluru, Karnataka, India
On-site
We are looking for a highly motivated and skilled AI Software Architect to join our team. You will work with a team of Software Engineers to optimize DL models, libraries, and applications for inference and training on Instinct GPUs in both on-prem and cloud environments. Candidates should be strong in Python and/or C++ and GPU programming, and should have experience analyzing and optimizing the performance of AI software, understanding hardware bottlenecks, and harnessing performance to get close to the roofline. Must be self-motivated and able to work well within a team environment.

KEY QUALIFICATIONS:
- Strong programming skills in C++ and Python.
- Strong development experience in at least one major DL framework such as vLLM, PyTorch, or TensorFlow, for inference, fine-tuning, and/or training on multi-node clusters.
- Solid experience developing kernels, quantizing models, and hyperparameter optimization.
- Experience developing software and system-level performance optimizations, with a solid understanding of architecture and roofline performance on GPUs.
- MS or PhD in Computer Science, Computer Engineering, or a related equivalent, with years of related experience.
- Experience with open-source software development, including collaboration with community maintainers and submitting contributions, is a plus.
- Development experience in CK, Triton, and other GPU programming is a plus.
- Publications in reputed peer-reviewed ML conferences/journals are a plus.
- Excellent analytical and problem-solving skills for root-causing and addressing performance issues.
- Ability to work independently and as part of a team.
- Willingness to learn skills, tools, and methods to advance the quality, consistency, and timeliness of AMD AI products.
PREFERRED EXPERIENCE:
- Expertise in profiling tools across the AI SW stack (torch.profiler, ROCm profiler, VTune, Nsight).
- Experience implementing and optimizing parallel methods on GPU accelerators (NCCL/RCCL, OpenMP, MPI).
- Performance analysis skills for GPUs.
- Experience providing clear and timely communication on status and other key aspects of the project to the leadership team.
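The "get close to the roofline" goal above has a simple analytical form: a kernel's attainable throughput is capped by min(peak compute, arithmetic intensity × peak memory bandwidth). A sketch of that model, with illustrative hardware numbers rather than any specific GPU's specs:

```python
def roofline_flops(peak_flops, peak_bw_bytes, flops, bytes_moved):
    """Attainable FLOP/s under the roofline model."""
    intensity = flops / bytes_moved        # FLOPs per byte moved
    return min(peak_flops, intensity * peak_bw_bytes)

# Illustrative accelerator: 100 TFLOP/s peak compute, 2 TB/s memory bandwidth.
peak, bw = 100e12, 2e12
# A memory-bound kernel doing 1 FLOP per 4-byte element read is capped well
# below peak: 0.25 FLOPs/byte * 2 TB/s = 0.5 TFLOP/s.
attainable = roofline_flops(peak, bw, flops=1e9, bytes_moved=4e9)
```

Comparing a kernel's measured FLOP/s against this bound shows whether the next optimization should target data movement or compute.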
Posted 4 days ago
1.0 - 6.0 years
4 - 8 Lacs
Hyderabad, Telangana, India
On-site
THE ROLE: As a Senior Software Developer, you will drive both GPU kernel-level optimization and distributed software efforts for large-scale AI workloads. This is a technical leadership role with direct influence over critical software components in AMD's AI stack. You'll architect and implement optimized compute kernels, guide software teams through the full product lifecycle, and work closely with internal and external partners to deploy scalable, high-performance solutions.

THE PERSON: We're looking for a highly skilled, deep systems thinker who thrives in complex problem domains involving parallel computing, GPU architecture, and AI model execution. You are confident leading software architecture decisions and know how to translate business goals into robust, optimized software solutions. You're just as comfortable writing performance-critical code as you are guiding agile development teams across product lifecycles. Ideal candidates have a strong balance of low-level programming, distributed systems knowledge, and leadership experience, paired with a passion for AI performance at scale.

KEY RESPONSIBILITIES:
- GPU Kernel Optimization: Develop and optimize GPU kernels to accelerate inference and training of large machine learning models while ensuring numerical accuracy and runtime efficiency.
- Multi-GPU and Multi-Node Scaling: Architect and implement strategies for distributed training/inference across multi-GPU/multi-node environments using model/data parallelism techniques.
- Performance Profiling: Identify bottlenecks and performance limitations using profiling tools; propose and implement optimizations to improve hardware utilization.
- Parallel Computing: Design and implement multi-threaded and synchronized compute techniques for scalable execution on modern GPU architectures.
- Benchmarking & Testing: Build robust benchmarking and validation infrastructure to assess performance, reliability, and scalability of deployed software.
- Documentation & Best Practices: Produce technical documentation and share architectural patterns, code optimization tips, and reusable components.

PREFERRED EXPERIENCE:

Software Team Leadership
- Collaboration with customers and business units to define deliverables and roadmaps.
- Interfacing with executive leadership on program progress and strategic planning.
- Experience in production-level software deployment (e.g., upstreaming to open source, commercial rollouts).

Software Architecture
- Deep experience with GPU kernel optimization in modern C++ (C++11/17/20).
- Working knowledge of frameworks such as PyTorch, vLLM, CUTLASS, and Kokkos.
- Practical expertise in CPU/GPU architecture and system-level performance tuning.
- Proficiency in Python scripting and infrastructure automation.
- Application of software design patterns and industry-standard engineering practices.

GPU & Low-Level Optimization
- Hands-on experience with CUDA and low-level GPU programming.
- Kernel optimization in assembly and tight loops for latency-sensitive code.
- Proficiency with performance profiling tools (Nsight, VTune, perf, etc.).
- Experience with distributed computing strategies in AI environments (multi-GPU, NCCL, MPI).
- Strong debugging, problem-solving, and performance tuning skills in complex systems.

ACADEMIC CREDENTIALS: Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, Computer Science, or a related technical field. Advanced degrees or published work in HPC, GPU computing, or AI systems is a plus.
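The benchmarking and validation infrastructure described above usually reduces to a warmup-then-measure loop with percentile reporting. A minimal sketch, using a pure-Python workload as a stand-in for a GPU kernel launch (all names illustrative):

```python
import time
import statistics

def benchmark(fn, warmup=3, iters=20):
    """Time fn() after a warmup phase; return (median, p95) latency in ms."""
    for _ in range(warmup):              # warm caches, JITs, clocks
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return statistics.median(samples), samples[int(0.95 * len(samples)) - 1]

med, p95 = benchmark(lambda: sum(i * i for i in range(10_000)))
```

For real GPU kernels the same loop needs a device synchronization before each timestamp, since launches are asynchronous.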
Posted 5 days ago
5.0 - 10.0 years
0 - 3 Lacs
Bengaluru, Mumbai (All Areas)
Hybrid
Role & responsibilities:
- AI/ML Python developers with DevOps experience and 2-3 deployments (mandatory).
- Machine learning model experience.
- Experience with either AWS services (Bedrock, SageMaker, EKS, Lambda) or Azure services is mandatory.
- Candidates will work on Gen AI projects.

Compensation: 5-8 yrs: 17.5 LPA; 8+ yrs: 21 LPA. Locations: Bangalore & Mumbai.
Posted 1 week ago
3.0 - 5.0 years
8 - 18 Lacs
Bengaluru, Mumbai (All Areas)
Hybrid
Primary Responsibilities:
- Implement and manage AIOps platforms for intelligent monitoring, alerting, anomaly detection, and root cause analysis (RCA).
- End-to-end knowledge of vLLM model hosting and inferencing.
- Advanced knowledge of public cloud platforms such as AWS and Azure.
- Build and maintain machine learning pipelines and models for predictive maintenance, anomaly detection, and noise reduction.
- Experience in production support and real-time issue handling.
- Design dashboards and visualizations to provide operational insights to stakeholders.
- Working knowledge of Bedrock, SageMaker, EKS, Lambda, etc.
- 1-2 years of experience with Jenkins and GoCD for build/deploy pipelines.
- Hands-on experience with open-source and self-hosted model APIs using SDKs.
- Drive data-driven decisions by analyzing operational data and generating reports on system health, performance, and availability.
- Basic knowledge of KServe and Ray Serve inferencing.
- Good knowledge of high-level scaling using Karpenter, KEDA, and system-based vertical/horizontal scaling.
- Strong knowledge of the Linux operating system (Linux certification a plus).
- Previous experience with Helm chart deployments and Terraform template and module creation is highly recommended.

Secondary Responsibilities:
- Proven experience in AIOps and DevOps, with a strong background in cloud technologies (AWS, Azure, Google Cloud).
- Proficiency in tools such as Kubeflow, KServe, ONNX, and containerization technologies (Docker, Kubernetes).
- Experience with enterprise-level infrastructure, including tools like Terraform and Helm, and on-prem server hosting.
- Previous experience in fintech or AI-based tech companies is highly desirable.
- Ability to manage workloads effectively in a production environment.
- Excellent communication and collaboration skills, with a strong focus on cross-functional teamwork.
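The anomaly detection and noise reduction duties above often start with something as simple as a rolling z-score over a metric stream: flag a point only when it deviates far from its trailing window. A minimal sketch (window size and threshold are illustrative):

```python
import statistics

def anomalies(series, window=10, threshold=3.0):
    """Flag indices more than `threshold` std-devs from the trailing window mean."""
    flagged = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mu, sigma = statistics.fmean(hist), statistics.pstdev(hist)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

latencies = [100, 102, 99, 101, 100, 98, 103, 100, 101, 99, 500, 100]
spikes = anomalies(latencies)   # flags index 10, the 500 ms spike
```

Production AIOps platforms layer seasonality handling and deduplication on top, but the trailing-window comparison is the core of the alert-noise reduction.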
Posted 1 week ago
3.0 - 7.0 years
0 - 3 Lacs
Bengaluru, Mumbai (All Areas)
Hybrid
Role & responsibilities

Primary Responsibilities:
- Implement and manage AIOps platforms for intelligent monitoring, alerting, anomaly detection, and root cause analysis (RCA).
- End-to-end knowledge of vLLM model hosting and inferencing.
- Advanced knowledge of public cloud platforms such as AWS and Azure.
- Build and maintain machine learning pipelines and models for predictive maintenance, anomaly detection, and noise reduction.
- Experience in production support and real-time issue handling.
- Design dashboards and visualizations to provide operational insights to stakeholders.
- Working knowledge of Bedrock, SageMaker, EKS, Lambda, etc.
- 1-2 years of experience with Jenkins and GoCD for build/deploy pipelines.
- Hands-on experience with open-source and self-hosted model APIs using SDKs.
- Drive data-driven decisions by analyzing operational data and generating reports on system health, performance, and availability.
- Basic knowledge of KServe and Ray Serve inferencing.
- Good knowledge of high-level scaling using Karpenter, KEDA, and system-based vertical/horizontal scaling.
- Strong knowledge of the Linux operating system (Linux certification a plus).
- Previous experience with Helm chart deployments and Terraform template and module creation is highly recommended.

Secondary Responsibilities:
- Proven experience in AIOps and DevOps, with a strong background in cloud technologies (AWS, Azure, Google Cloud).
- Proficiency in tools such as Kubeflow, KServe, ONNX, and containerization technologies (Docker, Kubernetes).
- Experience with enterprise-level infrastructure, including tools like Terraform and Helm, and on-prem server hosting.
- Previous experience in fintech or AI-based tech companies is highly desirable.
- Ability to manage workloads effectively in a production environment.
- Excellent communication and collaboration skills, with a strong focus on cross-functional teamwork.
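The horizontal-scaling knowledge this posting asks for (KEDA, the Kubernetes HPA, and similar autoscalers) centers on one formula: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A sketch of that decision, with illustrative numbers and bounds (not KEDA's actual code):

```python
import math

def desired_replicas(current, metric_value, target, min_r=1, max_r=20):
    """HPA-style scaling decision: ceil(current * metric / target), clamped."""
    desired = math.ceil(current * metric_value / target)
    return max(min_r, min(max_r, desired))

# 4 pods averaging 180 req/s each against a 100 req/s target -> scale to 8.
up = desired_replicas(4, 180, 100)
# Load drops to 20 req/s per pod -> scale down to 1.
down = desired_replicas(4, 20, 100)
```

Real autoscalers add stabilization windows and tolerance bands around this formula to avoid flapping; Karpenter applies the analogous logic at the node level.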
Posted 1 week ago
2.0 - 6.0 years
0 Lacs
karnataka
On-site
Tata Electronics Pvt. Ltd. is a key global player in the electronics manufacturing industry, specializing in Electronics Manufacturing Services, Semiconductor Assembly & Test, Semiconductor Foundry, and Design Services. Established in 2020 by the Tata Group, the company's primary objective is to provide integrated solutions to global customers across the electronics and semiconductor value chain.

We are looking for an AI Core Developer to join our R&D team in Bangalore. This role is centered on fundamental AI research, algorithm development, and model pre-training, focusing on innovation rather than application engineering. As an AI Core Developer, you will be involved in cutting-edge AI research, creating novel algorithms, and constructing foundation models from scratch. This position is ideal for individuals with a strong background in pre-training methodologies and algorithm development who aspire to contribute to core AI advancements.

Your responsibilities will include developing and implementing innovative machine learning algorithms for various AI systems, designing pre-training pipelines for large models, prototyping new AI architectures, collaborating with research scientists and engineers, and contributing to technical publications.

The ideal candidate should hold a Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field, with 2-4 years of hands-on experience in AI/ML development. Proficiency in Python and C/C++, knowledge of deep learning frameworks such as PyTorch and TensorFlow, and experience with model pre-training are essential requirements. Strong mathematical skills, familiarity with transformer architectures and attention mechanisms, and an understanding of distributed computing are also key competencies. Preferred qualifications include advanced experience in multimodal AI systems, research contributions to top-tier AI conferences, and expertise in specific AI domains like healthcare or finance.
The position is based in Bangalore, India, with a hybrid work arrangement and occasional travel for conferences and collaborations.
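For the transformer and attention-mechanism familiarity the posting above calls for, the core computation is scaled dot-product attention, softmax(QKᵀ/√d)V. A dependency-free sketch for a single head with toy dimensions (illustrative only, not a training-grade implementation):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]   # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention for one head; Q, K, V are lists of d-dim rows."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out = attention(Q, K, V)
# The query aligns with the first key, so the output leans toward V[0].
```

Multi-head attention runs this routine per head on projected slices of the inputs and concatenates the results.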
Posted 1 week ago
3.0 - 6.0 years
0 - 3 Lacs
Bengaluru, Mumbai (All Areas)
Work from Office
Role & responsibilities:
- Implement and manage AIOps platforms for intelligent monitoring, alerting, anomaly detection, and root cause analysis (RCA).
- End-to-end knowledge of vLLM model hosting and inferencing.
- Advanced knowledge of public cloud platforms such as AWS and Azure.
- Build and maintain machine learning pipelines and models for predictive maintenance, anomaly detection, and noise reduction.
- Experience in production support and real-time issue handling.
- Design dashboards and visualizations to provide operational insights to stakeholders.
- Working knowledge of Bedrock, SageMaker, EKS, Lambda, etc.
- 1-2 years of experience with Jenkins and GoCD for build/deploy pipelines.
- Hands-on experience with open-source and self-hosted model APIs using SDKs.
- Drive data-driven decisions by analyzing operational data and generating reports on system health, performance, and availability.
- Basic knowledge of KServe and Ray Serve inferencing.
- Good knowledge of high-level scaling using Karpenter, KEDA, and system-based vertical/horizontal scaling.
- Strong knowledge of the Linux operating system (Linux certification a plus).
- Previous experience with Helm chart deployments and Terraform template and module creation is highly recommended.

Secondary Responsibilities:
- Proven experience in AIOps and DevOps, with a strong background in cloud technologies (AWS, Azure, Google Cloud).
- Proficiency in tools such as Kubeflow, KServe, ONNX, and containerization technologies (Docker, Kubernetes).
- Experience with enterprise-level infrastructure, including tools like Terraform and Helm, and on-prem server hosting.
- Previous experience in fintech or AI-based tech companies is highly desirable.
- Ability to manage workloads effectively in a production environment.
- Excellent communication and collaboration skills, with a strong focus on cross-functional teamwork.
Posted 1 week ago
3.0 - 7.0 years
0 Lacs
hyderabad, telangana
On-site
You will be responsible for designing, building, and deploying scalable NLP/ML models for real-world applications. Your role will involve fine-tuning and optimizing Large Language Models (LLMs) using techniques like LoRA, PEFT, or QLoRA. You will work with transformer-based architectures such as BERT, GPT, LLaMA, and T5, and develop GenAI applications using frameworks like LangChain, Hugging Face, OpenAI API, or RAG (Retrieval-Augmented Generation). Writing clean, efficient, and testable Python code will be a crucial part of your tasks. Collaboration with data scientists, software engineers, and stakeholders to define AI-driven solutions will also be an essential aspect of your work. Additionally, you will evaluate model performance and iterate rapidly based on user feedback and metrics. The ideal candidate should have a minimum of 3 years of experience in Python programming with a strong understanding of ML pipelines. A solid background and experience in NLP, including text preprocessing, embeddings, NER, and sentiment analysis, are required. Proficiency in ML libraries such as scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers, and spaCy is essential. Experience with GenAI concepts, including prompt engineering, LLM fine-tuning, and vector databases like FAISS and ChromaDB, will be beneficial. Strong problem-solving and communication skills are highly valued, along with the ability to learn new tools and work both independently and collaboratively in a fast-paced environment. Attention to detail and accuracy is crucial for this role. Preferred skills include theoretical knowledge or experience in Data Engineering, Data Science, AI, ML, RPA, or related domains. Certification in Business Analysis or Project Management from a recognized institution is a plus. Experience in working with agile methodologies such as Scrum or Kanban is desirable. 
Additional experience in deep learning and transformer architectures and models, prompt engineering, training LLMs, and GenAI pipeline preparation will be advantageous. Practical experience integrating LLMs like ChatGPT, Gemini, Claude, etc., with context-aware capabilities using RAG or fine-tuned models is a plus. Knowledge of model evaluation and alignment, as well as metrics to calculate model accuracy, is beneficial. Data curation from sources for RAG preprocessing and development of LLM pipelines is an added advantage. Proficiency in scalable deployment and logging tooling, including skills like Flask, Django, FastAPI, APIs, Docker containerization, and Kubeflow, is preferred. Familiarity with LangChain, LlamaIndex, vLLM, Hugging Face Transformers, LoRA, and a basic understanding of cost-to-performance tradeoffs will be beneficial for this role.
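The retrieval step behind the RAG and vector-database skills above (FAISS, ChromaDB) reduces to nearest-neighbor search by cosine similarity over embeddings. A toy, framework-free sketch, where the embedding vectors are illustrative stand-ins for real model outputs:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, k=2):
    """Return the top-k document ids ranked by cosine similarity to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

store = [("doc_a", [0.9, 0.1, 0.0]),
         ("doc_b", [0.0, 1.0, 0.2]),
         ("doc_c", [0.8, 0.3, 0.1])]
top = retrieve([1.0, 0.2, 0.0], store, k=2)   # doc_a and doc_c are closest
```

A real vector database replaces the linear scan with an approximate index (e.g., HNSW) so retrieval stays fast at millions of documents; the similarity measure is the same.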
Posted 2 weeks ago
5.0 - 9.0 years
0 Lacs
kochi, kerala
On-site
As a highly skilled Senior Machine Learning Engineer, you will leverage your expertise in Deep Learning, Large Language Models (LLMs), and MLOps/LLMOps to design, optimize, and deploy cutting-edge AI solutions. Your responsibilities will include developing and scaling deep learning models, fine-tuning LLMs (e.g., GPT, Llama), and implementing robust deployment pipelines for production environments. You will be responsible for designing, training, fine-tuning, and optimizing deep learning models (CNNs, RNNs, Transformers) for various applications such as NLP, computer vision, or multimodal tasks. Additionally, you will fine-tune and adapt LLMs for domain-specific tasks like text generation, summarization, and semantic similarity. Experimenting with RLHF (Reinforcement Learning from Human Feedback) and alignment techniques will also be part of your role. In the realm of Deployment & Scalability (MLOps/LLMOps), you will build and maintain end-to-end ML pipelines for training, evaluation, and deployment. Deploying LLMs and deep learning models in production environments using frameworks like FastAPI, vLLM, or TensorRT is crucial. You will optimize models for low-latency, high-throughput inference and implement CI/CD workflows for ML systems using tools like MLflow and Kubeflow. Monitoring & Optimization will involve setting up logging, monitoring, and alerting for model performance metrics such as drift, latency, and accuracy. Collaborating with DevOps teams to ensure scalability, security, and cost-efficiency of deployed models will also be part of your responsibilities. The ideal candidate will possess 5-7 years of hands-on experience in Deep Learning, NLP, and LLMs. Strong proficiency in Python, PyTorch, TensorFlow, Hugging Face Transformers, and LLM frameworks is essential. 
Experience with model deployment tools like Docker, Kubernetes, and FastAPI, along with knowledge of MLOps/LLMOps best practices and familiarity with cloud platforms (AWS, GCP, Azure), are required qualifications. Preferred qualifications include contributions to open-source LLM projects, showcasing your commitment to advancing the field of machine learning.
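The monitoring duties this role describes (latency, drift, accuracy) typically report tail percentiles rather than averages, since p95/p99 latency is what SLOs are written against. A minimal sketch of nearest-rank percentile computation over an inference-latency log (values illustrative):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample covering p% of the data."""
    xs = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(xs)))
    return xs[rank - 1]

latencies_ms = [12, 11, 13, 12, 14, 11, 95, 12, 13, 12]
p50 = percentile(latencies_ms, 50)   # 12 ms
p95 = percentile(latencies_ms, 95)   # 95 ms: the tail exposes the outlier
```

The gap between p50 and p95 here is exactly the kind of signal a latency alert should fire on, and what the mean would have hidden.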
Posted 2 weeks ago
12.0 - 14.0 years
0 Lacs
Hyderabad, Telangana, India
On-site
Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate, and advance faster than ever.

Principal / Senior Systems Performance Engineer

Micron Data Center and Client Workload Engineering in Hyderabad, India, is seeking a senior/principal engineer to join our dynamic team. The successful candidate will primarily contribute to the ML development, ML DevOps, and HBM programs in the data center: analyzing how AI/ML workloads perform on the latest MU-HBM, Micron main memory, expansion memory, and near memory (HBM/LP) solutions; conducting competitive analysis; showcasing the benefits that workloads see from MU-HBM's capacity, bandwidth, and thermals; contributing to marketing collateral; and extracting AI/ML workload traces to help optimize future HBM designs.

Job Responsibilities (include but are not limited to):
- Design, implement, and maintain scalable and reliable ML infrastructure and pipelines.
- Collaborate with data scientists and ML engineers to deploy machine learning models into production environments.
- Automate and optimize ML workflows, including data preprocessing, model training, evaluation, and deployment.
- Monitor and manage the performance, reliability, and scalability of ML systems.
- Troubleshoot and resolve issues related to ML infrastructure and deployments.
- Implement and manage distributed training and inference solutions to enhance model performance and scalability.
- Utilize DeepSpeed, TensorRT, and vLLM for optimizing and accelerating AI inference and training.
- Understand key considerations for ML models, such as transformer architectures, precision, quantization, distillation, attention span and KV cache, MoE, etc.
- Build workload memory access traces from AI models.
- Study system balance ratios for DRAM to HBM in terms of capacity and bandwidth to understand and model TCO.
- Study data movement between the CPU, GPU, and associated memory subsystems (DDR, HBM) in heterogeneous system architectures over connectivity such as PCIe/NVLink/Infinity Fabric, to understand data-movement bottlenecks for different workloads.
- Develop an automated testing framework through scripting.
- Engage with customers and present at conferences to showcase findings and develop whitepapers.

Requirements:
- Strong programming skills in Python and familiarity with ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
- Experience in data preparation: cleaning, splitting, and transforming data for training, validation, and testing.
- Proficiency in model training and development: creating and training machine learning models.
- Expertise in model evaluation: testing models to assess their performance.
- Skills in model deployment: launching servers, live inference, and batched inference.
- Experience with AI inference and distributed training techniques.
- Strong foundation in GPU and CPU processor architecture.
- Familiarity with server system memory (DRAM).
- Strong experience with benchmarking and performance analysis.
- Strong software development skills using leading scripting and programming languages and technologies (Python, CUDA, C, C++).
- Familiarity with PCIe and NVLink connectivity.

Preferred Qualifications:
- Experience quickly building AI workflows: pipelines and model workflows to design, deploy, and manage consistent model delivery.
- Ability to deploy models anywhere: using managed endpoints to deploy models and workflows across accessible CPU and GPU machines.
- Understanding of MLOps: the overarching concept covering the core tools, processes, and best practices for end-to-end machine learning system development and operations in production.
- Knowledge of GenAIOps: extending MLOps to develop and operationalize generative AI solutions, including management of and interaction with a foundation model.
- Familiarity with LLMOps: focused specifically on developing and productionizing LLM-based solutions.
- Experience with RAGOps: the delivery and operation of RAG systems, considered the ultimate reference architecture for generative AI and LLMs.
- Data management: collect, ingest, store, process, and label data for training and evaluation; configure role-based access control; dataset search, browsing, and exploration; data provenance tracking; data logging; dataset versioning; metadata indexing; data quality validation; dataset cards; and dashboards for data visualization.
- Workflow and pipeline management: work with cloud resources or a local workstation; connect data preparation, model training, model evaluation, model optimization, and model deployment steps into an end-to-end automated, scalable workflow combining data and compute.
- Model management: train, evaluate, and optimize models for production; store and version models along with their model cards in a centralized model registry; assess model risks and ensure compliance with standards.
- Experiment management and observability: track and compare machine learning model experiments, including changes in training data, models, and hyperparameters; automatically search the space of possible model architectures and hyperparameters; analyze model performance during inference; monitor model inputs and outputs for concept drift.
- Synthetic data management: extend data management with a native generative AI capability; generate synthetic training data through domain randomization to increase transfer-learning capability; declaratively define and generate edge cases to evaluate, validate, and certify model accuracy and robustness.
- Embedding management: represent data samples of any modality as dense multi-dimensional embedding vectors; generate, store, and version embeddings in a vector database; visualize embeddings for exploration; find relevant contextual information through vector similarity search for RAG.

Education: Bachelor's or higher in Computer Science or a related field, with 12+ years of experience.
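The "attention span & KV cache" and DRAM-to-HBM balance analysis described in this posting have a first-order sizing input: the per-sequence KV-cache footprint, 2 (K and V) × layers × KV heads × head dim × sequence length × bytes per element. A sketch with illustrative model dimensions, not any specific product's numbers:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Per-sequence KV-cache size: one K and one V tensor for every layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Illustrative 32-layer model, 8 KV heads of dim 128, fp16, 4k context:
per_seq = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=4096)
per_seq_gib = per_seq / 2**30          # 0.5 GiB per sequence
batch_gib = 64 * per_seq_gib           # 32 GiB for 64 concurrent sequences
```

Numbers like these are what drive HBM capacity/bandwidth balance studies: the cache grows linearly with both context length and concurrency, and at high batch sizes it, not the weights, dominates memory traffic.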
Posted 2 weeks ago
8.0 - 10.0 years
0 Lacs
Noida, Uttar Pradesh, India
Remote
Senior Manager - Senior Data Scientist (NLP & Generative AI)
Location: PAN India / Remote
Employment Type: Full-time

About the Role: We are seeking a highly experienced senior data scientist with 8+ years of expertise in machine learning, focusing on NLP, Generative AI, and advanced LLM ecosystems. This role demands leadership in designing and deploying scalable AI systems leveraging the latest advancements such as Google ADK, Agent Engine, and the Gemini LLM. You will spearhead building real-time inference pipelines and agentic AI solutions that power complex, multi-user applications with cutting-edge technology.

Key Responsibilities:
- Lead the architecture, development, and deployment of scalable machine learning and AI systems centered on real-time LLM inference for concurrent users.
- Design, implement, and manage agentic AI frameworks leveraging Google ADK, LangGraph, or custom-built agents.
- Integrate foundation models (GPT, LLaMA, Claude, Gemini) and fine-tune them for domain-specific intelligent applications.
- Build robust MLOps pipelines for end-to-end model lifecycle management: training, testing, deployment, and monitoring.
- Collaborate with DevOps teams to deploy scalable serving infrastructures using containerization (Docker), orchestration (Kubernetes), and cloud platforms.
- Drive innovation by adopting new AI capabilities and tools, such as Google Gemini, to enhance AI model performance and interaction quality.
- Partner cross-functionally to understand traffic patterns and design AI systems that handle real-world scale and complexity.

Required Skills & Qualifications:
- Bachelor's or Master's degree in Computer Science, AI, Machine Learning, or related fields.
- 7+ years in ML engineering, applied AI, or senior data scientist roles.
- Strong programming expertise in Python and frameworks including PyTorch, TensorFlow, and Hugging Face Transformers.
- Deep experience with NLP, Transformer models, and generative AI techniques.
- Practical knowledge of LLM inference scaling with tools like vLLM, Groq, Triton Inference Server, and Google ADK.
- Hands-on experience deploying AI models to concurrent users with high throughput and low latency.
- Skilled in cloud environments (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
- Familiarity with vector databases (FAISS, Pinecone, Weaviate) and retrieval-augmented generation (RAG).
- Experience with agentic AI using ADK, LangChain, LangGraph, and Agent Engine.

Preferred Qualifications:
- Experience with Google Gemini and other advanced LLM innovations.
- Contributions to open-source AI/ML projects or participation in applied AI research.
- Knowledge of hardware acceleration and GPU/TPU-based inference optimization.
- Exposure to event-driven architectures or streaming pipelines (Kafka, Redis).
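Sizing a deployment for "concurrent users with high throughput and low latency," as this role requires, often starts from Little's law: sustainable throughput equals in-flight concurrency divided by latency. A back-of-envelope sketch with illustrative numbers (not a benchmark of any real serving stack):

```python
import math

def max_throughput(concurrent_requests, latency_s):
    """Little's law: sustainable requests/s given in-flight capacity and latency."""
    return concurrent_requests / latency_s

def required_replicas(target_rps, per_replica_concurrency, latency_s):
    """Replicas needed to serve target_rps at the given per-replica limits."""
    per_replica_rps = max_throughput(per_replica_concurrency, latency_s)
    return math.ceil(target_rps / per_replica_rps)

# One replica holding 32 in-flight requests at 2 s median latency -> 16 req/s.
# Serving 100 req/s therefore needs ceil(100 / 16) = 7 replicas.
replicas = required_replicas(100, 32, 2.0)
```

Continuous-batching servers such as vLLM raise the per-replica concurrency term, which is exactly how they lift throughput without lowering per-request latency.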
Posted 3 weeks ago
11.0 - 20.0 years
40 - 50 Lacs
Pune, Chennai, Bengaluru
Hybrid
Senior xOps Specialist – AIOps, MLOps & DataOps Architect
Location: Chennai, Pune
Employment Type: Full-time, Hybrid
Experience Required: 12-15 years

Job Summary: We are seeking a Senior xOps Specialist to architect, implement, and optimize AI-driven operational frameworks across AIOps, MLOps, and DataOps. The ideal candidate will design and enhance intelligent automation, predictive analytics, and resilient pipelines for large-scale data engineering, AI/ML deployments, and IT operations. This role requires deep expertise in AI/ML automation, data-driven DevOps strategies, observability frameworks, and cloud-native orchestration.

Key Responsibilities – Design & Architecture

AIOps: AI-Driven IT Operations & Automation
- Architect AI-powered observability platforms, ensuring predictive incident detection and autonomous IT operations.
- Implement AI-driven root cause analysis (RCA) for proactive issue resolution and performance optimization.
- Design self-healing infrastructures leveraging machine learning models for anomaly detection and remediation workflows.
- Establish event-driven automation strategies, enabling autonomous infrastructure scaling and resilience engineering.

MLOps: Machine Learning Lifecycle Optimization
- Architect end-to-end MLOps pipelines, ensuring automated model training, validation, deployment, and monitoring.
- Design CI/CD pipelines for ML models, embedding drift detection, continuous optimization, and model explainability.
- Implement feature engineering pipelines, leveraging data versioning, reproducibility, and intelligent retraining techniques.
- Ensure secure and scalable AI/ML environments, optimizing GPU-accelerated processing and cloud-native model serving.

DataOps: Scalable Data Engineering & Pipelines
- Architect data processing frameworks, ensuring high-performance, real-time ingestion, transformation, and analytics.
- Build data observability platforms, enabling automated anomaly detection, data lineage tracking, and schema evolution.
Design self-optimizing ETL pipelines, leveraging AI-driven workflows for data enrichment and transformation.
Implement governance frameworks, ensuring data quality, security, and compliance with enterprise standards.

Automation & API Integration
Develop Python or Go-based automation scripts for AI model orchestration, data pipeline optimization, and IT workflows.
Architect event-driven xOps frameworks, enabling intelligent orchestration for real-time workload management.
Implement AI-powered recommendations, optimizing resource allocation, cost efficiency, and performance benchmarking.

Cloud-Native & DevOps Integration
Embed AI/ML observability principles within DevOps pipelines, ensuring continuous monitoring and retraining cycles.
Architect cloud-native solutions optimized for Kubernetes, containerized environments, and scalable AI workloads.
Establish AIOps-driven cloud infrastructure strategies, automating incident response and operational intelligence.

Qualifications & Skills – xOps Expertise
Deep expertise in AIOps, MLOps, and DataOps, designing AI-driven operational frameworks.
Proficiency in automation scripting, leveraging Python, Go, and AI/ML orchestration tools.
Strong knowledge of AI observability, ensuring resilient IT operations and predictive analytics.
Extensive experience in cloud-native architectures, Kubernetes orchestration, and serverless AI workloads.
Ability to troubleshoot complex AI/ML pipelines, ensuring optimal model performance and data integrity.

Preferred Certifications (Optional):
AWS Certified Machine Learning – Specialty
Google Cloud Professional Data Engineer
Certified Kubernetes Administrator (CKA)
DevOps Automation & AIOps Certification
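To make the "anomaly detection and remediation workflows" responsibility concrete, here is a minimal sketch of the kind of check a self-healing loop might run before triggering remediation, assuming a simple z-score rule over a window of latency samples. The metric values, threshold, and function name are illustrative, not a prescribed implementation.

```python
# Hypothetical sketch: flag metric samples that deviate sharply from the
# window mean, as a trigger condition for an automated remediation workflow.

def detect_anomalies(values, threshold=2.5):
    """Return indices of samples more than `threshold` population standard
    deviations from the window mean. Returns [] for a flat window."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = variance ** 0.5
    if std == 0:
        return []  # no variation, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

# Example: steady request latencies (ms) with one spike at index 6.
latencies = [102, 99, 101, 100, 98, 103, 350, 101, 99, 100]
print(detect_anomalies(latencies))
```

In a real AIOps platform this rule would be replaced by a trained model and the flagged indices would feed an event-driven remediation pipeline rather than a print statement.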
Posted 1 month ago
3.0 - 5.0 years
16 - 20 Lacs
Noida
Work from Office
Position Title: AI/ML Engineer
Company: Cyfuture India Pvt. Ltd.
Industry: IT Services and IT Consulting
Location: Sector 81, NSEZ, Noida (5 Days Work From Office)
Website: www.cyfuture.com

About Cyfuture
Cyfuture is a trusted name in IT services and cloud infrastructure, offering state-of-the-art data center solutions and managed services across platforms like AWS, Azure, and VMware. We are expanding rapidly in system integration and managed services, building strong alliances with global OEMs like VMware, AWS, Azure, HP, Dell, Lenovo, and Palo Alto.

Position Overview
We are hiring an experienced AI/ML Engineer to lead and shape our AI/ML initiatives. The ideal candidate will have hands-on experience in machine learning and artificial intelligence, with strong leadership capabilities and a passion for delivering production-ready solutions. This role involves end-to-end ownership of AI/ML projects, from strategy development to deployment and optimization of large-scale systems.

Key Responsibilities
Lead and mentor a high-performing AI/ML team.
Design and execute AI/ML strategies aligned with business goals.
Collaborate with product and engineering teams to identify impactful AI opportunities.
Build, train, fine-tune, and deploy ML models in production environments.
Manage operations of LLMs and other AI models using modern cloud and MLOps tools.
Implement scalable and automated ML pipelines (e.g., with Kubeflow or MLRun).
Handle containerization and orchestration using Docker and Kubernetes.
Optimize GPU/TPU resources for training and inference tasks.
Develop efficient RAG pipelines with low latency and high retrieval accuracy.
Automate CI/CD workflows for continuous integration and delivery of ML systems.

Key Skills & Expertise

1. Cloud Computing & Deployment
Proficiency in AWS, Google Cloud, or Azure for scalable model deployment.
Familiarity with cloud-native services like AWS SageMaker, Google Vertex AI, or Azure ML.
Expertise in Docker and Kubernetes for containerized deployments.
Experience with Infrastructure as Code (IaC) using tools like Terraform or CloudFormation.

2. Machine Learning & Deep Learning
Strong command of frameworks: TensorFlow, PyTorch, Scikit-learn, XGBoost.
Experience with MLOps tools for integration, monitoring, and automation.
Expertise in pre-trained models, transfer learning, and designing custom architectures.

3. Programming & Software Engineering
Strong skills in Python (NumPy, Pandas, Matplotlib, SciPy) for ML development.
Backend/API development with FastAPI, Flask, or Django.
Database handling with SQL and NoSQL (PostgreSQL, MongoDB, BigQuery).
Familiarity with CI/CD pipelines (GitHub Actions, Jenkins).

4. Scalable AI Systems
Proven ability to build AI-driven applications at scale.
Handle large datasets, high-throughput requests, and real-time inference.
Knowledge of distributed computing: Apache Spark, Dask, Ray.

5. Model Monitoring & Optimization
Hands-on with model compression, quantization, and pruning.
A/B testing and performance tracking in production.
Knowledge of model retraining pipelines for continuous learning.

6. Resource Optimization
Efficient use of compute resources: GPUs, TPUs, CPUs.
Experience with serverless architectures to reduce cost.
Auto-scaling and load balancing for high-traffic systems.

7. Problem-Solving & Collaboration
Translate complex ML models into user-friendly applications.
Work effectively with data scientists, engineers, and product teams.
Write clear technical documentation and architecture reports.

Udisha Parashar
Senior Talent Acquisition Specialist
Mob: +91-9301895707
Email: udisha.parashar@cyfuture.com
URL: www.cyfuture.com
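As an illustration of the retrieval step behind the RAG pipelines this role mentions, here is a hedged pure-Python sketch: documents are scored against a query by cosine similarity over bag-of-words counts, and the top-k passages are returned for the LLM prompt. A production system would typically use dense embeddings and a vector index instead; the document texts and function names below are hypothetical.

```python
# Hypothetical sketch of RAG retrieval: rank documents by cosine similarity
# of word-count vectors and return the k best matches for prompt assembly.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

docs = [
    "kubernetes orchestrates containerized workloads",
    "pandas is a dataframe library",
    "kubernetes pods run containers on nodes",
]
print(retrieve("how does kubernetes run containers", docs, k=2))
```

Retrieval latency and accuracy in a real pipeline hinge on the embedding model and index structure, not on this toy scoring, but the control flow (embed query, score corpus, take top-k, feed the prompt) is the same.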
Posted 2 months ago
17 - 27 years
100 - 200 Lacs
Bengaluru
Work from Office
Senior Software Technical Director / Software Technical Director – Bangalore

Founded in 2023 by industry veterans. HQ in California, US. We are revolutionizing sustainable AI compute through intuitive software with composable silicon.

We are looking for a Software Technical Director with a strong technical foundation in systems software, Linux platforms, or machine learning compiler stacks to lead and grow a high-impact engineering team in Bangalore. You will be responsible for shaping the architecture, contributing to codebases, and managing execution across projects that sit at the intersection of systems programming, AI runtimes, and performance-critical software.

Key Responsibilities:

Technical Leadership:
Lead the design and development of Linux platform software, firmware, or ML compilers and runtimes.
Drive architecture decisions across compiler, runtime, or low-level platform components.
Write production-grade C++ code and perform detailed code reviews.
Guide performance analysis and debugging across the full stack, from firmware and drivers to user-level runtime libraries.
Collaborate with architects, silicon teams, and ML researchers to build future-proof software stacks.

Team & Project Management:
Mentor and coach junior and senior engineers to grow technical depth and autonomy.
Own end-to-end project planning, execution, and delivery, ensuring high-quality output across sprints/releases.
Facilitate strong cross-functional communication with hardware, product, and other software teams globally.
Recruit and grow a top-tier engineering team in Bangalore, contributing to the hiring strategy and team culture.

Required Qualifications:
Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.
18+ years of experience in systems software development with significant time spent in C++, including architectural and hands-on roles.
Proven experience in either:
Linux kernel, bootloaders, firmware, or low-level platform software, or
Machine Learning compilers (e.g., MLIR, TVM, Glow) or runtimes (e.g., ONNX Runtime, TensorRT, vLLM).
Excellent written and verbal communication skills.
Prior experience in project leadership or engineering management with direct reports.

Highly Desirable:
Understanding of AI/ML compute workloads, particularly Large Language Models (LLMs).
Familiarity with performance profiling, bottleneck analysis, and compiler-level optimizations.
Exposure to AI accelerators, systolic arrays, or vector SIMD programming.

Why Join Us?
Work at the forefront of AI systems software, shaping the future of ML compilers and runtimes.
Collaborate with globally distributed teams in a fast-paced, innovation-driven environment.
Build and lead a technically elite team from the ground up in a growth-stage organization.

Contact: Uday Mulya Technologies
muday_bhaskar@yahoo.com
"Mining The Knowledge Community"
Posted 2 months ago