
519 Quantization Jobs

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

4.0 - 8.0 years

0 Lacs

Karnataka

On-site

As a potential candidate for this position, you will be responsible for contributing to cutting-edge AI/ML solutions at Goldman Sachs. Qualifications and attributes you should possess:
- A Bachelor's, Master's, or PhD degree in Computer Science, Machine Learning, Mathematics, or a related field.
- Preferably 7+ years of AI/ML industry experience for Bachelor's/Master's holders, 4+ years for PhD, with a focus on Language Models.
- Strong foundation in machine learning algorithms, including deep learning architectures such as transformers, RNNs, and CNNs.
- Proficiency in Python and relevant libraries/frameworks such as TensorFlow, PyTorch, Hugging Face Transformers, and scikit-learn.
- Demonstrated expertise in GenAI techniques, including but not limited to Retrieval-Augmented Generation (RAG), model fine-tuning, prompt engineering, AI agents, and evaluation techniques.
- Experience working with embedding models and vector databases.
- Experience with MLOps practices, including model deployment, containerization (Docker, Kubernetes), CI/CD, and model monitoring.
- Strong verbal and written communication skills.
- Curiosity, ownership, and willingness to work in a collaborative environment.
- Proven ability to mentor and guide junior engineers.

Desirable experience that can set you apart from other candidates:
- Experience with agentic frameworks (e.g., LangChain, AutoGen) and their application to real-world problems.
- Understanding of scalability and performance optimization techniques for real-time inference, such as quantization, pruning, and knowledge distillation.
- Experience with model interpretability techniques.
- Prior experience in code reviews/architecture design for distributed systems.
- Experience with data governance and data quality principles.
- Familiarity with financial regulations and compliance requirements.
About Goldman Sachs: At Goldman Sachs, the commitment is to help clients, shareholders, and communities grow by leveraging people, capital, and ideas. Established in 1869, Goldman Sachs is a prominent global investment banking, securities, and investment management firm headquartered in New York with offices worldwide. Goldman Sachs is dedicated to fostering diversity and inclusion by providing numerous opportunities for personal and professional growth, including training, development, networks, benefits, wellness, personal finance offerings, and mindfulness programs. To learn more about the culture, benefits, and people at Goldman Sachs, visit GS.com/careers. Goldman Sachs is dedicated to providing reasonable accommodations for candidates with special needs or disabilities during the recruiting process. To learn more about accommodations, visit: https://www.goldmansachs.com/careers/footer/disability-statement.html Copyright The Goldman Sachs Group, Inc. 2023. All rights reserved.
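The RAG and embedding-model experience this posting asks for rests on one core operation: ranking stored document embeddings by similarity to a query embedding. A minimal pure-Python sketch of that retrieval step, using made-up toy 2-d vectors in place of a real embedding model and vector database:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, k=2):
    # Return the ids of the k most similar documents (the "R" in RAG).
    ranked = sorted(doc_vecs,
                    key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:k]

# Toy "embeddings" and document names invented for illustration only.
docs = {"rates_faq": [0.9, 0.1], "equities_memo": [0.1, 0.9], "fx_note": [0.8, 0.3]}
top = retrieve([1.0, 0.0], docs)  # → ['rates_faq', 'fx_note']
```

In a production system the retrieved documents would then be injected into the LLM prompt; here only the ranking step is shown.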

Posted 15 hours ago

Apply

3.0 years

1 - 2 Lacs

San Francisco

On-site

We are building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network. We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco.

**Responsibilities**
- Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models
- Deploy and maintain large language models at scale in production environments
- Deploy new models as they are released by frontier labs
- Implement techniques like quantization, speculative decoding, and KV cache reuse
- Contribute regularly to open-source projects such as SGLang and vLLM
- Deep-dive into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
- Collaborate with the engineering team to bring new features and capabilities to our inference platform
- Develop robust and scalable infrastructure for AI model serving
- Create and maintain technical documentation for inference systems

**Requirements**
- 3+ years of experience writing high-performance, production-quality code
- Strong proficiency with Python and deep learning frameworks, particularly PyTorch
- Demonstrated experience with LLM inference optimization techniques
- Hands-on experience with SGLang and vLLM, with contributions to these projects strongly preferred
- Familiarity with Docker and Kubernetes for containerized deployments
- Experience with CUDA programming and GPU optimization
- Strong understanding of distributed systems and scalability challenges
- Proven track record of optimizing AI models for production environments

**Nice to Have**
- Familiarity with TensorRT and TensorRT-LLM
- Knowledge of vision models and multimodal AI systems
- Experience implementing techniques like quantization and speculative decoding
- Contributions to open-source machine learning projects
- Experience with large-scale distributed computing

**Compensation**
We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is
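Quantization, named above both as a responsibility and a nice-to-have, trades precision for memory and speed by storing weights as low-bit integers plus a scale factor. A hedged pure-Python sketch of symmetric int8 quantization with invented example weights; real serving stacks would use vLLM/TensorRT-LLM kernels, not this:

```python
def quantize_int8(weights):
    # Symmetric quantization: map floats in [-max|w|, max|w|] onto int8 codes [-127, 127].
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    # Recover approximate float weights from the int8 codes.
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.03]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered weight is within one quantization step (scale) of the original,
# but the storage is 8-bit integers instead of 32-bit floats.
```

The same idea underlies 8-bit/4-bit weight formats; production schemes add per-channel scales, zero-points, and calibration.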

Posted 1 day ago

Apply

5.0 years

0 Lacs

Mumbai, Maharashtra, India

Remote

Job Title: MLOps Engineer – GenAI & ML Solutions
Location: Pune / Mumbai (Hybrid Work Model)
Experience: 3–5 Years (Minimum 2 Years in AI Development)
Industry: Credit Rating & Financial Analytics

About Us: Join a leading global credit rating and financial analytics powerhouse where innovation meets finance! We leverage cutting-edge AI and analytics to deliver state-of-the-art solutions. As we embark on our next growth phase, we're looking for a passionate AI engineer to propel our AI capabilities into the future, redefining how financial intelligence is powered and delivered.

Why Join Us?
- Be at the forefront of AI innovation in finance technology, with exposure to next-gen GenAI techniques.
- Collaborate with dynamic global teams spanning finance, technology, and client solutions.
- Work in a hybrid setup in Pune or Mumbai, blending flexibility with the spirit of teamwork.
- Directly influence client-facing AI solutions impacting real-world business outcomes.
- Grow your career in an environment that champions best coding practices, continuous learning, and breakthrough AI deployments on cloud platforms.

What You'll Do:
- Develop and manage efficient MLOps pipelines tailored for Large Language Models, automating the deployment and lifecycle management of models in production.
- Deploy, scale, and monitor LLM inference services across cloud-native environments using Kubernetes, Docker, and other container orchestration frameworks.
- Optimize LLM serving infrastructure for latency, throughput, and cost, including hardware acceleration setups with GPUs or TPUs.
- Build and maintain CI/CD pipelines specifically for ML workflows, enabling automated validation and seamless rollouts of continuously updated language models.
- Implement comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack) to track model performance, resource utilization, and system health.
- Collaborate cross-functionally with ML research and data science teams to operationalize fine-tuned models, prompt engineering experiments, and multi-agentic LLM workflows.
- Handle integration of LLMs with APIs and downstream applications, ensuring reliability, security, and compliance with data governance standards.
- Evaluate, select, and incorporate the latest model-serving frameworks and tooling (e.g., Hugging Face Inference API, NVIDIA Triton Inference Server).
- Troubleshoot complex operational issues impacting model availability and degradation, implementing fixes and preventive measures.
- Stay up to date with emerging trends in LLM deployment, optimization techniques such as quantization and distillation, and evolving MLOps best practices.

What We're Looking For:

Experience & Skills:
- 3 to 5 years of professional experience in Machine Learning Operations or ML infrastructure engineering, including experience deploying and managing large-scale ML models.
- Proven expertise in containerization and orchestration technologies such as Docker and Kubernetes, with a track record of deploying ML/LLM models in production.
- Strong proficiency in programming with Python and scripting languages such as Bash for workflow automation.
- Hands-on experience with cloud platforms (AWS, Google Cloud Platform, Azure), including compute resources (EC2, GKE, Kubernetes Engine), storage, and ML services.
- Solid understanding of serving models using frameworks like Hugging Face Transformers or OpenAI APIs.
- Experience building and maintaining CI/CD pipelines tuned to ML lifecycle workflows (evaluation, deployment).
- Familiarity with performance optimization techniques such as batching, quantization, and mixed-precision inference, specifically for large-scale transformer models.
- Expertise in monitoring and logging technologies (Prometheus, Grafana, ELK Stack, Fluentd) to ensure production-grade observability.
- Knowledge of GPU/TPU infrastructure setup, scheduling, and cost-optimization strategies.
- Strong problem-solving skills with the ability to troubleshoot infrastructure and deployment issues swiftly and efficiently.
- Effective communication and collaboration skills to work with cross-functional teams in a fast-paced environment.

Educational Background: Bachelor's or Master's degree from premier Indian institutes (IITs, IISc, NITs, BITS, IIITs, etc.) in Computer Science, any engineering discipline, Mathematics, or a related quantitative field.

Benefits:
- Hybrid work model combining remote flexibility and collaborative office culture in Pune or Mumbai.
- Continuous learning budget for certifications, workshops, and conferences.
- Opportunities to work on industry-leading AI research and shape the future of financial services.

Step into the future of financial technology with AI – apply now! If you are eager to push the boundaries of AI in financial analytics and thrive in a global, fast-paced environment, we want to hear from you!
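The observability requirements above (Prometheus/Grafana-style dashboards) ultimately reduce to percentile statistics over a sliding window of request latencies. A toy in-process sketch with invented sample timings, meant only to illustrate the idea and not to replace a real metrics stack:

```python
from collections import deque

class LatencyMonitor:
    """Keeps the last `window` request latencies and reports nearest-rank percentiles."""
    def __init__(self, window=1000):
        self.samples = deque(maxlen=window)  # old samples fall off automatically

    def record(self, seconds):
        self.samples.append(seconds)

    def percentile(self, p):
        # Nearest-rank percentile over the current window.
        ordered = sorted(self.samples)
        rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
        return ordered[rank]

monitor = LatencyMonitor()
for ms in [12, 15, 11, 140, 13, 14, 16, 13, 12, 15]:  # made-up latencies
    monitor.record(ms / 1000)
# p50 sits near the typical request; p99 exposes the 140 ms outlier,
# which is why serving SLOs are stated in tail percentiles, not averages.
```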

Posted 1 day ago

Apply

0 years

0 Lacs

Gurugram, Haryana, India

On-site

We're Hiring: Generative AI Engineer

🔍 What You’ll Do:
- Fine-tune and deploy LLMs (Llama, GPT, Claude, Gemini)
- Design scalable ML infrastructure & real-time inference systems
- Optimize models for latency, safety, and business alignment
- Collaborate with cross-functional teams to solve real-world problems

🧠 What We’re Looking For:
- 6+ yrs in ML/Data Science, 1.5+ yrs in GenAI
- Strong in Python, PyTorch/TensorFlow, Transformers, LangChain
- Hands-on with LoRA, QLoRA, RLHF, quantization, distillation
- Experience with MLOps tools like MLflow, W&B, DVC
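The LoRA/QLoRA experience asked for above is built on one idea: instead of updating a full weight matrix W during fine-tuning, train a low-rank pair (A, B) and apply W + (α/r)·B·A. A tiny pure-Python illustration with invented numbers (real adapters live in frameworks like PEFT and operate on large tensors):

```python
def matmul(A, B):
    # Naive matrix multiply, adequate for small illustrative matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lora_merge(W, A, B, alpha):
    # W: (d_out x d_in) frozen base weights.
    # A: (r x d_in) and B: (d_out x r) are the small trained adapter matrices.
    r = len(A)                 # adapter rank
    delta = matmul(B, A)       # low-rank update, shape (d_out x d_in)
    return [[W[i][j] + (alpha / r) * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weights (toy 2x2)
A = [[1.0, 0.0]]               # rank-1 adapter (r = 1)
B = [[0.0], [2.0]]
merged = lora_merge(W, A, B, alpha=1.0)
# merged == [[1.0, 0.0], [2.0, 1.0]]: only a rank-1 delta was added,
# so training touched 4 adapter numbers instead of all of W.
```

QLoRA combines this with keeping the frozen W in a 4-bit quantized format while the adapters stay in higher precision.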

Posted 1 day ago

Apply

6.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

How is this team contributing to the vision of Providence? Ensure a continual and seamless service for customers regardless of which services they access.

What will you be responsible for?
- End-to-end development and quality of solutions and services in the conversational & agentic AI capability
- Understanding and implementing advanced techniques in agentic AI & conversational AI frameworks
- Introducing best practices in AI frameworks

What would your day look like?
- Lead the design and development of agentic AI frameworks, models, and systems, ensuring high performance and scalability.
- Apply strong Python, REST API, and data structure skills.
- Fine-tune and optimize pre-trained large language models (LLMs) such as GPT, LLaMA, Falcon, and Mistral.
- Implement and manage model quantization techniques (e.g., GPTQ, AWQ, 8-bit/4-bit models) to enhance model efficiency.
- Utilize LangChain, Dagster, etc. for orchestration and memory management in agentic applications.
- Conduct thorough model evaluations using metrics such as BLEU, ROUGE, and perplexity to ensure model quality and performance.
- Develop and maintain systems for intent recognition, entity extraction, and generative dialogue management.
- Stay updated with the latest advancements in NLP and AI technologies and apply them to ongoing projects.
- Mentor and guide junior engineers, fostering a culture of continuous learning and innovation.

Who are we looking for?
- Minimum 6 years of experience in programming languages & AI platforms
- Strong understanding of deep learning fundamentals, including LSTM, RNN, ANN, POS tagging, and Transformer models
- Strong understanding of and proficiency in agentic AI frameworks
- Proven experience in fine-tuning pre-trained LLMs (e.g., GPT, LLaMA, Falcon, Mistral) and quantization
- Knowledge of model quantization techniques (GPTQ, AWQ, 8-bit/4-bit models)
- Proficiency in using orchestration frameworks, state and memory management
- Experience with model evaluation metrics such as BLEU, ROUGE, and perplexity
- Understanding of intent recognition, entity extraction, and generative dialogue management systems
- Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment
- Strong communication skills and the ability to convey complex technical concepts to non-technical stakeholders
- Good understanding of Azure AI Foundry, Azure AI Search, and indexing
- Knowledge of Kubernetes is good to have
- Knowledge of Microsoft Event Hubs or Kafka technologies
- Experience with load testing and unit testing tools
- Proficient understanding of source control and code versioning tools such as Git, Azure DevOps, etc.
- Familiarity with CI/CD and DevOps
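Of the evaluation metrics this posting names, perplexity is the easiest to state precisely: the exponential of the average negative log-probability the model assigns to each reference token. A minimal sketch with invented per-token probabilities:

```python
import math

def perplexity(token_probs):
    # exp of the mean negative log-likelihood over the sequence; lower is better.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns probability 0.5 to every reference token has
# perplexity 2: on average it is "choosing between 2 equally likely tokens".
ppl = perplexity([0.5, 0.5, 0.5, 0.5])  # ≈ 2.0
```

BLEU and ROUGE are n-gram overlap metrics and need reference texts; in practice all three come from libraries such as `sacrebleu` or `evaluate` rather than hand-rolled code.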

Posted 1 day ago

Apply

0 years

0 Lacs

Pune, Maharashtra, India

Remote

neoBIM is a well-funded start-up software company revolutionizing the way architects design buildings with our innovative BIM (Building Information Modelling) software. As we continue to grow, we are building a small and talented team of developers to drive our software forward.

Tasks
We are looking for a highly skilled Generative AI Developer to join our AI team. The ideal candidate should have strong expertise in deep learning, large language models (LLMs), multimodal AI, and generative models (GANs, VAEs, Diffusion Models, or similar techniques). This role offers the opportunity to work on cutting-edge AI solutions, from training models to deploying AI-driven applications that redefine automation and intelligence.
- Develop, fine-tune, and optimize Generative AI models, including LLMs, GANs, VAEs, Diffusion Models, and Transformer-based architectures.
- Work with large-scale datasets and design self-supervised or semi-supervised learning pipelines.
- Implement multimodal AI systems that combine text, images, audio, and structured data.
- Optimize AI model inference for real-time applications and large-scale deployment.
- Build AI-driven applications for BIM (Building Information Modeling), content generation, and automation.
- Collaborate with data scientists, software engineers, and domain experts to integrate AI into production.
- Stay ahead of AI research trends and incorporate state-of-the-art methodologies.
- Deploy models using cloud-based ML pipelines (AWS/GCP/Azure) and edge computing solutions.

Requirements

Must-Have Skills
- Strong programming skills in Python (PyTorch, TensorFlow, JAX, or equivalent).
- Experience in training and fine-tuning Large Language Models (LLMs) like GPT, BERT, LLaMA, or Mixtral.
- Expertise in Generative AI techniques, including Diffusion Models (e.g., Stable Diffusion, DALL-E, Imagen), GANs, and VAEs.
- Hands-on experience with transformer-based architectures (e.g., Vision Transformers, BERT, T5, GPT).
- Experience with MLOps frameworks for scaling AI applications (Docker, Kubernetes, MLflow, etc.).
- Proficiency in data preprocessing, feature engineering, and AI pipeline development.
- Strong background in mathematics, statistics, and optimization related to deep learning.

Good-to-Have Skills
- Experience with NeRFs (Neural Radiance Fields) for 3D generative AI.
- Knowledge of AI for Architecture, Engineering, and Construction (AEC).
- Understanding of distributed computing (Ray, Spark, or Tensor Processing Units).
- Familiarity with AI model compression and inference optimization (ONNX, TensorRT, quantization techniques).
- Experience in cloud-based AI development (AWS/GCP/Azure).

Benefits
- Work on high-impact AI projects at the cutting edge of Generative AI.
- Competitive salary with growth opportunities.
- Access to high-end computing resources for AI training & development.
- A collaborative, research-driven culture focused on innovation & real-world impact.
- Flexible work environment with remote options.

Posted 2 days ago

Apply

12.0 years

0 Lacs

Greater Hyderabad Area

On-site

Principal Staff Verification Engineer (VLSI Verification + AV + AI Expertise)
Founded by highly respected Silicon Valley veterans, with design centers in Santa Clara, California; Hyderabad; and Bangalore. Our pay comprehensively beats all semiconductor product players in the Indian market.

Position: Staff Verification Engineer – VLSI Verification Lead
Location: Hyderabad
Experience: 12+ years in Functional Verification
Key Protocol Experience: MIPI DSI, DisplayPort, HDMI

Role Overview
We are seeking a highly skilled Staff Verification Engineer with strong expertise in VLSI functional verification and a good understanding of AI model deployment for audio/video applications. The candidate will lead verification efforts for complex SoCs/IPs while collaborating with cross-functional teams on next-generation multimedia and AI-driven system use cases.

Requirements
- Experience: 12+ years in functional verification; minimum 5+ years in the multimedia (Display, Camera, Video, Graphics) domain.
- Domain expertise: strong knowledge of Display (pixel processing, composition, compression, MIPI DSI, DisplayPort, HDMI) and bus/interconnect (AHB, AXI).
- Multimedia technologies: audio/video codecs, image processing, SoC system use cases (Display, Camera, Video, Graphics).
- Good understanding of DSP, codecs (audio/video), and real-time streaming pipelines.
- AI accelerators: architecture understanding, verification, and deployment experience across NPUs, GPUs, and custom AI engines.
- SoC system-level verification with embedded RISC/DSP processors.
- AI/ML skills: experience with AI models (e.g., CNNs) and statistical modeling techniques.
- Exposure to audio frameworks, audio solutions, and embedded platforms.
- Hands-on multimedia use-case verification and system-level scenarios.
- Strong exposure to MIPI DSI-2, CSI-2, MIPI D-PHY, C-PHY.
- Verification expertise: proven expertise in developing/maintaining SystemVerilog/UVM-based testbenches, UVCs, sequences, checkers, and coverage models; strong understanding of OOP concepts in verification.
- HVL: SystemVerilog (UVM), SystemC (preferred). HDL: Verilog, SystemVerilog.
- Leadership & collaboration: mentor and guide junior verification engineers; drive closure for IP- and SoC-level deliverables; strong written and verbal communication skills; proven ability to plan, prioritize, and execute effectively.
- Debugging & architecture knowledge: excellent debug skills across SoC architecture, VIP integration, and verification flows.

Responsibilities

AI & Multimedia (AV)
- Develop, optimize, and deploy AI models for audio and video applications, with a strong focus on inference efficiency and performance optimization across NPUs, GPUs, and CPUs.
- Perform model evaluation, quantization, and compression to enable fast and robust inference on embedded hardware.
- Collaborate with cross-functional R&D, systems, and integration teams for system use-case verification and commercialization support.
- Evaluate system performance; debug and optimize for robustness and efficiency.
- Participate in industry benchmarking and trend analysis; introduce state-of-the-art architectural and technical innovations.

ASIC / SoC Verification
- Lead and contribute to feature, core, and subsystem verification during ASIC design and development phases through RTL and gate-level simulations.
- Collaborate with the design team to define verification requirements, ensuring functional, performance, and power correctness.
- Develop and execute comprehensive test plans and drive verification closure.
- Create and maintain SystemVerilog/UVM testbenches, assertions, and functional coverage models.
- Implement and enhance automation flows to improve verification efficiency.
- Participate in debug activities throughout the development cycle.
- Apply ASIC expertise to define, model, optimize, verify, and validate IP (block/SoC) development for high-performance, low-power products.
- Collaborate with software and hardware architecture teams to develop strategies meeting system-level requirements.
- Evaluate complete design flows from RTL through synthesis, place-and-route, timing, and power usage.
- Write detailed technical documentation for verification methodologies, flows, and deliverables.

Contact: Uday Bhaskar, Mulya Technologies ("Mining the Knowledge Community")
Email: muday_bhaskar@yahoo.com

Posted 2 days ago

Apply

0 years

0 Lacs

Bengaluru East, Karnataka, India

On-site

Job Overview: As a Lead Computer Vision Engineer, you will lead the development and deployment of cutting-edge computer vision models and solutions for a variety of applications, including image classification, object detection, segmentation, and more. You will work closely with cross-functional teams to implement advanced computer vision algorithms, ensure the integration of AI solutions into products, and help guide the research and innovation of next-generation visual AI technologies.

Technical Skills:
- Deep learning frameworks: proficiency in TensorFlow, PyTorch, or other deep learning libraries.
- Computer vision tools: expertise in OpenCV, Dlib, and other image processing libraries.
- Model deployment: experience deploying models to production using platforms such as AWS, Google Cloud, or NVIDIA Jetson (for edge devices).
- Algorithms: strong understanding of core computer vision techniques like image classification, object detection (YOLO, Faster R-CNN), image segmentation (U-Net), and feature extraction.
- Programming languages: proficient in Python, C++, and other relevant languages for computer vision tasks.
- Data handling: experience working with large datasets, data augmentation, and preprocessing techniques.
- Optimization: skills in model optimization techniques such as pruning, quantization, and hardware acceleration (e.g., using GPUs or TPUs).
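Pruning, listed among the optimization skills above, usually starts with magnitude pruning: zero out the fraction of weights with the smallest absolute values, leaving a sparse model. A hedged single-layer sketch with invented weights (real pipelines would use a framework utility such as PyTorch's pruning module, often followed by fine-tuning):

```python
def magnitude_prune(weights, sparsity):
    # Zero out the `sparsity` fraction of weights with the smallest |value|.
    n_prune = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(by_magnitude[:n_prune])  # indices of the smallest weights
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]

layer = [0.10, -0.50, 0.05, 0.90]            # toy layer weights
sparse = magnitude_prune(layer, sparsity=0.5)  # → [0.0, -0.5, 0.0, 0.9]
```

The large-magnitude weights survive; hardware or runtimes that exploit sparsity can then skip the zeroed entries.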
- Strong working experience in an Agile environment
- Experience working with and understanding of ETL/ELT and data load processes
- Knowledge of cloud infrastructure and data source integrations
- Knowledge of relational databases
- Self-motivated; able to work independently as well as being a team player
- Excellent analytical and problem-solving skills
- Ability to handle and respond to multiple stakeholders and queries
- Ability to prioritize tasks and update key stakeholders
- Strong client service focus and willingness to respond to queries and provide deliverables within prompt timeframes

Posted 3 days ago

Apply

3.0 years

10 - 11 Lacs

Gurgaon

On-site

Job Description
We aim to bring about a new paradigm in medical image diagnostics, providing intelligent, holistic, ethical, explainable, and patient-centered care. We are looking for innovative problem solvers: people who can empathize with the consumer, understand business problems, and design and deliver intelligent products; people who are looking to extend artificial intelligence into unexplored areas. Your primary focus will be applying deep learning and artificial intelligence techniques to the domain of medical image analysis.

Responsibilities
- Select features and build and optimize classifier engines using deep learning techniques.
- Understand the problem and apply suitable image processing techniques.
- Use techniques from artificial intelligence/deep learning to solve supervised and unsupervised learning problems.
- Understand and design solutions for complex problems related to medical image analysis using deep learning, object detection, and image segmentation.
- Recommend and implement best practices around the application of statistical modeling.
- Create, train, test, and deploy various neural networks to solve complex problems.
- Develop and implement solutions to fit business problems, which may include applying algorithms from a standard statistical tool, deep learning, or custom algorithm development.
- Understand the requirements and design solutions and architecture in accordance with them.
- Participate in code reviews, sprint planning, and Agile ceremonies to drive high-quality deliverables.
- Design and implement scalable data science architectures for training, inference, and deployment pipelines.
- Ensure code quality, readability, and maintainability by enforcing software engineering best practices within the data science team.
- Optimize models for production, including quantization, pruning, and latency reduction for real-time inference.
- Drive the adoption of versioning strategies for models, datasets, and experiments (e.g., using MLflow, DVC).
- Contribute to the architectural design of data platforms to support large-scale experimentation and production workloads.

Skills and Qualifications
- Strong software engineering skills in Python (or other languages used in data science) with emphasis on clean code, modularity, and testability.
- Excellent understanding of, and hands-on experience with, deep learning techniques such as ANN, CNN, RNN, LSTM, Transformers, VAEs, etc.
- Experience with the TensorFlow or PyTorch framework in building, training, testing, and deploying neural networks (required).
- Experience solving problems in the domain of computer vision.
- Knowledge of data, data augmentation, data curation, and synthetic data generation.
- Ability to understand the complete problem and design the solution that best fits all the constraints.
- Knowledge of common data science and deep learning libraries and toolkits such as Keras, Pandas, scikit-learn, NumPy, SciPy, OpenCV, etc.
- Good applied statistical skills, such as distributions, statistical testing, and regression.
- Exposure to Agile/Scrum methodologies and collaborative development practices.
- Experience with the development of RESTful APIs; knowledge of libraries like FastAPI and the ability to apply them to deep learning architectures is essential.
- Excellent analytical and problem-solving skills with a good attitude, keen to adapt to evolving technologies.
- Experience with medical image analysis is an advantage.
- Experience designing and building ML architecture components (e.g., feature stores, model registries, inference servers).
- Solid understanding of software design patterns, microservices, and cloud-native architectures.
- Expertise in model optimization techniques (e.g., ONNX conversion, TensorRT, model distillation).

Education: BE/B.Tech; MS/M.Tech is a bonus
Experience: 3+ years
Job Type: Full-time
Pay: ₹1,000,000.00 - ₹1,100,000.00 per year

Application Questions:
- Do you have strong software engineering skills in Python (or other languages used in data science) with emphasis on clean code, modularity, and testability?
- Do you have an excellent understanding of, and hands-on experience with, deep learning techniques such as ANN, CNN, RNN, LSTM, Transformers, VAEs, etc.?
- Do you have knowledge of data, data augmentation, data curation, and synthetic data generation?
- Do you have knowledge of common data science and deep learning libraries and toolkits such as Keras, Pandas, scikit-learn, NumPy, SciPy, OpenCV, etc.?
- Do you have experience with the development of RESTful APIs?

Experience: Data science: 3 years (required); AI/ML: 3 years (required)
Work Location: In person

Posted 4 days ago

Apply

0 years

0 Lacs

chennai, tamil nadu, india

On-site

We’re looking for a hands-on full-stack engineer with deep backend expertise who can take cutting-edge AI platforms to the next level: pixel-perfect UIs, production-grade model-inference services, agentic AI workflows, and seamless integration with third-party LLMs and NLP tooling. Key Responsibilities Build core backend enhancements including APIs, security (OAuth2/JWT, rate-limiting, SecretManager), and observability (structured logging, tracing) Add CI/CD pipelines, implement test automation, configure health checks, and create SLO dashboards Develop polished UI interfaces using React.js/Next.js, Redux/Context, Tailwind, MUI, custom CSS, shadcn, and Axios Design LLM and agentic services by creating micro/mini-services that host and route to OpenAI, Anthropic, local HF models, embeddings, and RAG pipelines Implement autonomous and recursive agents that orchestrate multi-step chains using tools, memory, and planning Spin up GPU/CPU inference servers behind an API gateway for model-inference infrastructure Optimize throughput with batching, streaming, quantization, and caching using Redis/pgvector Own the NLP stack by leveraging transformers for classification, extraction, and embedding generation Build data pipelines that integrate aggregated business metrics with model telemetry for analytics Mentor juniors to support learning and professional development Tech You’ll Touch Full-stack/Backend, Infra Python (or Node.js), FastAPI, Starlette, Pydantic Async SQLAlchemy, Postgres, Alembic, pgvector Docker, Kubernetes or ECS/Fargate - AWS (or) GCP Redis/RabbitMQ/Celery (jobs & caching) Prometheus, Grafana, OpenTelemetry If you are full-stack: React.js/Next.js/shadcn/Tailwind CSS/MUI AI/NLP HuggingFace Transformers, LangChain / LlamaIndex, Torch / TensorRT OpenAI, Anthropic, Azure OpenAI, Cohere APIs Vector search (Pinecone, Qdrant, pgvector) Tooling: Pytest, GitHub Actions, and Terraform/CDK preferred Why does this role matter? We are growing. 
We have projects & products with different challenges. This role will expand into learning & contribution in parallel, so we succeed together. If you are an engineer looking for the next leap in challenges to set your career on a rocket trajectory, this is an apt role. You will work in the Founder’s office, replicate the founder, and grow organically. You will close these gaps while leading all future AI service development. Hiring Process Short call → assignment → live coding/system design (1.5 hours) → team fit (30 minutes) → offer About Company: C4Scale builds innovative products for startups, enterprises, and planet-scale companies, taking systems from zero to MVP and from MVP to scale. Our expertise spans B2B SaaS products, WebApps, AI models, chatbots, AI-driven applications, and LLM-based solutions, developed for multiple global clients. Recognized as one of the “10 Most Promising SaaS Startups – 2023” by CIOTechOutlook magazine, we take pride in driving impactful solutions. Our founder previously led data at a leading ride-hailing super-app serving over 300 million consumers, enhancing mobility experiences across Southeast Asia. With 10+ AI and SaaS projects delivered across the USA, Ireland, Saudi Arabia, Indonesia, and India, C4Scale is building the future of intelligent products, leveraging deep learning, generative AI, cloud SaaS, and product engineering to deliver innovation at speed and scale.
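The response caching listed among the throughput optimizations above boils down to keying completed inferences by a hash of the request. A minimal in-process sketch follows; the `InferenceCache` class and `fake_llm` function are hypothetical stand-ins for a real Redis client and model call, invented for illustration.

```python
import hashlib

class InferenceCache:
    """In-memory stand-in for a Redis-backed inference response cache."""
    def __init__(self):
        self._store = {}

    def _key(self, model: str, prompt: str) -> str:
        # Hash model+prompt so the key is fixed-size, like a Redis key.
        return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get_or_compute(self, model, prompt, compute):
        key = self._key(model, prompt)
        if key not in self._store:
            self._store[key] = compute(prompt)  # cache miss: run the model
        return self._store[key]

calls = []
def fake_llm(prompt):
    calls.append(prompt)       # record how often the "model" actually runs
    return prompt.upper()

cache = InferenceCache()
cache.get_or_compute("demo", "hello", fake_llm)
cache.get_or_compute("demo", "hello", fake_llm)
print(len(calls))  # → 1: the second request is served from cache
```

With Redis the `_store` dict becomes `SET`/`GET` calls with a TTL, so repeated prompts never hit the GPU.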

Posted 4 days ago

Apply

3.0 years

6 Lacs

puducherry

On-site

Job Specification: AI Platform Engineer About the Role We are seeking an AI Platform Engineer to build and scale the infrastructure that powers our production AI services. You will take cutting-edge models—ranging from speech recognition (ASR) to large language models (LLMs)—and deploy them into highly available, developer-friendly APIs. You will be responsible for creating the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimizing models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference. Responsibilities ● API Development ○ Design, build, and maintain production-ready APIs for speech, language, and other AI models. ○ Provide SDKs and documentation to enable easy developer adoption. ● Model Deployment ○ Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems. ○ Optimize inference pipelines for low-latency, high-throughput workloads. ● Scalability & Reliability ○ Architect infrastructure for handling large-scale, concurrent inference requests. ○ Implement monitoring, logging, and auto-scaling for deployed services. ● Collaboration ○ Work with research teams to productionize new models. ○ Partner with application teams to deliver AI functionality seamlessly through APIs. ● DevOps & Infrastructure ○ Automate CI/CD pipelines for models and APIs. ○ Manage GPU-based infrastructure in cloud or hybrid environments. Requirements ● Core Skills ○ Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services. ○ Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar. ○ Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.). ● Infrastructure ○ Experience with Docker, Kubernetes, and managing GPU-accelerated workloads. 
○ Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming). ○ Cloud experience (AWS, GCP, Azure). ● Bonus ○ Experience with model optimization (quantization, distillation, TensorRT, ONNX). ○ Exposure to MLOps tools for deployment and monitoring. Job Types: Full-time, Permanent Pay: From ₹50,000.00 per month Experience: total work: 3 years (Preferred) Work Location: In person
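Server-side batching, central to the high-throughput inference this role describes, amounts to grouping pending requests into bounded batches before a single model invocation. A minimal order-preserving sketch (the function name and request payloads are invented; Triton's dynamic batcher does this internally, with a time window added):

```python
def make_batches(requests, max_batch_size):
    """Group pending requests into batches no larger than max_batch_size,
    preserving arrival order -- the core of server-side micro-batching."""
    batches = []
    for i in range(0, len(requests), max_batch_size):
        batches.append(requests[i:i + max_batch_size])
    return batches

pending = ["req1", "req2", "req3", "req4", "req5"]
print(make_batches(pending, 2))  # → [['req1', 'req2'], ['req3', 'req4'], ['req5']]
```

Each batch is then padded to a common shape and run as one forward pass, trading a little latency per request for much higher GPU utilization.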

Posted 5 days ago

Apply

6.0 years

3 - 4 Lacs

hyderābād

On-site

About Providence Providence, one of the US’s largest not-for-profit healthcare systems, is committed to high quality, compassionate healthcare for all. Driven by the belief that health is a human right and the vision, ‘Health for a better world’, Providence and its 121,000 caregivers strive to provide everyone access to affordable quality care and services. Providence has a network of 51 hospitals, 1,000+ care clinics, senior services, supportive housing, and other health and educational services in the US. Providence India is bringing to fruition the transformational shift of the healthcare ecosystem to Health 2.0. The India center will have focused efforts around healthcare technology and innovation, and play a vital role in driving digital transformation of health systems for improved patient outcomes and experiences, caregiver efficiency, and running the business of Providence at scale. Why Us? Best-in-class Benefits Inclusive Leadership Reimagining Healthcare Competitive Pay Supportive Reporting Relation How is this team contributing to the vision of Providence? Ensure a continual and seamless service for customers regardless of which services they access. What will you be responsible for? End-to-end development and quality of solutions and services in the conversational & Agentic AI capability. Understanding and implementing advanced techniques in Agentic AI & Conversational AI Frameworks. Introduce best practices in AI Frameworks. What would your day look like? Lead the design and development of Agentic AI Frameworks, models and systems, ensuring high performance and scalability. Strong Python, REST API, and data-structures skills. Fine-tune and optimize pre-trained large language models (LLMs) such as GPT, LLaMA, Falcon, and Mistral. Implement and manage model quantization techniques (e.g., GPTQ, AWQ, 8-bit/4-bit models) to enhance model efficiency. Utilize LangChain, Dagster, etc. for orchestration and memory management in agentic applications. 
Conduct thorough model evaluations using metrics such as BLEU, ROUGE, and perplexity to ensure model quality and performance. Develop and maintain systems for intent recognition, entity extraction, and generative dialogue management. Stay updated with the latest advancements in NLP and AI technologies and apply them to ongoing projects. Mentor and guide junior engineers, fostering a culture of continuous learning and innovation. Who are we looking for? Minimum 6 years of experience in Programming Languages & AI Platforms. Strong understanding of deep learning fundamentals, including LSTM, RNN, ANN, POS tagging, and Transformer models. Strong understanding and proficiency in Agentic AI Frameworks. Proven experience in fine-tuning pre-trained LLMs (e.g., GPT, LLaMA, Falcon, Mistral) and quantization. Knowledge of model quantization techniques (GPTQ, AWQ, 8-bit/4-bit models). Proficiency in using orchestration frameworks, state and memory management. Experience with model evaluation metrics such as BLEU, ROUGE, and perplexity. Understanding of intent recognition, entity extraction, and generative dialogue management systems. Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment. Strong communication skills and the ability to convey complex technical concepts to non-technical stakeholders. Good understanding of Azure AI Foundry, Azure AI Search, and indexing. Knowledge of Kubernetes is good to have. Knowledge of Microsoft Event Hubs or Kafka technologies. Experience with load testing and unit testing tools. Proficient understanding of source control and code versioning tools such as Git, Azure DevOps, etc. Familiarity with CI/CD and DevOps. 
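Perplexity, listed among the evaluation metrics above, is the exponential of the average negative log-likelihood per token. A minimal sketch with invented token probabilities:

```python
import math

def perplexity(token_probs):
    """PPL = exp(-(1/N) * sum(log p_i)); lower is better.
    token_probs are the probabilities the model assigned to the
    tokens that actually occurred."""
    n = len(token_probs)
    nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(nll)

# A model that assigns probability 0.25 to every token has perplexity 4:
# it is as uncertain as a uniform choice among 4 options.
print(round(perplexity([0.25, 0.25, 0.25, 0.25]), 6))  # → 4.0
```

In practice the per-token probabilities come from the model's softmax outputs over a held-out corpus.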
Providence’s vision to create ‘Health for a Better World’ aids us to provide a fair and equitable workplace for all in our employment, whether temporary, part-time or full time, and to promote individuality and diversity of thought and background, and acknowledge its role in the organization’s success. This makes us committed towards equal employment opportunities, regardless of race, religion or belief, color, ancestry, disability, marital status, gender, sexual orientation, age, nationality, ethnic origin, pregnancy, or related needs, mental or sensory disability, HIV Status, or any other category protected by applicable law. In furtherance to our mission in building a more inclusive and equitable environment, we shall, from time to time, undertake programs to assist, uplift and empower underrepresented groups including but not limited to Women, PWD (Persons with Disabilities), LGTBQ+ (Lesbian, Gay, Transgender, Bisexual or Queer), Veterans and others. We strive to address all forms of discrimination or harassment and provide a safe and confidential process to report any misconduct. Contact our Integrity hotline also, read our Code of Conduct.

Posted 5 days ago

Apply

4.0 - 9.0 years

30 - 45 Lacs

hyderabad

Work from Office

Role Overview: The Staff Engineer will be responsible for architecting and implementing advanced quantization algorithms for edge AI applications. You will lead technical initiatives, mentor junior team members, and drive continuous improvement in model compression and optimization techniques for LLMs and other deep learning models. Key Responsibilities: Architectural Leadership: o Design and develop robust quantization strategies and algorithms for AI inference on edge devices. o Lead system-level design discussions and collaborate closely with hardware and research teams. Mentorship & Code Review: o Mentor mid-level and junior engineers, providing technical guidance and best practices. o Conduct thorough code reviews and ensure high standards of quality and performance. Innovation & Optimization: o Stay abreast of the latest research in model quantization and compression, and drive the adoption of innovative techniques. o Develop and maintain performance benchmarks, and continuously optimize algorithms for low latency and high energy efficiency. Cross-Functional Collaboration: o Work with the Quantizer Group Manager and Tech Lead to align technical roadmaps with product objectives. o Participate in regular strategy sessions to set technical direction and priorities. Qualifications: Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field (Ph.D. is a plus). 5-8+ years of industry experience in deep learning, model optimization, or related areas. Demonstrated experience with quantization techniques, LLM optimization, and software development using Python/C++. Strong problem-solving skills and a passion for innovation in edge AI technologies. What We Offer: An opportunity to work on pioneering edge AI technologies that redefine the future of real-time inference. A collaborative environment where innovation is at the core of our culture. 
Competitive compensation, comprehensive benefits, and significant opportunities for professional growth.
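The quantization strategies this role centers on usually start from symmetric per-tensor post-training quantization: scale floats into the int8 range and round. A pure-Python sketch with invented weights (real pipelines work per-channel, calibrate on data, and handle outliers, which is where methods like GPTQ and AWQ come in):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats in
    [-max_abs, max_abs] onto integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate floats; error per weight is at most scale/2.
    return [qi * scale for qi in q]

w = [1.0, -0.4, 0.2]
q, scale = quantize_int8(w)
print(q)  # → [127, -51, 25]
```

Storing `q` plus one `scale` per tensor is what shrinks a 32-bit model by roughly 4x while keeping reconstruction error bounded by half a quantization step.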

Posted 5 days ago

Apply

3.0 - 5.0 years

0 Lacs

gurugram, haryana, india

On-site

Job Description We aim to bring about a new paradigm in medical image diagnostics: providing intelligent, holistic, ethical, explainable, and patient-centric care. We are looking for innovative problem solvers. We want people who can empathize with the consumer, understand business problems, and design and deliver intelligent products. People who are looking to extend artificial intelligence into unexplored areas. Your primary focus will be in applying deep learning and artificial intelligence techniques to the domain of medical image analysis. Responsibilities Selecting features, building, and optimizing classifier engines using deep learning techniques. Understanding the problem and applying suitable image processing techniques. Use techniques from artificial intelligence/deep learning to solve supervised and unsupervised learning problems. Understanding and designing solutions for complex problems related to medical image analysis by using Deep Learning/Object Detection/Image Segmentation. Recommend and implement best practices around the application of statistical modeling. Create, train, test, and deploy various neural networks to solve complex problems. Develop and implement solutions to fit business problems which may include applying algorithms from a standard statistical tool, deep learning or custom algorithm development. Understanding the requirements and designing solutions and architecture in accordance with them. Participate in code reviews, sprint planning, and Agile ceremonies to drive high-quality deliverables. Design and implement scalable data science architectures for training, inference, and deployment pipelines. Ensure code quality, readability, and maintainability by enforcing software engineering best practices within the data science team. Optimize models for production, including quantization, pruning, and latency reduction for real-time inference. 
Drive the adoption of versioning strategies for models, datasets, and experiments (e.g., using MLflow, DVC). Contribute to the architectural design of data platforms to support large-scale experimentation and production workloads. Skills and Qualifications Strong software engineering skills in Python (or other languages used in data science) with an emphasis on clean code, modularity, and testability. Excellent understanding of, and hands-on experience with, Deep Learning techniques such as ANNs, CNNs, RNNs, LSTMs, Transformers, VAEs, etc. Must have experience with the TensorFlow or PyTorch framework for building, training, testing, and deploying neural networks. Experience solving problems in the domain of Computer Vision. Knowledge of data, data augmentation, data curation, and synthetic data generation. Ability to understand the complete problem and design the solution that best fits all the constraints. Knowledge of common data science and deep learning libraries and toolkits such as Keras, Pandas, scikit-learn, NumPy, SciPy, OpenCV, etc. Good applied statistical skills, such as distributions, statistical testing, and regression. Exposure to Agile/Scrum methodologies and collaborative development practices. Experience with the development of RESTful APIs; knowledge of libraries like FastAPI and the ability to apply them to deep learning architectures is essential. Excellent analytical and problem-solving skills, a good attitude, and a keenness to adapt to evolving technologies. Experience with medical image analysis will be an advantage. Experience designing and building ML architecture components (e.g., feature stores, model registries, inference servers). Solid understanding of software design patterns, microservices, and cloud-native architectures. Expertise in model optimization techniques (e.g., ONNX conversion, TensorRT, model distillation). Familiarity with Triton. Education: BE/B Tech; MS/M Tech will be a bonus. Experience: 3-5 Years
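The pruning mentioned among the production optimizations above can be illustrated with global magnitude pruning: zero out the weights with the smallest absolute values. A minimal sketch; the weights and sparsity level are invented, and real pruning is followed by fine-tuning to recover accuracy.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest
    absolute value -- the simplest form of unstructured pruning."""
    k = int(len(weights) * sparsity)          # number of weights to drop
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.1]
# Prune half the weights: the three smallest magnitudes become zero.
print(magnitude_prune(w, 0.5))  # → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

The resulting sparse tensor compresses well and, on hardware with sparsity support, skips the zeroed multiplications entirely.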

Posted 6 days ago

Apply

4.0 - 6.0 years

0 Lacs

gurugram, haryana, india

On-site

About Us At Digilytics, we build and deliver easy-to-use AI products to the secured lending and consumer industry sectors. In an ever-crowded world of clever technology solutions looking for a problem to solve, our solutions start with a keen understanding of what creates and what destroys value in our clients' business. Founded by Arindom Basu, the leadership of Digilytics is deeply rooted in leveraging disruptive technology to drive profitable business growth. With over 50 years of combined experience in technology-enabled change, the Digilytics leadership is focused on building a values-first firm that will stand the test of time. We are currently focused on developing a product, Digilytics RevEL, to revolutionise loan origination for secured lending covering mortgages, motor and business lending. The product leverages the latest AI techniques to process loan applications and loan documents to deliver improved customer and colleague experience, while improving productivity and throughput and reducing processing costs. About The Role Digilytics is pioneering the development of intelligent mortgage solutions in International and Indian markets. We are looking for a Data Scientist with strong NLP and computer vision expertise. We are looking for experienced data scientists, who have the aspirations and appetite for working in a start-up environment, and with relevant industry experience to make a significant contribution to our Digilytics™ platform and solutions. The primary focus would be to apply machine learning techniques for data extraction from documents in a variety of formats, including scans and handwritten documents. Responsibilities Develop a learning model for high-accuracy extraction and validation of documents, e.g. 
in the mortgage industry. Work with state-of-the-art language modelling approaches such as transformer-based architectures while integrating capabilities across NLP, computer vision, and machine learning to build robust multi-modal AI solutions. Understand the Digilytics™ vision and help in creating and maintaining a development roadmap. Interact with clients and other team members to understand client-specific requirements of the platform. Contribute to the platform development team and deliver platform releases in a timely manner. Liaise with multiple stakeholders and coordinate with our onshore and offshore entities. Evaluate and compile the required training datasets from internal and public sources and contribute to the data pre-processing phase. Expected And Desired Skills Experience with either of the following deep learning frameworks: PyTorch (preferred) or TensorFlow. Good understanding of designing, developing, and optimizing Large Language Models (LLMs), with hands-on experience in leveraging cutting-edge advancements in NLP and generative AI. Skilled in customizing LLMs for domain-specific applications through advanced fine-tuning, prompt engineering, and optimization strategies such as LoRA, quantization, and distillation. Knowledge of model versioning, serving, and monitoring using tools like MLflow, FastAPI, Docker, and vLLM. 
Python used for analytics applications, including data pre-processing, EDA, statistical analysis, machine learning model performance evaluation, and benchmarking. Good scripting and programming skills to integrate with other external applications. Good interpersonal skills and the ability to communicate and explain models. Ability to work in unfamiliar business areas and to use your skills to create solutions. Ability to both work in and lead a team, and to deliver and accept peer review. Flexible approach to working environment and hours. Experience Between 4-6 years of relevant experience. Hands-on experience with Python and/or R; Machine Learning; Deep Learning (desirable). End-to-end development of a Deep Learning based model covering model selection, data preparation, training, hyper-parameter optimization, evaluation, and performance reporting. Proven experience working in both smaller and larger organisations with multicultural exposure. Domain and industry experience serving customers in one or more of these industries: Financial Services, Professional Services, and other Consumer Industries. Education Background A Bachelor's degree in fields of study such as Computer Science, Mathematics, Statistics, and Data Science with strong programming content from a leading institute. An advanced degree such as a Master's or PhD is an advantage. (ref:hirist.tech)
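The LoRA technique named above adds a trainable low-rank update to a frozen weight matrix, W' = W + (alpha/r)·B·A. A pure-Python sketch of merging an adapter back into the base weight; all matrices, dimensions, and the alpha value are invented toy values.

```python
def matmul(A, B):
    # Plain-Python matrix multiply, adequate for this tiny sketch.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_merge(W, A, B, alpha):
    """Merge a LoRA adapter into the frozen base weight:
    W' = W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(A)                       # A is r x in_dim, B is out_dim x r
    delta = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]         # frozen 2x2 base weight
A = [[1.0, 2.0]]                     # rank-1 adapter factors, r = 1
B = [[0.5], [0.0]]
print(lora_merge(W, A, B, alpha=1.0))  # → [[1.5, 1.0], [0.0, 1.0]]
```

Because only A and B (2·r·d values instead of d²) are trained, fine-tuning touches a tiny fraction of the parameters, and the merge leaves inference cost unchanged.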

Posted 6 days ago

Apply

3.0 - 5.0 years

0 Lacs

bengaluru, karnataka, india

On-site

Description We are looking for a skilled and motivated Software Engineer with a strong foundation in C++, Python, Computer Vision, and Deep Learning, along with proven experience in Edge AI deployment using platforms such as OpenVINO, NVIDIA Jetson, and Qualcomm Snapdragon. This role is ideal for individuals passionate about implementing cutting-edge AI solutions, optimizing models for real-world performance, and deploying them on embedded or edge devices. You will be responsible for designing and developing efficient and scalable AI algorithms, implementing them in C++, and deploying them across various platforms including cloud, edge devices, and Android. The ideal candidate should be self-driven, capable of working independently, and eager to stay updated with the latest trends in AI and deep learning research. Key Responsibilities Read, interpret, and apply Deep Learning research papers to real-world problems. Design, implement, and optimize AI algorithms primarily in C++ with support in Python. Train, validate, and evaluate deep learning models for computer vision applications. Port and optimize AI models for deployment on Edge devices like NVIDIA Jetson, Intel OpenVINO, and Snapdragon-based platforms. Develop production-quality software with clean, reusable, and efficient code. Deploy AI solutions to edge devices and cloud infrastructure. Perform benchmarking and testing of AI models to ensure accuracy, speed, and robustness. Collaborate with cross-functional teams to integrate software components. Maintain code versioning and workflow using Git. Write high-quality technical documentation for models, APIs, and deployment processes. Work in a Linux CLI environment for development and deployment tasks. Demonstrate initiative in learning and adapting to new technologies as project needs evolve. Skills Required Programming Languages: Strong proficiency in C++ and Python. Core Concepts: Solid understanding of Algorithms and Data Structures. 
Deep Learning Frameworks: Experience with TensorFlow, PyTorch, or similar. Computer Vision: Experience in building and deploying vision-based AI models. Edge Deployment: Practical experience with OpenVINO, Jetson (JetPack, TensorRT), or Snapdragon Neural Processing SDK. Development Tools: Familiarity with Git, Linux CLI, and CMake. Optimization: Knowledge of performance profiling, code optimization, and model quantization/pruning. Documentation: Ability to prepare clear, structured technical documentation and reports. Candidate Profile 3-5 years of experience in AI/ML/Computer Vision-based product development. Strong analytical and problem-solving skills. Ability to work independently with minimal supervision. Excellent communication and team collaboration abilities. Passionate about AI, embedded systems, and real-time applications. (ref:hirist.tech)
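The benchmarking responsibilities above typically report latency percentiles rather than averages, since a single slow inference dominates user experience. A minimal nearest-rank sketch with invented latencies:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are at or below it."""
    s = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(s)))
    return s[rank - 1]

# Invented latencies; a single 90 ms outlier dominates the tail.
latencies = [12, 15, 11, 90, 14, 13, 16, 12, 14, 15]
print(percentile(latencies, 50), percentile(latencies, 95))  # → 14 90
```

Reporting p50 alongside p95/p99 makes tail regressions visible that a mean of ~21 ms would hide.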

Posted 6 days ago

Apply

5.0 years

0 Lacs

india

On-site

About the Role We are hiring an AI Engineer specializing in LLMs & Interactive AI to own the narrative engine, content generation, safety/moderation systems, and personalization logic for a next-generation consumer AI product. The role focuses on building lightweight, optimized language models that generate safe, structured, and engaging interactions, deployable on both edge devices and the cloud. Responsibilities Design and implement lightweight LLM pipelines (Phi-2, TinyLLaMA, Mistral-small, etc.) for structured, interactive outputs. Build branching narrative/interaction flows with deterministic continuation and constrained decoding. Develop moderation and safety layers: rule-based filters, toxicity detection (Detoxify), and fallback to external moderation APIs (e.g., GPT/Claude). Apply content filtering and vocabulary controls to enforce safe, compliant outputs. Implement heuristics-based personalization and progress tracking for adaptive experiences. Optimize models for offline-first inference (quantization, ONNX, GGUF/ggml) with cloud fallback when needed. Build internal tools for prompt/template management, content testing, and moderation review. Work closely with the Speech AI Engineer to integrate text generation ↔ voice pipeline seamlessly. Define observability metrics (latency, generation success, moderation outcomes). Qualifications 5+ years of professional experience in applied AI/NLP, with at least 3 years focused on LLMs or conversational AI. Strong background in language models: fine-tuning, LoRA, adapter methods, prompt engineering. Proficiency in PyTorch; experience serving models with FastAPI/Docker. Experience with edge optimization: quantization, batching, caching. Knowledge of structured AI outputs (JSON schemas, branching graphs). Hands-on with content moderation systems and multi-layer safety approaches. Understanding of privacy and compliance requirements for consumer AI apps. 
Nice to Have Experience with dialogue systems or conversational assistants. Exposure to personalization/recommendation systems. Familiarity with educational/narrative applications (general, not product-specific). Graduated from IITs/NITs/Tier 1 Colleges
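The constrained decoding mentioned in the responsibilities can be reduced to masking the model's next-token choice to an allowed vocabulary. A toy sketch: the vocabulary, logits, and allowed set are invented, and real systems apply the mask to full logit tensors inside the sampling loop.

```python
def constrained_argmax(logits, vocab, allowed):
    """Pick the highest-logit token whose surface form is in `allowed` --
    the essence of vocabulary-constrained decoding."""
    best_tok, best_score = None, float("-inf")
    for tok, score in zip(vocab, logits):
        if tok in allowed and score > best_score:
            best_tok, best_score = tok, score
    return best_tok

vocab = ["yes", "no", "maybe", "unsafe_word"]
logits = [1.2, 0.4, 2.0, 3.5]
# The raw argmax would be "unsafe_word"; the constraint forces "maybe".
print(constrained_argmax(logits, vocab, allowed={"yes", "no", "maybe"}))  # → maybe
```

The same masking idea, applied per step with a grammar or JSON schema deciding the allowed set, is how structured outputs are enforced.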

Posted 6 days ago

Apply

5.0 years

0 Lacs

india

On-site

About the Role We are hiring an AI Engineer specializing in Speech & Voice Systems to own the design and optimization of speech recognition, text-to-speech, and wake-word detection pipelines for a next-generation consumer AI product. The role focuses on delivering natural, low-latency voice interactions that work reliably on constrained hardware (edge devices) and scale seamlessly with cloud infrastructure. Responsibilities Develop and optimize speech-to-text (STT) systems (e.g., Whisper, Vosk) for low-latency recognition. Implement and enhance text-to-speech (TTS) systems (Coqui, Piper, VITS), including multi-voice support and style variations. Prototype and refine wake-word detection and integrate noise suppression, VAD, AGC, and audio normalization for robust performance. Apply model optimization techniques (quantization, ONNX, CTranslate2, GGUF/ggml) for offline/CPU-first inference. Design caching, streaming, and batching strategies to meet real-time performance targets (<2s response). Collaborate with backend engineers to expose APIs (/stt, /tts, /wakeword) for integration into apps. Monitor performance via observability dashboards (Prometheus/Grafana, OpenTelemetry). Ensure privacy-first design: local-first processing with optional cloud fallback. Qualifications 5+ years of professional experience in applied AI, with at least 3 years focused on speech technologies. Proven experience building production-grade STT/TTS systems. Strong knowledge of audio/DSP fundamentals: resampling, denoising, VAD, loudness normalization. Proficiency in Python (PyTorch); experience with FastAPI/Docker for model serving. Familiarity with wake-word frameworks (Porcupine, Snowboy) and streaming audio integration. Track record of delivering low-latency speech systems optimized for edge devices. IIT/NIT/Tier 1 Colleges Preferred Nice to Have Experience with multilingual voice systems. Familiarity with real-time streaming architectures (WebRTC, gRPC). 
Exposure to IoT/edge deployment.
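The VAD component in the audio front-end described above can be illustrated with the simplest energy-based detector: flag frames whose mean energy crosses a threshold. A toy sketch; the signal, frame size, and threshold are invented, and production systems use trained VADs with hangover logic.

```python
def vad(samples, frame_size, threshold):
    """Flag each frame as speech (True) when its mean energy exceeds
    the threshold -- a minimal energy-based voice activity detector."""
    flags = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        energy = sum(x * x for x in frame) / len(frame)  # mean power
        flags.append(energy > threshold)
    return flags

# Quiet frame, loud (speech-like) frame, quiet frame.
signal = [0.01, -0.02, 0.01, 0.6, -0.7, 0.5, 0.02, 0.01, -0.01]
print(vad(signal, frame_size=3, threshold=0.05))  # → [False, True, False]
```

Gating the STT model on these flags is what keeps an always-listening device from transcribing silence.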

Posted 6 days ago

Apply

3.0 years

0 Lacs

andhra pradesh, india

On-site

Key Responsibilities Develop, deploy, and maintain Python-based applications powered by LLMs and other Gen AI models (e.g., GPT-4, Claude, Gemini, Mistral). Implement RAG (Retrieval-Augmented Generation) pipelines using vector databases (e.g., Pinecone, FAISS, Weaviate). Build prompt templates, manage system prompts, and experiment with prompt engineering techniques for performance tuning. Integrate LLMs with APIs, microservices, databases, and user-facing apps. Optimize latency, throughput, and cost-efficiency for inference workloads, including caching, batching, or quantization where applicable. Collaborate with ML and DevOps teams to enable CI/CD, model versioning, and monitoring for AI applications. Work with evaluation frameworks (e.g., LangChain, Weights & Biases, TruLens) to measure output quality (e.g., relevance, coherence, safety). Troubleshoot and enhance LLM behavior, including addressing hallucinations, bias, and task completion accuracy. Required Qualifications Bachelor's or Master's degree in Computer Science, AI/ML, Engineering, or a related field. 3+ years of experience in Python software development, preferably with AI/ML projects. Strong knowledge of LLMs, transformers, and modern NLP tools. Experience working with LLM APIs (OpenAI, Anthropic, Google Gemini, Cohere, etc.). Familiarity with vector databases (e.g., FAISS, Pinecone, Chroma) and semantic search principles. Hands-on experience with prompt engineering and LLM orchestration tools (LangChain, LlamaIndex, Haystack, etc.). Solid understanding of REST APIs, containers (Docker), and cloud platforms (AWS/GCP/Azure). Preferred Qualifications Experience with chatbot frameworks, agent-based LLM systems, or multi-agent orchestration. Understanding of tokenization, context window management, and cost optimization strategies for LLMs. Exposure to multimodal AI (e.g., text-to-image, speech-to-text, OCR) and computer vision libraries. 
Familiarity with evaluation techniques for AI output (BLEU, ROUGE, semantic similarity, etc.). Contributions to open-source LLM or Gen AI frameworks are a strong plus.
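The retrieval step of the RAG pipelines described above reduces to nearest-neighbour search over embeddings, which vector databases like FAISS or Pinecone accelerate at scale. A toy cosine-similarity sketch; the document ids and 2-d vectors are invented (real embeddings have hundreds of dimensions):

```python
import math

def cosine(u, v):
    # Cosine similarity: angle-based closeness, independent of vector length.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query_vec, docs, k=1):
    """Return the ids of the k docs most similar to the query --
    the retrieval step of a RAG pipeline, before prompting the LLM."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]

docs = [
    {"id": "refunds", "vec": [0.9, 0.1]},
    {"id": "shipping", "vec": [0.1, 0.9]},
]
print(retrieve([0.8, 0.2], docs, k=1))  # → ['refunds']
```

The retrieved passages are then pasted into the prompt as context, grounding the model's answer in the indexed documents.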

Posted 1 week ago

Apply

5.0 years

0 Lacs

ahmedabad, gujarat, india

On-site

Job purpose: Design, develop, and deploy end-to-end AI/ML systems, focusing on large language models (LLMs), prompt engineering, and scalable system architecture. Leverage technologies such as Java/Node.js/.NET to build robust, high-performance solutions that integrate with enterprise systems. Who You Are: Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field. PhD is a plus. 5+ years of experience in AI/ML development, with at least 2 years working on LLMs or NLP. Proven expertise in end-to-end system design and deployment of production-grade AI systems. Hands-on experience with Java/Node.js/.NET for backend development. Proficiency in Python and ML frameworks (TensorFlow, PyTorch, Hugging Face Transformers). Key Responsibilities: 1. Model Development & Training: Design, train, and fine-tune large language models (LLMs) for tasks such as natural language understanding, generation, and classification. Implement and optimize machine learning algorithms using frameworks like TensorFlow, PyTorch, or Hugging Face. 2. Prompt Engineering: Craft high-quality prompts to maximize LLM performance for specific use cases, including chatbots, text summarization, and question-answering systems. Experiment with prompt tuning and few-shot learning techniques to improve model accuracy and efficiency. 3. End-to-End System Design: Architect scalable, secure, and fault-tolerant AI/ML systems, integrating LLMs with backend services and APIs. Develop microservices-based architectures using Java/Node.js/.NET for seamless integration with enterprise applications. Design and implement data pipelines for preprocessing, feature engineering, and model inference. 4. Integration & Deployment: Deploy ML models and LLMs to production environments using containerization (Docker, Kubernetes) and cloud platforms (AWS/Azure/GCP). Build RESTful or GraphQL APIs to expose AI capabilities to front-end or third-party applications. 5. 
Performance Optimization: Optimize LLMs for latency, throughput, and resource efficiency using techniques like quantization, pruning, and model distillation. Monitor and improve system performance through logging, metrics, and A/B testing. 6. Collaboration & Leadership: Work closely with data scientists, software engineers, and product managers to align AI solutions with business objectives. Mentor junior engineers and contribute to best practices for AI/ML development. What will excite us: Strong understanding of LLM architectures and prompt engineering techniques. Experience with backend development using Java/Node.js (Express)/.NET Core. Familiarity with cloud platforms (AWS, Azure, GCP) and DevOps tools (Docker, Kubernetes, CI/CD). Knowledge of database systems (SQL, NoSQL) and data pipeline tools (Apache Kafka, Airflow). Strong problem-solving and analytical skills. Excellent communication and teamwork abilities. Ability to work in a fast-paced, collaborative environment. What will excite you: Lead AI innovation in a fast-growing, technology-driven organization. Work on cutting-edge AI solutions, including LLMs, autonomous AI agents, and Generative AI applications. Engage with top-tier enterprise clients and drive AI transformation at scale. Location: Ahmedabad
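The optimization techniques this posting names (quantization, pruning, model distillation) all trade a little accuracy for lower latency or memory. As a minimal sketch of the idea behind int8 weight quantization: map float32 weights to 8-bit integers with a scale factor, shrinking the footprint 4x. The layer size and random weights below are illustrative stand-ins; real toolchains (PyTorch, TensorRT, ONNX Runtime) quantize per layer or per channel.

```python
import numpy as np

rng = np.random.default_rng(42)
weights = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)

# Symmetric quantization: choose a scale so the max |weight| maps to 127.
scale = np.abs(weights).max() / 127.0
q_weights = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to measure the error the rounding introduced.
dq_weights = q_weights.astype(np.float32) * scale
max_err = np.abs(weights - dq_weights).max()

print(q_weights.nbytes, weights.nbytes)  # 4096 vs 16384 bytes: 4x smaller
print(max_err <= scale / 2 + 1e-8)       # rounding error bounded by scale/2
```

Pruning and distillation attack the same latency/throughput goals differently: pruning removes low-magnitude weights, distillation trains a smaller model to match a larger one's outputs.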

Posted 1 week ago

Apply

4.0 years

0 Lacs

noida, uttar pradesh, india

On-site

About RocketFrog.ai: RocketFrog.ai is an AI Studio for Business, delivering deep science and real-world impact through cutting-edge AI solutions. We specialize in Agentic AI, Deep Learning, and full-stack AI-first product development across Healthcare, Pharma, BFSI, and Hi-Tech industries. 🚀 Ready to take a Rocket Leap with Science?

Role Overview: We are seeking a high-potential Machine Learning Engineer with 2-4 years of experience and a strong academic background to join our AI innovation team. You'll design and implement ML models that drive intelligent automation and AI-first products for enterprise-grade use cases.

Key Responsibilities:
- Design, develop, and optimize machine learning models for NLP, computer vision, or multi-modal applications
- Build scalable pipelines for data preprocessing, training, and evaluation
- Translate academic research and ideas into efficient, production-ready implementations
- Research relevant papers to propose plausible approaches for solving business problems
- Apply optimization techniques such as quantization, pruning, and distillation
- Collaborate with product, engineering, and domain teams to align ML systems with business needs
- Stay current with the latest developments in ML frameworks, tooling, and deployment strategies

Required Skills & Expertise:
- Strong knowledge of Linear Algebra, Probability, and Statistics
- Machine Learning Proficiency: Hands-on experience with supervised, unsupervised, and deep learning methods (CNNs, RNNs, Transformers)
- Model Training Workflows: Familiarity with training-validation pipelines, k-fold cross-validation, early stopping, and hyperparameter tuning
- Evaluation Metrics: Experience evaluating models using confusion matrices, PR curves, F1 score, accuracy, and ROC-AUC
- Frameworks: Proficient in PyTorch with strong coding practices (clean, modular, testable)
- Tools: Experience with model libraries and hubs such as Hugging Face Transformers, TorchVision, etc.
- Data Handling: Ability to manage large datasets, custom data loaders, and data augmentation workflows
- Deployment: Experience with model packaging (TorchScript, ONNX) and serving (FastAPI, TorchServe)
- Experimentation & Reproducibility: Familiarity with tools like Weights & Biases, MLflow, or equivalent

Desirable Skills:
- Experience with seq2seq models, LLM fine-tuning, or multi-modal architectures
- Hands-on experience with knowledge distillation and other model compression techniques
- Exposure to Agentic AI or orchestration frameworks such as LangChain or LangGraph
- Understanding of vector databases, embedding stores, and RAG pipelines
- Contributions to open-source ML projects or academic research
- Cloud AI Platforms: Experience with Amazon SageMaker, Azure AI Services, Google Vertex AI, or Azure AI Foundry is preferred

What We Look For:
- Strong foundations in mathematics, algorithmic thinking, and ML system design
- Curiosity-driven mindset and a passion for solving complex problems
- Ability to convert ideas into scalable and efficient solutions
- Desire to learn and grow in a high-performance, innovation-first environment
- Clear communication and collaborative problem-solving skills

Required Education Background: B.Tech / M.Tech / MS from IISc, IITs, top NITs, BITS Pilani, IIIT-H, DTU, NSUT, or equivalent Tier-1 institutions.

Why Join RocketFrog.ai? 🚀 Be part of the growth journey of RocketFrog.ai! Work in an AI-driven company that is making a real business impact. Enjoy a dynamic, fast-paced environment with career growth opportunities. Apply now and be part of our journey to shape the future of AI 🚀
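The model-packaging skill listed above (TorchScript) can be sketched with a minimal round trip: script a module into a Python-free artifact, serialize it, reload it, and check the outputs match. The tiny classifier and its dimensions are illustrative assumptions, not a real production model.

```python
import io

import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 3))

    def forward(self, x):
        return self.net(x)

model = TinyClassifier().eval()
scripted = torch.jit.script(model)  # standalone artifact, loadable without the class definition

# Round-trip through an in-memory buffer; a file path works the same way.
buf = io.BytesIO()
torch.jit.save(scripted, buf)
buf.seek(0)
restored = torch.jit.load(buf)

x = torch.randn(2, 16)
with torch.no_grad():
    assert torch.allclose(model(x), restored(x))
```

ONNX export follows the same pattern via `torch.onnx.export`, producing a graph that runtimes outside PyTorch can serve.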

Posted 1 week ago

Apply

7.0 years

0 Lacs

hyderabad, telangana, india

On-site

We are seeking an experienced AI Architect to lead the design, development, and deployment of large-scale AI solutions. The ideal candidate will bridge the gap between business requirements and technical implementation, with deep expertise in generative AI and modern MLOps practices.

Key Responsibilities

AI Solution Design & Implementation:
- Architect end-to-end AI systems leveraging large language models and generative AI technologies
- Design scalable, production-ready AI applications that meet business objectives and performance requirements
- Evaluate and integrate LLM APIs from leading providers (OpenAI, Anthropic Claude, Google Gemini, etc.)
- Establish best practices for prompt engineering, model selection, and AI system optimization

Model Development & Fine-tuning:
- Fine-tune open-source models (Llama, Mistral, etc.) for specific business use cases
- Implement custom training pipelines and evaluation frameworks
- Optimize model performance, latency, and cost for production environments
- Stay current with the latest model architectures and fine-tuning techniques

Infrastructure & Deployment:
- Deploy and manage AI models at enterprise scale using containerization (Docker) and orchestration (Kubernetes)
- Build robust, scalable APIs using FastAPI and similar frameworks
- Design and implement MLOps pipelines for model versioning, monitoring, and continuous deployment
- Ensure high availability, security, and performance of AI systems in production

Business & Technical Leadership:
- Collaborate with stakeholders to understand business problems and translate them into technical requirements
- Provide technical guidance and mentorship to development teams
- Conduct feasibility assessments and technical due diligence for AI initiatives
- Create technical documentation, architectural diagrams, and implementation roadmaps

Required Qualifications

Experience:
- 7+ years of experience in machine learning engineering or data science
- Proven track record of delivering large-scale ML solutions

Technical Skills:
- Expert-level proficiency with LLM APIs (OpenAI, Claude, Gemini, etc.)
- Hands-on experience fine-tuning transformer models (Llama, Mistral, etc.)
- Strong proficiency in FastAPI, Docker, and Kubernetes
- Experience with ML frameworks (PyTorch, TensorFlow, Hugging Face Transformers)
- Proficiency in Python and modern software development practices
- Experience with cloud platforms (AWS, GCP, or Azure) and their AI/ML services

Core Competencies:
- Strong understanding of transformer architectures, attention mechanisms, and modern NLP techniques
- Experience with MLOps tools and practices (model versioning, monitoring, CI/CD)
- Ability to translate complex business requirements into technical solutions
- Strong problem-solving skills and architectural thinking

Preferred Qualifications:
- Experience with vector databases and retrieval-augmented generation (RAG) systems
- Knowledge of distributed training and model parallelization techniques
- Experience with model quantization and optimization for edge deployment
- Familiarity with AI safety, alignment, and responsible AI practices
- Experience in specific domains (finance, healthcare, legal, etc.)
- Advanced degree in Computer Science, AI/ML, or a related field
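The retrieval-augmented generation (RAG) systems named in the preferred qualifications center on one step: rank stored documents by embedding similarity to the query, then feed the top hits to the LLM as context. A minimal sketch of that ranking step, where the random vectors stand in for real embedding-model output and the corpus size and dimension are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(100, 384))  # 100 docs, 384-dim embeddings
query = rng.normal(size=384)

# Normalize so a dot product equals cosine similarity.
docs_n = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)

scores = docs_n @ query_n
top_k = np.argsort(scores)[::-1][:5]  # indices of the 5 most similar docs
print(top_k)
```

In production the brute-force dot product is replaced by an approximate-nearest-neighbor index in a vector database, which is what makes the scheme scale.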

Posted 1 week ago

Apply

0 years

0 Lacs

rewa, madhya pradesh, india

On-site

About Us: At Signet Arms, we are at the forefront of AI innovation, developing cutting-edge solutions to solve complex challenges. As part of our ongoing growth, we are looking for a skilled AI Engineer with expertise in Large Language Models (LLMs), AI agent deployment, and frontend deployment. If you have a passion for building and deploying AI-powered solutions with seamless integrations, this is your opportunity to make a real impact.

Job Description: We are seeking an AI Engineer who will develop wrappers for, fine-tune, and deploy AI agents using advanced LLMs (e.g., GPT, BERT). The ideal candidate will also have hands-on experience with frontend deployment, ensuring that our AI solutions integrate effectively with user interfaces for a smooth user experience. You will play a crucial role in bridging backend AI models with user-facing applications, making sure that everything runs seamlessly both behind the scenes and in the user interface.

Key Responsibilities:
- Develop wrappers for large language models (LLMs) to enable seamless integration with various applications.
- Fine-tune pre-trained LLMs to improve performance for specific tasks and business use cases.
- Deploy AI agents in production environments, ensuring scalability, reliability, and performance.
- Collaborate with cross-functional teams to identify requirements and design custom AI solutions.
- Optimize AI models for real-time performance and scalability.
- Collaborate with frontend developers to integrate AI-powered functionalities into user-facing applications.
- Work with UI/UX teams to ensure smooth deployment of AI-powered solutions on web and mobile platforms.
- Ensure seamless data flow between backend AI systems and frontend interfaces.
- Assist in frontend deployment activities using technologies such as React, Angular, or Vue.js.
- Research and experiment with new AI techniques and models to stay at the forefront of the field.
- Optimize deployment pipelines and automate workflows.
- Monitor AI agent performance, conduct evaluations, and implement improvements.
- Write and maintain detailed documentation of models, code, and deployment processes.
- Present findings and updates to both technical and non-technical stakeholders.

Required Qualifications:
- A Master's or Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
- Strong experience with large language models (LLMs) such as GPT, BERT, T5, etc.
- Strong proficiency in Python and experience with machine learning frameworks like TensorFlow, PyTorch, Hugging Face, or similar.
- Expertise in developing and fine-tuning machine learning models, particularly NLP models.
- Experience deploying machine learning models at scale using tools like Docker, Kubernetes, or cloud services (AWS, GCP, Azure).
- Familiarity with model optimization techniques (e.g., pruning, quantization, distillation).
- Strong frontend development experience with technologies such as React, Angular, or Vue.js.
- Experience integrating APIs and connecting frontend interfaces with backend AI systems.
- Strong understanding of web deployment processes and cloud-based infrastructures.
- Excellent problem-solving skills, with the ability to troubleshoot both frontend and backend issues.

Preferred Qualifications:
- A proven track record of publications in high-impact international conferences (e.g., NeurIPS, ICML, ACL) or journals.
- Experience with reinforcement learning or multi-agent systems.
- Familiarity with research methodologies and the ability to contribute to advancing the state of the art in AI.
- Experience with AI model interpretability and explainability.
- Contributions to open-source AI projects.
- Familiarity with containerization and orchestration technologies (Docker, Kubernetes) for both frontend and backend deployments.

Why Join Us?
- Be part of an innovative and forward-thinking team that is redefining the future of AI.
- Work on exciting, high-impact AI projects in an onsite, collaborative environment in Rewa, Madhya Pradesh.
- Opportunities for career growth and continuous learning in a fast-paced, dynamic field.
- Competitive salary and benefits package.
- Flexible work hours and the chance to work with cutting-edge technologies.

How to Apply: Please send your resume, cover letter, and any relevant publication links to info@signetarms.co. We are excited to learn more about how you can contribute to Signet Arms' vision!
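The "wrappers for large language models" responsibility in this posting usually means a thin class that hides provider details (prompt assembly, retries, response parsing) behind one interface, so the rest of the application never touches a vendor SDK directly. A minimal sketch, where `fake_backend` stands in for a real API client and every name is an illustrative assumption:

```python
from typing import Callable

class LLMWrapper:
    """Thin facade over an LLM backend so callers see one stable interface."""

    def __init__(self, backend: Callable[[str], str], system_prompt: str = ""):
        self.backend = backend
        self.system_prompt = system_prompt

    def generate(self, user_prompt: str) -> str:
        # Compose the final prompt; a real wrapper might also handle
        # retries, token limits, and structured-output parsing here.
        prompt = f"{self.system_prompt}\n{user_prompt}".strip()
        return self.backend(prompt)

def fake_backend(prompt: str) -> str:
    # Stand-in for a provider call such as an OpenAI or Hugging Face client.
    return f"echo: {prompt}"

llm = LLMWrapper(fake_backend, system_prompt="You are a helpful assistant.")
reply = llm.generate("Hello")
print(reply)
```

Swapping providers then means swapping the backend callable, leaving the frontend-facing API untouched.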

Posted 1 week ago

Apply

3.0 - 5.0 years

0 Lacs

new delhi, delhi, india

On-site

Role Overview: We need a passionate engineer to design, develop, and deploy advanced deep learning models and computer vision systems. You'll work in a fast-paced startup environment, handling multiple projects and making a significant impact on cutting-edge AI solutions.

Key Responsibilities:
- Develop deep learning models for image classification, object detection, segmentation, and tracking
- Build computer vision pipelines for real-time applications and automation systems
- Research and implement cutting-edge AI techniques, from research papers to production
- Deploy and optimize models in cloud environments (AWS, GCP, Azure)
- Work with large datasets and implement data augmentation techniques
- Collaborate with cross-functional teams to integrate AI solutions into products
- Monitor and maintain deployed models for optimal performance
- Mentor junior developers and contribute to team growth

Required Skills:
- 3-5 years of experience in deep learning/computer vision
- Expert Python proficiency with TensorFlow, PyTorch, Keras
- Strong knowledge of OpenCV and computer vision algorithms
- Experience with neural architectures (CNNs, YOLO, ResNet, Transformers)
- Cloud deployment experience (AWS, GCP, Azure)
- MLOps knowledge (Docker, Git, CI/CD pipelines)
- Production model deployment experience
- Bachelor's/Master's degree in Computer Science, Engineering, or a related field

Preferred Qualifications:
- Experience with ROS and robotics applications
- Knowledge of model optimization (quantization, pruning)
- Edge deployment experience (TensorFlow Lite, ONNX)
- Startup environment experience
- Open-source contributions or published research

What We Offer:
- Professional development budget and conference attendance
- Cutting-edge technology access and growth opportunities
- Direct impact on product development and company direction

How to Apply: To apply for the Deep Learning & Computer Vision Engineer position, please fill out the application form using the link below. Make sure to provide accurate details, upload your updated resume, and include links to any past projects or portfolios.

Apply here: https://forms.gle/J5bXUVWLBXKGjty7A

Posted 1 week ago

Apply