Jobs
Interviews

4 ONNX Runtime Jobs

Set up a job alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

4.0 - 8.0 years

0 Lacs

Pune, Maharashtra

On-site

You have a total of 8+ years of experience, with at least 4 years in AI, ML, and Gen AI technologies, and you have successfully led and expanded AI/ML teams and projects. Your expertise includes:
- A deep understanding and practical experience in AI, ML, Deep Learning, and Generative AI concepts.
- Proficiency in ML frameworks like PyTorch and/or TensorFlow, with hands-on work on ONNX Runtime, model optimization, and hyperparameter tuning.
- Solid experience in DevOps, SDLC, CI/CD, and MLOps practices, with a tech stack that includes Docker, Kubernetes, Jenkins, Git, RabbitMQ, Kafka, Spark, Terraform, Ansible, Prometheus, Grafana, and the ELK stack.
- Enterprise-scale deployment of AI models, along with skills in data preprocessing, feature engineering, and handling large-scale data.
- Computer vision: image and video processing, object detection, image segmentation, and related tasks.
- NLP: text analysis, sentiment analysis, language modeling, and various NLP applications.
- Speech and audio: speech recognition, audio classification, and signal processing techniques.
- Knowledge of RAG, VectorDB, GraphDB, and Knowledge Graphs.
- Extensive experience with major cloud platforms such as AWS, Azure, and GCP for AI/ML deployments, and integrating cloud-based AI services and tools like AWS SageMaker, Azure ML, and Google Cloud AI.

As for soft skills, you exhibit strong leadership and team management abilities, excellent verbal and written communication skills, strategic thinking, problem-solving capabilities, adaptability to the evolving AI/ML landscape, collaboration skills, and the capacity to translate market requirements into technological solutions. Moreover, you have a deep understanding of industry dynamics and a demonstrated ability to foster innovation and creative problem-solving within a team.
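The hyperparameter tuning called out above can be illustrated with a plain grid search. This is a minimal sketch in pure Python; `toy_objective` and the grid values are hypothetical stand-ins for a real training/validation loop, not part of any listed framework:

```python
from itertools import product

def grid_search(objective, grid):
    """Evaluate every combination in `grid` and return the
    lowest-loss configuration together with its loss."""
    best_cfg, best_loss = None, float("inf")
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        loss = objective(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Hypothetical objective: pretend validation loss is a simple
# convex function of learning rate and batch size.
def toy_objective(cfg):
    return (cfg["lr"] - 0.01) ** 2 + (cfg["batch"] - 64) ** 2 / 1e4

grid = {"lr": [0.001, 0.01, 0.1], "batch": [32, 64, 128]}
best, best_loss = grid_search(toy_objective, grid)
print(best)  # {'lr': 0.01, 'batch': 64}
```

In practice the objective would train or evaluate a model per configuration, and smarter strategies (random search, Bayesian optimization) replace the exhaustive loop once the grid grows.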

Posted 1 week ago

Apply

10.0 - 14.0 years

0 Lacs

Pune, Maharashtra

On-site

It is exciting to be a part of a company where individuals truly BELIEVE in the mission at hand! The commitment is to infuse passion and customer-centricity into every aspect of the business. Fractal, a key player in the Artificial Intelligence domain, is dedicated to empowering human decisions within enterprises by merging AI, engineering, and design to support Fortune 500 companies globally. With over 3,000 employees spread across 16 international locations, including the United States, UK, Ukraine, India, Singapore, and Australia, Fractal has consistently been recognized for its excellence. Notable achievements include being rated as one of India's best companies to work for by The Great Place to Work Institute, acknowledged as a leader in Customer Analytics Service Providers Wave 2021, Computer Vision Consultancies Wave 2020, and Specialized Insights Service Providers Wave 2020 by Forrester Research, and identified as an "Honorable Vendor" in Gartner's 2021 Magic Quadrant for data & analytics. The team at Fractal is currently in search of an innovative and motivated Solutions Architect to join the expanding Edge & IoT team. In this role, you will collaborate closely with clients and internal engineering teams to conceptualize and implement efficient edge computing and IoT solutions. Leveraging our expertise in AI/ML and strategic partnerships with industry leaders like Microsoft, Intel, and Nvidia, you will showcase a high level of technical proficiency, comprehend client needs effectively, and deliver cutting-edge, top-quality solutions. To excel in this position, having a robust technical foundation in AI/ML and Edge Computing is essential, along with exceptional communication skills, the ability to think critically and creatively, and the capacity to adapt swiftly to new challenges. 
A successful Edge AI Solutions Architect will possess a deep understanding of both technological and business requirements, translating complex concepts into user-friendly solutions that drive client success and satisfaction. If you are prepared to contribute to a dynamic and progressive organization, we invite you to apply for this role.

Key Responsibilities:
- Collaborate with clients to collect requirements, grasp business objectives, and translate them into technical specifications for the engineering team.
- Design innovative Edge & IoT solutions, integrating AI/ML components and utilizing cloud platforms like Azure, AWS, and GCP.
- Contribute to and lead the development of Edge & IoT accelerators and PoVs.
- Design, select, and deploy various IoT hardware components, considering factors such as power consumption, connectivity, interoperability, and environmental constraints.
- Develop and present solution architecture and design documents, including diagrams, data flowcharts, and system requirements.
- Work with engineering teams to ensure proper implementation of designed solutions and offer technical guidance throughout the project lifecycle.
- Act as a mentor for team members, providing guidance, sharing knowledge, and fostering a collaborative and supportive learning environment within the Edge & IoT team.
- Serve as a subject matter expert in Edge & IoT technologies, staying updated on industry trends, emerging technologies, and best practices.
- Support the account team in the pre-sales process by providing technical assistance and insights into solution proposals.
- Contribute to thought leadership initiatives and actively participate in internal trainings and knowledge-sharing sessions.

Technical Skills Required:
- Proficiency in IoT protocols like MQTT, CoAP, and AMQP.
- Knowledge of edge computing technologies and platforms such as Azure IoT Edge, AWS Greengrass, or Google Edge IoT Core.
- Understanding of AI/ML algorithms and frameworks, with experience in implementing them in Edge & IoT scenarios.
- Experience with IoT devices, sensors, and connectivity solutions, along with cloud-based IoT platforms like Azure IoT Hub, AWS IoT Core, and Google Cloud IoT.
- Proficiency in programming languages such as Python, C#, C/C++, or JavaScript.
- Familiarity with containerization technologies (e.g., Docker) and orchestration platforms (e.g., Kubernetes, K3S).
- Strong understanding of networking concepts, protocols, and technologies, as well as experience in designing secure, reliable, and scalable network architectures for IoT and edge computing solutions.
- Knowledge of various hardware platforms such as Raspberry Pi, Arduino, NVIDIA Jetson, and Intel NUC, and the ability to choose the appropriate platform based on specific requirements and constraints.
- Expertise in packaging and deploying AI and ML models on edge devices using common AI frameworks like TensorFlow, PyTorch, or ONNX Runtime, and optimizing these models for efficient execution at the edge.
- Design experience in IoT and edge computing solutions focusing on physical security, device management, thermal management, and enclosure design.
- Ability to assess compatibility, interoperability, and performance of hardware components and integrate them into a structured solution architecture.
- Understanding of industry standards, regulations, and best practices related to IoT hardware selection, deployment, and security.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Minimum 10-12 years of experience in designing and implementing IoT and edge computing solutions, with at least 4-5 years specializing in Edge AI and/or AI/ML integration.
- Strong communication and presentation skills.
- Excellent problem-solving abilities and attention to detail.
- Capacity to work independently or as part of a team.

If you thrive in a dynamic environment and enjoy collaborating with enthusiastic, high-achieving colleagues, you will find fulfillment in your career at Fractal!
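Of the IoT protocols this role lists, MQTT routes messages by matching topic names against subscription filters with `+` (single-level) and `#` (multi-level) wildcards. A simplified pure-Python sketch of those matching rules, for illustration only (real brokers and clients such as Mosquitto or paho-mqtt implement this per the MQTT specification):

```python
def topic_matches(filter_: str, topic: str) -> bool:
    """Match an MQTT topic name against a subscription filter,
    honouring the '+' (single-level) and '#' (multi-level) wildcards."""
    f_parts = filter_.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":                      # '#' matches the remainder of the topic
            return True
        if i >= len(t_parts):             # filter is longer than the topic
            return False
        if f != "+" and f != t_parts[i]:  # literal level must match exactly
            return False
    return len(f_parts) == len(t_parts)   # no unmatched topic levels remain

print(topic_matches("sensors/+/temperature", "sensors/dev42/temperature"))  # True
print(topic_matches("factory/#", "factory/line1/motor/rpm"))                # True
print(topic_matches("sensors/+/temperature", "sensors/dev42/humidity"))     # False
```

Note that per the MQTT spec, a trailing `#` also matches its parent level, so `factory/#` matches the topic `factory` itself; the sketch above preserves that behaviour.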

Posted 2 weeks ago

Apply

12.0 - 20.0 years

12 - 20 Lacs

Bengaluru, Karnataka, India

On-site

We are looking for a Principal AI/ML Engineer with expertise in model inference, optimization, debugging, and hardware acceleration. This role will focus on building efficient AI inference systems, debugging deep learning models, optimizing AI workloads for low latency, and accelerating deployment across diverse hardware platforms. In addition to hands-on engineering, this role involves cutting-edge research in efficient deep learning, model compression, quantization, and AI hardware-aware optimization techniques. You will explore and implement state-of-the-art AI acceleration methods while collaborating with researchers, industry experts, and open-source communities to push the boundaries of AI performance. This is an exciting opportunity for someone passionate about both applied AI development and AI research, with a strong focus on real-world deployment, model interpretability, and high-performance inference.

Education & Experience:
- 20+ years of experience in AI/ML development, with at least 5 years in model inference, optimization, debugging, and Python-based AI deployment.
- Master's or Ph.D. in Computer Science, Machine Learning, or AI.

Leadership & Collaboration:
- Lead a team of AI engineers in Python-based AI inference development.
- Collaborate with ML researchers, software engineers, and DevOps teams to deploy optimized AI solutions.
- Define and enforce best practices for debugging and optimizing AI models.

Key Responsibilities:

Model Optimization & Quantization:
- Optimize deep learning models using quantization (INT8, INT4, mixed precision, etc.), pruning, and knowledge distillation.
- Implement Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) for deployment.
- Familiarity with TensorRT, ONNX Runtime, OpenVINO, and TVM.

AI Hardware Acceleration & Deployment:
- Optimize AI workloads for Qualcomm Hexagon DSP, GPUs (CUDA, Tensor Cores), TPUs, NPUs, FPGAs, Habana Gaudi, and Apple Neural Engine.
- Leverage Python APIs for hardware-specific acceleration, including cuDNN, XLA, and MLIR.
- Benchmark models on AI hardware architectures and debug performance issues.

AI Research & Innovation:
- Conduct state-of-the-art research on AI inference efficiency, model compression, low-bit precision, sparse computing, and algorithmic acceleration.
- Explore new deep learning architectures (Sparse Transformers, Mixture of Experts, Flash Attention) for better inference performance.
- Contribute to open-source AI projects and publish findings in top-tier ML conferences (NeurIPS, ICML, CVPR).
- Collaborate with hardware vendors and AI research teams to optimize deep learning models for next-gen AI accelerators.

Details of Expertise:
- Experience optimizing LLMs, LVMs, and LMMs for inference.
- Experience with deep learning frameworks: TensorFlow, PyTorch, JAX, ONNX.
- Advanced skills in model quantization, pruning, and compression.
- Proficiency in CUDA programming and Python GPU acceleration using cuPy, Numba, and TensorRT.
- Hands-on experience with ML inference runtimes (TensorRT, TVM, ONNX Runtime, OpenVINO).
- Experience working with runtime delegates (TFLite, ONNX, Qualcomm).
- Strong expertise in Python programming, writing optimized and scalable AI code.
- Experience with debugging AI models, including examining computation graphs using Netron Viewer, TensorBoard, and ONNX Runtime Debugger.
- Strong debugging skills using profiling tools (PyTorch Profiler, TensorFlow Profiler, cProfile, Nsight Systems, perf, Py-Spy).
- Expertise in cloud-based AI inference (AWS Inferentia, Azure ML, GCP AI Platform, Habana Gaudi).
- Knowledge of hardware-aware optimizations (oneDNN, XLA, cuDNN, ROCm, MLIR, SparseML).
- Contributions to the open-source community.
- Publications in international forums, conferences, and journals.
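The INT8 quantization this role centers on reduces to mapping float values onto an integer grid via a scale and zero-point. A minimal, framework-free sketch of affine (asymmetric) INT8 quantization in plain Python, for intuition only; production PTQ in TensorRT, ONNX Runtime, or OpenVINO additionally calibrates ranges over real activation data:

```python
def quantize_int8(values):
    """Affine INT8 quantization: map floats in [min, max]
    onto the signed integer range [-128, 127]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # guard against a constant tensor
    zero_point = round(-128 - lo / scale)      # integer that represents 0.0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

vals = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(vals)
recon = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(vals, recon))
```

The reconstruction error is bounded by roughly half the scale step, which is why tight calibration ranges (and per-channel scales) matter so much for accuracy after PTQ.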

Posted 1 month ago

Apply

8.0 - 13.0 years

10 - 14 Lacs

Bengaluru

Work from Office

General Summary: As a leading technology innovator, Qualcomm pushes the boundaries of what's possible to enable next-generation experiences and drives digital transformation to help create a smarter, connected future for all. As a Qualcomm Systems Engineer, you will research, design, develop, simulate, and/or validate systems-level software, hardware, architecture, algorithms, and solutions that enable the development of cutting-edge technology. Qualcomm Systems Engineers collaborate across functional teams to meet and exceed system-level requirements and standards.

Minimum Qualifications:
- Bachelor's degree in Engineering, Information Systems, Computer Science, or a related field and 8+ years of Systems Engineering or related work experience; OR
- Master's degree in Engineering, Information Systems, Computer Science, or a related field and 7+ years of Systems Engineering or related work experience; OR
- PhD in Engineering, Information Systems, Computer Science, or a related field and 6+ years of Systems Engineering or related work experience.

Principal Engineer, Machine Learning

We are looking for a Principal AI/ML Engineer with expertise in model inference, optimization, debugging, and hardware acceleration. This role will focus on building efficient AI inference systems, debugging deep learning models, optimizing AI workloads for low latency, and accelerating deployment across diverse hardware platforms. In addition to hands-on engineering, this role involves cutting-edge research in efficient deep learning, model compression, quantization, and AI hardware-aware optimization techniques. You will explore and implement state-of-the-art AI acceleration methods while collaborating with researchers, industry experts, and open-source communities to push the boundaries of AI performance. This is an exciting opportunity for someone passionate about both applied AI development and AI research, with a strong focus on real-world deployment, model interpretability, and high-performance inference.

Education & Experience:
- 20+ years of experience in AI/ML development, with at least 5 years in model inference, optimization, debugging, and Python-based AI deployment.
- Master's or Ph.D. in Computer Science, Machine Learning, or AI.

Leadership & Collaboration:
- Lead a team of AI engineers in Python-based AI inference development.
- Collaborate with ML researchers, software engineers, and DevOps teams to deploy optimized AI solutions.
- Define and enforce best practices for debugging and optimizing AI models.

Key Responsibilities:

Model Optimization & Quantization:
- Optimize deep learning models using quantization (INT8, INT4, mixed precision, etc.), pruning, and knowledge distillation.
- Implement Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) for deployment.
- Familiarity with TensorRT, ONNX Runtime, OpenVINO, and TVM.

AI Hardware Acceleration & Deployment:
- Optimize AI workloads for Qualcomm Hexagon DSP, GPUs (CUDA, Tensor Cores), TPUs, NPUs, FPGAs, Habana Gaudi, and Apple Neural Engine.
- Leverage Python APIs for hardware-specific acceleration, including cuDNN, XLA, and MLIR.
- Benchmark models on AI hardware architectures and debug performance issues.

AI Research & Innovation:
- Conduct state-of-the-art research on AI inference efficiency, model compression, low-bit precision, sparse computing, and algorithmic acceleration.
- Explore new deep learning architectures (Sparse Transformers, Mixture of Experts, Flash Attention) for better inference performance.
- Contribute to open-source AI projects and publish findings in top-tier ML conferences (NeurIPS, ICML, CVPR).
- Collaborate with hardware vendors and AI research teams to optimize deep learning models for next-gen AI accelerators.

Details of Expertise:
- Experience optimizing LLMs, LVMs, and LMMs for inference.
- Experience with deep learning frameworks: TensorFlow, PyTorch, JAX, ONNX.
- Advanced skills in model quantization, pruning, and compression.
- Proficiency in CUDA programming and Python GPU acceleration using cuPy, Numba, and TensorRT.
- Hands-on experience with ML inference runtimes (TensorRT, TVM, ONNX Runtime, OpenVINO).
- Experience working with runtime delegates (TFLite, ONNX, Qualcomm).
- Strong expertise in Python programming, writing optimized and scalable AI code.
- Experience with debugging AI models, including examining computation graphs using Netron Viewer, TensorBoard, and ONNX Runtime Debugger.
- Strong debugging skills using profiling tools (PyTorch Profiler, TensorFlow Profiler, cProfile, Nsight Systems, perf, Py-Spy).
- Expertise in cloud-based AI inference (AWS Inferentia, Azure ML, GCP AI Platform, Habana Gaudi).
- Knowledge of hardware-aware optimizations (oneDNN, XLA, cuDNN, ROCm, MLIR, SparseML).
- Contributions to the open-source community.
- Publications in international forums, conferences, and journals.
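Of the profiling tools this listing names, cProfile ships with Python's standard library, which makes it the easiest starting point for inference debugging. A minimal sketch; `simulated_inference` is a hypothetical stand-in for a real model forward pass:

```python
import cProfile
import io
import pstats

def simulated_inference(n: int) -> float:
    """Hypothetical stand-in workload for a model forward pass."""
    total = 0.0
    for i in range(n):
        total += (i * 0.5) ** 0.5
    return total

# Profile the workload and capture the stats report as a string.
profiler = cProfile.Profile()
profiler.enable()
result = simulated_inference(100_000)
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)          # show the top 5 entries by cumulative time
report = stream.getvalue()
print(report)
```

For lower-overhead sampling in production, tools like Py-Spy attach to a running process without code changes, while Nsight Systems and the framework profilers add GPU-side timelines that cProfile cannot see.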

Posted 2 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies