
35 GPU Jobs

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

5.0 - 10.0 years

7 - 12 Lacs

Bengaluru

Work from Office

Naukri logo

Embedded / 4+ Years

This job might be for you if:
- You enjoy solving problems. You love taking on difficult challenges and finding creative solutions. You don't know the answer but will dig until you find it.
- You communicate clearly. You write well.
- You are motivated and driven. You volunteer for new challenges without waiting to be asked. You will take ownership of the time you spend with us and make a difference.
- You can impress our customers with your enthusiasm to solve their issues (and solve them!).

Responsibilities:
- Develop and integrate real-time graphics pipelines for Android using OpenGL/Vulkan, targeting efficient rendering on Qualcomm GPUs.
- Design and implement compute shaders for AR tasks like 3D object processing and visual effects, leveraging Qualcomm's graphics processing capabilities.
- Optimize graphics performance for smooth AR experiences on Android devices, collaborating with the team to ensure a responsive user experience.
- Stay current on graphics APIs, hardware acceleration features, and the latest advancements in Qualcomm's Snapdragon chipsets.
- Contribute to Linux device driver development to ensure optimal graphics driver support for Qualcomm SoCs in AR applications.

Qualifications:
- 5 to 10 years of experience in Linux device driver development and Android graphics.
- Proven experience in Linux kernel development, with expertise in writing device drivers for various hardware components.
- Strong understanding of Linux kernel internals, including memory management, I/O subsystems, and device models.
- Experience with Android graphics frameworks (e.g., OpenGL ES, Vulkan) and graphics optimization techniques.
- Familiarity with hardware-accelerated graphics architectures and GPU programming.
- High proficiency in C/C++ and scripting languages such as Python.
- Education: Bachelor's/Master's degree in Computer Science, Electronics and Communications, or related fields.
Note: The selected candidate will work out of Eximietas India center for about 8-12 months but should be willing to relocate to Eximietas US on a long-term work assignment.

Posted 1 week ago

Apply

4.0 - 9.0 years

0 - 2 Lacs

Bengaluru

Work from Office

Naukri logo

Role & responsibilities

Job Overview:
As a Linux System Debug & Validation Engineer, you will be responsible for validating GPU driver software on various platforms, including emulators and hardware boards. You will work closely with cross-functional teams to assess the functionality and performance of GPU drivers, ensuring they meet the highest quality standards. The ideal candidate is passionate about validation, possesses a solid understanding of the GPU software stack, and thrives in a collaborative environment.

Preferred Skills:
- Excellent C coding.
- Experience with system debug or validation in a Linux environment.
- Familiarity with testing frameworks and automation tools for driver validation.

Key Responsibilities:
- Validate Linux GPU device drivers by setting up and executing comprehensive test plans on both emulators and physical hardware platforms.
- Build and install Linux kernels while ensuring compatibility with various GPU drivers and configurations.
- Collaborate with software development teams to understand GPU software stack basics and ensure seamless integration with existing systems.
- Collect and analyze kernel and application logs to identify and troubleshoot driver issues, providing detailed reports and recommendations for improvements.
- Use tools such as Git and Makefiles for version control and for building driver modules efficiently.
- Employ debugging tools to diagnose issues within the GPU stack and provide insights into performance optimizations.
- Collaborate with cross-functional teams to address software validation requirements and improve overall driver quality.
- Maintain up-to-date documentation of test processes, results, and software changes.

Qualifications:
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field, or equivalent work experience.
- Solid understanding of Linux development environments, including kernel building and installation processes.
- Basic knowledge of GPU software stack components and their interactions.
- Experience with log collection and analysis for kernel and application troubleshooting.
- Proficiency in using version control systems like Git and familiarity with Makefiles.
- Strong analytical and problem-solving skills, with a detail-oriented mindset.
- Excellent communication skills, both verbal and written, with a collaborative approach to teamwork.
- A genuine passion for software validation, hardware interactions, and technology innovation.

Posted 1 week ago

Apply

2.0 - 4.0 years

3 - 7 Lacs

Pune

Work from Office

Naukri logo

Essential Skills:
- Machine Learning & Deep Learning: Solid understanding of ML concepts and hands-on experience with deep learning (especially neural networks and Transformers).
- Python Programming: Strong Python coding skills with familiarity in frameworks like TensorFlow, PyTorch, or Keras.
- Natural Language Processing (NLP): Experience with tokenization, embeddings, and language model fine-tuning.
- Data Engineering: Ability to clean, process, and manage large datasets with a focus on data quality.
- Math & Statistics: Good knowledge of linear algebra, calculus, probability, and statistics.

Preferred Additional Skills:
- Model Deployment (MLOps): Exposure to deploying models using Docker, APIs, CI/CD, and tools like MLflow or Hugging Face Hub.
- RAG & LLM Integration: Hands-on with FAISS, LangChain, embedding models, and large language models.
- Prompt Engineering: Skilled in designing prompts and evaluating AI output quality.
- System Scalability: Understanding of GPU optimization and AI system performance.

Posted 1 week ago

Apply

7.0 - 12.0 years

3 - 7 Lacs

Hyderabad, Bengaluru

Work from Office

Naukri logo

Employment Type: Full Time, Permanent
Working mode: Regular
Notice Period: Immediate - 15 Days

Key Responsibilities:
- Linux BSP Development: Develop, port, and maintain Linux BSP for target devices.
- Device Driver Porting: Port device drivers for USB, I2C, and other peripherals.
- GPU Integration: Integrate GPU capabilities like OpenGL, CL, Vulkan, video acceleration, and display.
- Root Cause Analysis: Conduct in-depth root cause analysis for issues related to Linux BSP, device drivers, and GPU.
- Embedded Linux and RTOS: Work with embedded Linux and RTOS environments.
- Performance Optimization: Optimize system performance and resource utilization.
- Collaboration: Collaborate with cross-functional teams to ensure seamless integration.
- Documentation: Create clear and concise technical documentation.

Required Skills and Experience:
- Strong proficiency in Linux kernel development and device driver programming.
- In-depth understanding of Linux kernel architecture and subsystems.
- Experience with device driver development for USB, I2C, and other peripherals.
- Knowledge of GPU architectures and APIs (OpenGL, CL, Vulkan).
- Experience with embedded Linux and RTOS.
- Strong problem-solving and debugging skills.
- Excellent communication and collaboration skills.
- Proficiency in scripting languages (e.g., Python, Bash).

Posted 2 weeks ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

Noida

Work from Office

Naukri logo

About The Role
We're building an agentic AI platform that turns one line of text and a video feed into end-to-end, real-time computer-vision solutions: think semantic video search, object / action recognition, and task-oriented visual agents deployable with a single click. As a Gen AI ML Engineer, you'll architect the core vision & multimodal-reasoning stack and pave the road from prototype to production.

Roles And Responsibilities
- Semantic video search: Ship a pipeline that allows users to type "show every forklift near aisle 5 in the last 30 minutes" and get keyed-off clips in [...]. Wire embeddings to a hybrid FAISS/HNSW index; surface results through a simple REST & React playground.
- Create agentic pipelines: Chain vision language models and zero/few-shot vision models with LLM planners (Gemini, GPT-4o, AutoGen, etc.) so a single prompt becomes a multi-step perception workflow.
- Profile and accelerate inference (TensorRT, ONNX, quantization, batching) to meet latency / throughput targets on GPU and CPU fleets.
- Rapid prototyping loops: Run weekly paper-to-prototype spikes: reproduce a fresh arXiv idea, benchmark, and decide go/no-go in [...]. Hand successful Python scripts & checkpoints to MLOps for productionization, no plumbing marathons.
- Data & Evaluation: Spin up scalable pipelines for video ingestion, labeling (active learning, weak supervision), experiment tracking, and continuous evaluation.
- Collaborate & Lead: Partner with product and MLOps engineers; set research direction, mentor future hires, and establish best practices.

Must-have Skill Set
- 1-3 years deep-learning research experience (internships & grad work count).
- Fluency in Python + PyTorch; comfortable hacking large vision/LLM repos.
- Proof you ship ideas: first-author paper, OSS repo, Kaggle medal, or faithful reproduction of a cutting-edge model.
- Hands-on with LLM prompting/fine-tuning and at least one agent framework.
- Able to turn fuzzy product asks into measurable experiments and explain results clearly.

Bonus Cred
- Large-scale video retrieval or temporal grounding experience.
- Prior work building agentic-AI pipelines that combine perception models with LLM reasoning.
- Open-source contributions to GenAI/vision libs (OpenCLIP, Vid2Seq, ViperGPT, etc.).

What can you expect?
- Ability to shape the future of manufacturing by leveraging best-in-class AI and software; we are a unique organization with a niche skill set that you would also develop while working with us.
- World class work culture, coaching and development.
- Mentoring from highly experienced leadership from world class companies (refer to Ripik.AI website for details).
- International exposure.

Work Location: NOIDA (Work from Office)

Posted 2 weeks ago

Apply

4.0 - 8.0 years

10 - 20 Lacs

Noida

Work from Office

Naukri logo

The candidate will work on overall system-area component development across key domains (Operating System, Kernel, Compiler, Toolchain, Emulator, Memory, open source).

Skills Required:
• Embedded platform development (Linux-based distributions) & maintenance
• Linux build system expertise, packaging (RPM specification), build automation tools
• System Software (File system, Memory Management, Thread programming, Process Management, Platform Driver development/Integration, IPC, logging)
• Understanding of compiler and toolchain development
• Different CPU scheduling concepts (SMP, HMP, FIFO, RT)
• System-level IPC expertise (user space, applications, middleware, kernel), logging framework
• System-level policies (memory, scheduler, CPU, hotplug)
• Platform porting & Reference Platform development on multiple chipsets
• Defect root-cause analysis & debugging utils (gdb, objdump, Backtrace/Coredump analysis etc.)
• Performance/resource benchmarking & optimization (system boot-up/application launching/memory utilization etc.) - System Level Profiling, Tracing
• Full software stack verification & Release management
• Platform Certification & Compliance understanding

Preferred Skills:
• Programming language - C, C++, Scripting (Bash, Python)
• Development environment - Linux
• Development, tracking, code management - JIRA, Git, Swarm, Gerrit, Perforce
• Good understanding of OS, multi-threading, IPC & cross-compilation concepts

Posted 3 weeks ago

Apply

5.0 - 10.0 years

75 - 125 Lacs

Hyderabad

Hybrid

Naukri logo

Staff IP/RTL Design Engineer for TPU
Location: Hyderabad
Founded by highly respected Silicon Valley veterans, with design centers established in Santa Clara, California / Hyderabad / Bangalore.

Position Overview
Seeking an IP/RTL Design Engineer with 5+ years of experience to design IP/RTL for TPUs, focusing on high-performance matrix multiplication, low-latency interconnects, and power-efficient AI acceleration.

Key Responsibilities
- Design IP blocks for TPU cores, including systolic arrays, vector units, and memory subsystems.
- Develop Verilog/SystemVerilog RTL for performance, timing, and area optimization.
- Implement high-speed interconnects (e.g., AXI, NoC) for TPU data pipelines.
- Optimize designs for high throughput, low latency, and power efficiency in AI workloads.
- Integrate LPDDR6, HBM3, DDR5, or chiplet-based memory interfaces.
- Support synthesis, timing closure, FPGA prototyping, and the Design Verification team.
- Document microarchitecture and design specifications.

Required Qualifications
- Education: BS/MS in Electrical/Computer Engineering.
- Experience: 5-10+ years in ASIC/FPGA IP/RTL design, with 3+ years in AI accelerators or TPU-like architectures.
- Skills: Proficient in Verilog/SystemVerilog RTL design.
- Knowledge of TPU architectures, systolic arrays, or matrix multiplication units.
- Experience with AXI, NoC, or similar interconnect protocols.
- Familiarity with LPDDR6, HBM3, DDR5, or high-bandwidth memory interfaces.
- Proficiency with synthesis and timing tools (e.g., Synopsys Design Compiler).
- Strong problem-solving and teamwork skills.

Preferred Qualifications
- Experience with AI/ML workloads or datacenter TPU designs and GPU architectures.
- Knowledge of CXL, PCIe, UALink, or Ultra Ethernet.
- Familiarity with power optimization for high-performance chips.

What is in it for you?
- Pure play product work environment.
- Chance to work with a tightly knit group of exceptional engineers who come from the top companies of the Semiconductor world.
- Our pay comprehensively beats "ALL" Semiconductor product players in the Indian market.
- A meritocracy-first workplace where each peer is a star.
- A chance to be a part of an industry-shaping product in its entirety (not bits and pieces) from the initial stages.
- A chance to work at a startup which already has customers and investors lined up for its product pipeline (We do not have a marketing/sales team, because we do not need them).
- A chance to learn from industry veterans who have already launched multiple Billion Dollar Semiconductor firms over the last 3 decades.

Contact: Uday, Mulya Technologies, muday_bhaskar@yahoo.com
"Mining The Knowledge Community"

Posted 3 weeks ago

Apply

10.0 - 20.0 years

80 - 150 Lacs

Hyderabad

Hybrid

Naukri logo

Principal IP/RTL Design Engineer for TPU
Location: Bangalore / Hyderabad
Founded by highly respected Silicon Valley veterans, with design centers established in Santa Clara, California / Hyderabad / Bangalore.

Position Overview
Seeking an IP/RTL Design Engineer with 5+ years of experience to design IP/RTL for TPUs, focusing on high-performance matrix multiplication, low-latency interconnects, and power-efficient AI acceleration.

Key Responsibilities
- Design IP blocks for TPU cores, including systolic arrays, vector units, and memory subsystems.
- Develop Verilog/SystemVerilog RTL for performance, timing, and area optimization.
- Implement high-speed interconnects (e.g., AXI, NoC) for TPU data pipelines.
- Optimize designs for high throughput, low latency, and power efficiency in AI workloads.
- Integrate LPDDR6, HBM3, DDR5, or chiplet-based memory interfaces.
- Support synthesis, timing closure, FPGA prototyping, and the Design Verification team.
- Document microarchitecture and design specifications.

Required Qualifications
- Education: BS/MS in Electrical/Computer Engineering.
- Experience: 5-10+ years in ASIC/FPGA IP/RTL design, with 3+ years in AI accelerators or TPU-like architectures.
- Skills: Proficient in Verilog/SystemVerilog RTL design.
- Knowledge of TPU architectures, systolic arrays, or matrix multiplication units.
- Experience with AXI, NoC, or similar interconnect protocols.
- Familiarity with LPDDR6, HBM3, DDR5, or high-bandwidth memory interfaces.
- Proficiency with synthesis and timing tools (e.g., Synopsys Design Compiler).
- Strong problem-solving and teamwork skills.

Preferred Qualifications
- Experience with AI/ML workloads or datacenter TPU designs and GPU architectures.
- Knowledge of CXL, PCIe, UALink, or Ultra Ethernet.
- Familiarity with power optimization for high-performance chips.

What is in it for you?
- Pure play product work environment.
- Chance to work with a tightly knit group of exceptional engineers who come from the top companies of the Semiconductor world.
- Our pay comprehensively beats "ALL" Semiconductor product players in the Indian market.
- A meritocracy-first workplace where each peer is a star.
- A chance to be a part of an industry-shaping product in its entirety (not bits and pieces) from the initial stages.
- A chance to work at a startup which already has customers and investors lined up for its product pipeline (We do not have a marketing/sales team, because we do not need them).
- A chance to learn from industry veterans who have already launched multiple Billion Dollar Semiconductor firms over the last 3 decades.

Contact: Uday, Mulya Technologies, muday_bhaskar@yahoo.com
"Mining The Knowledge Community"

Posted 3 weeks ago

Apply

10.0 - 16.0 years

35 - 75 Lacs

Hyderabad

Hybrid

Naukri logo

SpinSci Technologies builds the AI-powered patient-engagement cloud that links contact-center platforms (NICE CXone, Avaya, Amazon Connect) to EHRs like Epic and Oracle Health. Security, uptime, data, and developer velocity are mission-critical, and that's where you come in.

What You’ll Do
Own DevOps & Platform Engineering
• Build CI/CD pipelines, Terraform IaC
• Multi-cloud Kubernetes (OCI & AWS) with blue/green + canary releases
Enable Developer Velocity
• Eliminate SDLC friction with self-service environments, golden paths, and internal tooling
• Drive DORA metrics improvements and publish productivity dashboards
Raise the Reliability Bar
• Set SLOs, champion observability (Datadog, Prometheus), and manage on-call
Lead SecOps
• Automate vuln management & incident response
• Sustain HIPAA, SOC 2 Type II, and PCI DSS compliance under zero-trust design
Run IT & Identity
• Okta-centric access, JAMF/Intune device fleets, SaaS stack governance
Own Data Engineering & Reporting
• Build and operate ELT pipelines and a cloud data warehouse (Snowflake / Redshift / BigQuery)
• Deliver company-wide BI dashboards for product, ops, and finance insights
Set Strategy & Grow Talent
• Publish a 24-month infra + data roadmap (GPU/AI capacity, cost modeling)
• Recruit, mentor, and retain a 10-15-person DevOps/SecOps/Data/IT team

What Sets You Up for Success
• 8+ yrs in DevOps / Platform roles, 3+ yrs leading teams in regulated SaaS or healthcare
• Kubernetes mastery on OCI, AWS, and on-prem vSphere; GitOps + IaC (Terraform)
• Deep knowledge of HIPAA, SOC 2, PCI, and zero-trust architectures
• Proven track record in cloud data engineering and BI visualization tooling
• Hands-on observability (Datadog, Prometheus, OpenTelemetry) for high-volume APIs
• Partner-of-choice for Product & Engineering; you translate roadmap needs into rock-solid infra and data foundations

Bonus Points: GPU/AI workload scaling, CCaaS integrations, Epic/Oracle Health or HL7/FHIR chops, HITRUST exposure.

Why SpinSci?
Impact: Your work powers every patient call, self-schedule, analytics insight, and secure payment.
Growth: Fast-scaling health-tech SaaS migrating from AWS to Oracle Cloud while expanding an AI roadmap.
Culture: High-ownership, healthcare-mission-driven, backed by PE investment.

Posted 3 weeks ago

Apply

2.0 - 5.0 years

8 - 14 Lacs

Hyderabad

Work from Office

Naukri logo

- 2+ years of overall experience, with a large portion of that working on C++ based projects
- Hands-on implementation of algorithms in CUDA and shaders on GPU
- Experience in ARM architecture
- Very good knowledge of Object-Oriented Design & System Integration
- Very good knowledge of Code Optimization; ability to implement & adapt complex algorithms
- As a member of the team, you will play a critical role in all stages of GPU development
- Design and architect features in a compute and graphics stimulus development framework similar to OpenGL and CUDA
- Strong C++ programming capability required
- Graphics or CUDA knowledge a plus
- Experience with OpenGL, Vulkan, Direct3D, CUDA APIs a plus

Skills: Candidates should have a B.E. or B.Tech. degree in Computer Science, Information Technology or related subjects within the past 5 years.

Posted 3 weeks ago

Apply

5.0 - 10.0 years

8 - 14 Lacs

Hyderabad

Work from Office

Naukri logo

What we do:
Our game-changing AI solutions revolutionize what people and businesses can achieve. Ara inference processors combined with our SDK deliver unrivaled deep learning performance at the edge to accelerate and optimize real-time decision making where every millisecond is critical and power efficiency is a must. Our solutions embed high-performance AI into edge devices to create a smarter, safer, and more enjoyable world. Edge AI is on the brink of a boom, and we are looking forward to playing a significant role in it.

This is what you are responsible for:
- You will be part of a highly talented global team developing neural network systems.
- You will have the opportunity to develop the application and system software for cutting-edge AI silicon.
- As a member of a software engineering team, you will design and implement compilers and translators for AI computing systems.
- Work closely with ASIC and hardware teams, propose architectural features, develop software for the system, and build the highest perf/watt system.

Necessary Qualifications:
- Excellent knowledge of Python/C/C++, data structures and algorithms.
- Experience in neural network frameworks like TensorFlow, ONNX, Caffe, PyTorch etc. is a plus.
- BTech/MTech with 4+ years of experience in software development.

Posted 3 weeks ago

Apply

20 - 27 years

90 - 150 Lacs

Hyderabad

Work from Office

Naukri logo

KEY EXPERTISE
- Seasoned ASIC front-end leader with 20 years of cross-domain experience ranging from architecture, uArch, IP/sub-systems/SoC/chiplets design and integration, RTL coding, synthesis, CDC, timing, and power analysis to system/IP verification and silicon bring-up.
- Proven track record of leading the design and development of complex IPs, sub-systems, and chiplets for SoCs in multiple domains like PCIe, USB, UCIe, ARM/x86 CPUs, RISC-V, VPU/NPU, GPU, LSIO, NoC, fabrics, AMBA buses, DRAM, SD/SDIO/eMMC etc.
- Responsible for defining the technical direction of ASIC designs and collaborating with cross-functional teams to ensure successful ASIC implementation.
- Demonstrated strong leadership, project timeline and resource management, and team management skills, and the ability to influence the technical strategy of the organization.
- Familiar with ASIC verification methodologies, DFT, physical design and board design, which help in influencing cross-functional teams to get the desired results.
- Excellent execution capabilities to handle multiple domains in multiple projects simultaneously.
- Delivered superior results through team collaboration and diversity of thought.
- Always open to learning new technologies to grow in technical breadth and depth.
- Managed development of multiple sub-systems and IPs designed from scratch for Intel IoT (Elkhart Lake), Edge (Reefbay), dVPU/NPU (Arrow LakeR), GPU (DMR-D), Media (MTL-D), Smart NIC (Altera NIC), Palm Ridge, and Mount Morgan IPU SoCs, which were executed in advanced technology nodes of both Intel (18A, 3nm, 5nm) and TSMC (N3e, N5, N6).
- Hands-on experience in chiplets, sub-systems and IP development (micro-architecture development, 3rd-party IP integration (Synopsys, VeriSilicon, SiFive RISC-V, ARM cores etc.), RTL implementation, synthesis, static timing analysis, power analysis, system/IP level verification, FPGA emulation, Si bring-up) and SoC integration flows and methodologies.
- Led a design team of 30+ engineers, with good experience working with cross-functional and cross-BU teams across multiple geos, resulting in good collaboration and accelerated time to market.
- Led IP development (RTL design, Lint, CDC, synthesis, timing, unit-level and system-level verification) of various IPs in Nvidia Tegra SoC processors (from first generation [APX] to ninth generation [Xavier]) and Cisco NIC chips.
- Good working experience with low-power design methodologies (clock gating, power gating, multi-Vt and DVFS) used in mobile SoCs.
- Designed a couple of modules in the Tegra SoC, such as the DMA engine, SD/SDIO/eMMC5.2 host controller, and bus bridges for Nvidia proprietary buses. Worked on architecture, micro-architecture, RTL design and timing analysis.
- Familiar with automotive electronics ISO 26262 safety requirements.
- Was an executive member from Nvidia in the SD card organization and the JEDEC (eMMC) forum. Participated in SD/SDIO4.x, SD host4.x and eMMC5.x specification development.
- Working experience with cross-functional teams like back end, analog I/O pad and SW teams to ensure IP requirements are met at each stage.
- Working experience in developing tree build and regression infrastructure.
- Hands-on experience in ASIC verification as well: test planning; developing directed, random and system-level (SoC-level) test cases; designing test benches using SystemVerilog; developing random test environments; executing code coverage and analysing reports; executing gate-level simulations; executing functional and regression tests.
- Good team player: participated in and led the effort of SD4.x/eMMC5.x host controller design and verification.
- Detail-oriented go-getter with a fast learning curve and strong analytical, decision-making, problem-solving, visualizing, negotiating, communication & interpersonal skills.
- Mentored engineers, designed IP/SS schedules with a proper staging plan covering cross-team dependencies, identified and solved technical issues, and ensured development of high-quality products.

Posted 4 weeks ago

Apply

6 - 8 years

6 - 16 Lacs

Pune

Work from Office

Naukri logo

ML Engineer
Position: ML Engineer
Experience: 7+ yrs
Client Name: Honeywell International Ltd
Payroll Company: Bramha Tech
CTC: As per industry norms
Job Location: Pune
Notice Period: Immediate / serving notice

Objectives of this role
• Design and develop efficient computer vision applications for the security and surveillance domain.
• Develop computer vision applications and algorithms for deployment on low-power embedded devices.
• Collaborate with firmware engineers, front-end engineers, QA engineers and architects on production systems and applications.
• Identify differences in data distribution that could potentially affect model performance in real-world applications.
• Ensure algorithms generate accurate predictions.
• Stay up to date with developments in the machine learning industry.
• Version both the collected data and the developed models.

Skills and qualifications
• Extensive math and computer skills, with a deep understanding of probability, statistics, and algorithms.
• Familiarity with deploying deep learning models on low-power embedded devices.
• Good knowledge of programming with C and C++ is a must.
• Proven record of working with AI accelerators, NPUs and quantization frameworks like OpenVINO or Neural Magic.
• In-depth knowledge of TF or PyTorch.
• Familiarity with ArmNN, Kendryte NNcase, Maix Sipeed or RKNN toolkits.
• Good knowledge of version control systems like Git, Azure Repos.
• Familiarity with data structures, data modeling, and software architecture.
• Impeccable analytical and problem-solving skills.

Preferred qualifications
• Proven experience as a machine learning engineer or similar role
• Bachelor's degree (or equivalent) in computer science, mathematics, or related field

Posted 1 month ago

Apply

7 - 12 years

20 - 35 Lacs

Coimbatore

Remote

Naukri logo

Job Title: Windows Display Driver (Freelance)
Job Type: Freelance

About the Role:
We are seeking a highly experienced Senior Windows Video Driver Consultant to support our development and debugging of custom video drivers for Windows-based platforms. You will work closely with internal engineering teams to ensure optimal performance, compatibility, and stability of video drivers across various hardware configurations.

Key Responsibilities:
- Develop, debug, and optimize Windows video drivers (WDDM).
- Provide expert-level consultation on driver architecture and design improvements.
- Collaborate with hardware teams to ensure proper integration with video subsystems.
- Analyze logs, dumps, and performance metrics to troubleshoot complex issues.
- Assist in the certification process (WHQL) and ensure compliance with Microsoft driver standards.
- Support bring-up of new platforms or features (DirectX, HDR, multi-display, etc.).
- Create technical documentation and handover materials for internal teams.

Required Skills & Qualifications:
- 4+ years of hands-on experience with Windows driver development.
- Deep knowledge of WDDM, DirectX, and Windows kernel-mode development.
- Strong experience with KMDF/UMDF, DXGI, and graphics stack debugging tools.
- Solid C/C++ programming and debugging skills in a Windows environment.
- Experience with tools such as WinDbg, GPUView, and ETW.
- Understanding of video pipeline, display protocols (HDMI, DisplayPort), and hardware interfaces.
- Experience working with OEMs or IHVs is a strong plus.
- Ability to work independently and communicate technical ideas effectively.

Posted 1 month ago

Apply

7 - 12 years

20 - 35 Lacs

Noida, Hyderabad, Gurugram

Hybrid

Naukri logo

Software Architect: Generative AI & LLM Systems
Job location: Hyderabad, Noida or Gurgaon

Job Overview
We are seeking a highly experienced and hands-on Software Architect to lead the design and deployment of Large Language Model (LLM)-powered applications across cloud and on-prem environments. This role demands deep expertise in full-stack software development, high-performance inference systems, and cutting-edge generative AI workflows. You will play a key role in scaling AI infrastructure, maximizing throughput, and educating cross-functional teams on best practices for building LLM-driven solutions.

Key Responsibilities
- LLM Deployment & Infrastructure Design: Architect, deploy, and maintain LLMs on cloud-based GPU clusters (e.g., AWS, GCP, Azure) or on-premise hardware including NVIDIA HGX and smaller GPU-accelerated instances. Bonus points for experience deploying containerized LLM applications in GPU clusters.
- Performance Optimization on Software Layer: Optimize LLM serving stacks using frameworks such as vLLM, TensorRT-LLM, or DeepSpeed to improve inference throughput and reduce time-to-first-token latency.
- Prompt Engineering & Optimization: Design, test, and refine prompts for LLMs to extract the highest quality output. Mentor team members on prompt engineering strategies and few-shot examples.
- Inference Efficiency & Scalability: Architect systems to maintain low latency and fast time-to-first-token even under high demand.
- GenAI Application Architecture: Build and lead GenAI application development using LangChain, designing modular pipelines for agents, tools, and memory systems. Define architectural patterns and reusable workflows.
- Team Enablement & Education: Educate and upskill engineering teams on best practices in GenAI development, inference performance, and prompt design through documentation, workshops, and code reviews.
- RAG with SQL-based Systems: Design and implement retrieval-augmented generation (RAG) pipelines that leverage SQL-like structured databases for high-relevance grounding.
- Vector Database Integration (Nice-to-Have): Architect and optimize RAG systems using vector embeddings and specialized vector databases such as FAISS, Weaviate, or Pinecone.

Requirements
Must-Have Skills:
- 7+ years of full-stack development and software architecture experience
- Proven track record deploying LLMs in production, in both on-premise and cloud GPU environments
- Strong hands-on experience with vLLM, LangChain, and model serving performance tuning
- Deep knowledge of prompt engineering, token economy, and optimizing LLM behavior
- Experience designing and scaling inference pipelines for latency and throughput
- Strong experience with Python and either TypeScript or Golang
- Familiarity with deploying applications to hyperscalers (AWS, GCP, Azure)
- Strong knowledge of SQL databases and data retrieval strategies for grounding LLM responses

Nice-to-Have Skills:
- Experience with vector databases and embedding-based retrieval in RAG pipelines
- Experience with orchestrating containerized LLM deployments using Kubernetes or Ray
- Familiarity with streaming inference systems and token-by-token UX optimizations
- Background in AI/ML systems, MLOps, or research-to-prod workflows

Contact: 95134 87487

Posted 1 month ago

Apply

5 - 10 years

20 - 35 Lacs

Bengaluru

Work from Office

Develop and optimize HPC applications and algorithms using CUDA, MPI, and OpenMP on Azure and cluster systems. Support scientific teams by modernizing codebases and enabling GPU acceleration.

Required candidate profile: Software engineer with 5+ years of experience in HPC programming, scientific code optimization, GPU computing, and collaboration with research teams.

Posted 1 month ago

Apply

2 - 5 years

8 - 14 Lacs

Hyderabad

Work from Office

- 2+ years of overall experience, with a large portion of that on C++-based projects
- Hands-on implementation of algorithms in CUDA and shaders on GPUs
- Experience with the ARM architecture
- Very good knowledge of object-oriented design and system integration
- Very good knowledge of code optimization, and of implementing and adapting complex algorithms
- As a member of the team, you will play a critical role in all stages of GPU development
- Design and architect features in a compute and graphics stimulus development framework similar to OpenGL and CUDA
- Strong C++ programming capability required
- Graphics or CUDA knowledge a plus
- Experience with OpenGL, Vulkan, Direct3D, and CUDA APIs a plus

Skills: Candidates should have a B.E. or B.Tech. degree in Computer Science, Information Technology or related subjects within the past 5 years.

Posted 2 months ago

Apply

10 - 17 years

18 - 25 Lacs

Hyderabad

Work from Office

About the Role: We are seeking an experienced Software Architect with a strong background in designing and implementing scalable, high-performance systems. As a Software Architect, you will play a key role in shaping the technical direction of our products, defining architectural best practices, and collaborating closely with cross-functional teams to deliver state-of-the-art AI-driven solutions.

This is what you are responsible for:
- Lead the architectural design and implementation of scalable, reliable, and high-performance software systems for AI, AI compiler, and edge computing applications.
- Collaborate with product managers, software engineers, and hardware engineers to ensure alignment of technical decisions with business objectives.
- Define and maintain architectural best practices, guidelines, and documentation for the software engineering team.
- Evaluate and recommend technologies, frameworks, and tools to optimize the performance and scalability of our solutions.
- Ensure that all software architecture aligns with security, performance, and reliability standards.
- Mentor and provide technical leadership to the engineering team, fostering a culture of collaboration and innovation.
- Participate in code reviews, design discussions, and technical roadmap planning to ensure high-quality delivery.
- Drive continuous improvement in system architecture and development processes to support the company's growth and evolving requirements.

Necessary qualifications:
- 8+ years of experience in software architecture, system design, and development of scalable, distributed systems.
- Proven experience in designing edge-based solutions, compilers, runtimes, and firmware.
- Strong programming skills in modern languages such as Python, C++, or a similar language.
- Expertise in designing high-performance, low-latency systems for AI/ML workloads.
- Strong understanding of software development methodologies, DevOps practices, and CI/CD pipelines.
- Familiarity with hardware-software co-design, embedded systems, and edge computing solutions is a plus.
- Excellent problem-solving and communication skills, with the ability to explain complex technical concepts to both technical and non-technical stakeholders.

Preferred qualifications:
- Experience with AI frameworks (e.g., TensorFlow, PyTorch) and understanding of AI/ML pipelines.
- Knowledge of hardware accelerators (e.g., GPUs, NPUs) and optimization for low-power AI inferencing.
- Experience working in a fast-paced startup environment is a plus.

Posted 2 months ago

Apply

5 - 10 years

30 - 45 Lacs

Hyderabad

Work from Office

About the Role: We're seeking an experienced Runtime Engineer to develop and optimize software systems for our silicon platform. This role focuses on building efficient runtime systems that maximize chip performance while ensuring reliability and ease of use.

Key Responsibilities:
* Design and implement runtime systems for AI accelerator execution and memory management
* Develop and optimize runtime libraries for high-performance tensor operations
* Create efficient memory allocation and scheduling algorithms for ML workloads
* Interface with hardware subsystems through the PCIe interface for optimal data transfer
* Build and maintain runtime profiling and debugging tools
* Work closely with the hardware team to optimize end-to-end performance
* Document runtime architecture and implementation strategies
* Perform thorough testing and performance analysis of runtime components

Required Qualifications:
* BTech/MTech in Computer Science or Electronics & Communication
* 4+ years of experience in systems programming with C/C++
* Strong understanding of concurrent programming and multithreading
* Proficiency with debugging and profiling tools (gdb, valgrind, WinDbg, address sanitizer)
* Experience with performance optimization and low-level system interfaces
* Knowledge of memory management and scheduling algorithms

Nice To Have:
* Experience with ML frameworks (TensorFlow, PyTorch) and their runtime systems
* Understanding of AI/ML workload characteristics
* Background in driver development or hardware interfaces

What We Offer:
* Opportunity to work on cutting-edge high-performance compute hardware
* Collaborative environment with global teams
* Fast-paced and innovation-driven culture
* Chance to shape the future of AI acceleration

Posted 2 months ago

Apply

1 - 3 years

2 - 4 Lacs

Bengaluru

Work from Office

Job Description
SW: Python, C++/C#, UI Automation, Windows, PowerShell/Batch programming
Degree: MTech
College: VIT Vellore, NITs

Qualifications: You must be pursuing a Master's degree in Computer Science, Computer Engineering, Software Engineering, Embedded Systems or a related field. Preferred qualifications include:
- Knowledge of one or more of the C/C++/Python programming languages
- Knowledge of software design, development, and validation processes
- Knowledge of operating systems and computer architecture
- Familiarity with Intel's client platforms and related architecture
- Familiarity with the Windows device driver model and architecture is a plus
- Excellent communication and interpersonal skills
- Curiosity and eagerness to learn, and the ability to work in a team

Inside this Business Group: The Client Computing Group (CCG) is responsible for driving business strategy and product development for Intel's PC products and platforms, spanning form factors such as notebooks, desktops, 2 in 1s, and all in ones. Working with our partners across the industry, we intend to deliver purposeful computing experiences that unlock people's potential, allowing each person to use our products to focus, create and connect in ways that matter most to them. As the largest business unit at Intel, CCG is investing more heavily in the PC, ramping its capabilities even more aggressively, and designing the PC experience even more deliberately, including delivering a predictable cadence of leadership products. As a result, we are able to fuel innovation across Intel, providing an important source of IP and scale, as well as help the company deliver on its purpose of enriching the lives of every person on earth.
Posting Statement All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Posted 2 months ago

Apply

5 - 10 years

9 - 15 Lacs

Pune, Navi Mumbai, Hyderabad

Hybrid

Job Summary: RightSkale is seeking an image processing engineer to develop and optimize algorithms for real-time medical ultrasound imaging. This position is with an ultrasound company located in the heart of Silicon Valley, USA. You will partner with highly accomplished and deeply technical researchers and engineers focused on delivering innovative ultrasound imaging technologies. The position will also have an opportunity to work directly with the client at their US location.

Job Responsibilities (but not limited to):
- Develop and optimize beamforming, image processing, and AI-based signal and image processing algorithms. This will involve the development of Matlab/Python simulation tools and FPGA and/or GPU software routines.
- Drive the optimization of imaging parameters through numerical Matlab/Python simulations and/or experimental measurements.
- Proactively interface with the software team to ensure all technical aspects of algorithm development are properly implemented.

Minimum Education/Experience Requirements:
- M.S./M.Tech in biomedical/electrical engineering in a field related to signal/image processing or analysis.
- 4-5 years of industrial experience in development, integration, verification, and optimization of medical imaging systems.
- Experienced with Matlab and Python or an equivalent signal/image processing toolbox.
- Experience with C/C++ is highly desirable.
- Experience in ultrasound is highly desirable.
- Experience with algorithm development on GPUs is a plus.
- Experience with FPGA programming is a plus.

Posted 2 months ago

Apply

6 - 11 years

8 - 14 Lacs

Chennai, Coimbatore

Work from Office

Mandatory skills: C/C++, CUDA, OpenCL, or other relevant languages for hardware optimization

We are seeking a talented engineer to implement and optimize machine learning, computer vision, and numeric libraries for target hardware architectures, including CPUs, GPUs, DSPs, and other accelerators. Your expertise will be instrumental in enabling efficient and high-performance execution of algorithms on these hardware platforms.

Key Responsibilities:
- Implement and optimize machine learning, computer vision, and numeric libraries for target hardware architectures, including CPUs, GPUs, DSPs, and other accelerators.
- Work closely with software and hardware engineers to ensure optimal performance on target platforms.
- Implement low-level optimizations, including algorithmic modifications, parallelization, vectorization, and memory access optimizations, to fully leverage the capabilities of the target hardware architectures.
- Work with customers to understand their requirements and implement libraries to meet their needs.
- Develop performance benchmarks and conduct performance analysis to ensure the optimized libraries meet the required performance targets.
- Stay current with the latest advancements in machine learning, computer vision, and high-performance computing.

Qualifications:
- BTech/BE/MTech/ME/MS/PhD degree in CSE/IT/ECE
- 5+ years of experience working in algorithm development, porting, optimization, and testing
- Proficient in programming languages such as C/C++, CUDA, OpenCL, or other relevant languages for hardware optimization.
- Hands-on experience with hardware architectures, including CPUs, GPUs, DSPs, and accelerators, and familiarity with their programming models and optimization techniques.
- Knowledge of parallel computing, SIMD instructions, memory hierarchies, and cache optimization techniques.
- Experience with performance analysis tools and methodologies for profiling and optimization.
- Knowledge of deep learning frameworks and techniques is good to have
- Strong problem-solving skills and ability to work independently or within a team.

Posted 2 months ago

Apply

2 - 6 years

5 - 8 Lacs

Bengaluru

Work from Office

Graphics ASIC RTL Design Engineer - Sr Engineer / Sr Lead / Staff / Sr Staff

General Summary: You will be implementing the industry's leading-edge graphics processor; specific areas include 2D and 3D graphics, streaming processor, high-speed IO interface and bus protocols. In this position, the designer will be responsible for architecture and micro-architecture design of the ASIC, RTL design and synthesis, and logic and timing verification. The successful candidate for this position will specify and design digital blocks in our Multimedia Graphics team that will be integrated into a broad range of devices. All Qualcomm employees are expected to actively support diversity on their teams, and in the Company.

Minimum Qualifications:
- Bachelor's degree in Science, Engineering, or related field
- Previous experience in designing GPU or CPU cores and ASICs for Multimedia and Graphics applications in deep sub-micron CMOS processes for volume production
- Experience with Verilog/VHDL design, Synopsys synthesis, static timing analysis, formal verification, low power design, test plan development, coverage-based design verification, and/or design-for-test (DFT)
- Experience with Computer Architecture, Computer Arithmetic, and C/C++ programming languages is desired
- Exposure to DX9~12 level graphics HW development is a big plus
- Good communication skills and desire to work as a team player

Required: Bachelor's degree in Computer Science, Electrical Engineering, Information Systems, or related field.
Preferred: Master's degree in Computer Science, Electrical Engineering, Information Systems, or related field.

ASIC, hardware, design, GPU, OpenGL, DirectX, RTL, Verilog, System Verilog

Minimum Qualifications: Bachelor's degree in Computer Science, Electrical/Electronics Engineering, Engineering, or related field and 4+ years of Hardware Engineering or related work experience.
OR Master's degree in Computer Science, Electrical/Electronics Engineering, Engineering, or related field and 3+ years of Hardware Engineering or related work experience.
OR PhD in Computer Science, Electrical/Electronics Engineering, Engineering, or related field and 2+ years of Hardware Engineering or related work experience.

Posted 2 months ago

Apply

5 - 8 years

7 - 11 Lacs

Bengaluru

Work from Office

The candidate must have experience fine-tuning and optimizing open-source models such as Whisper to run on mobile platforms (which have limited memory and CPU/GPU resources) without much guidance or supervision. We are looking for people who have already tuned generative AI models for a specific use case.

Posted 3 months ago

Apply

6 - 10 years

8 - 12 Lacs

Bengaluru

Work from Office

Develops the logic design, register transfer level (RTL) coding, and simulation for graphics IPs (including graphics, compute, display, and media) required to generate cell libraries, functional units, and the GPU IP block for integration in full chip designs. Participates in the definition of architecture and microarchitecture features of the block being designed. Applies various strategies, tools, and methods to write RTL and optimize logic to qualify the design to meet power, performance, area, and timing goals as well as design integrity for physical implementation. Reviews the verification plan and implementation to ensure design features are verified correctly across verification hierarchies, drives unit-level verification, and resolves and implements corrective measures for failing RTL tests to ensure correctness of features. Supports SoC customers to ensure high-quality integration of the GPU block.

Qualifications: Minimum qualifications are required to be initially considered for this position. Preferred qualifications are in addition to the minimum requirements and are considered a plus factor in identifying top candidates.

Minimum Qualifications:
- B.Tech/M.Tech with 6+ years of relevant industry experience.
- Multiple tape-outs reaching production with first-pass silicon.
- Ability to drive and improve digital design methodology to achieve high-quality first silicon.
- Hands-on experience with FPGA emulation, silicon bring-up, characterization and debug.
- Experience working in the GPU/CPU domain.
- Able to work with multi-functional teams within Intel and external vendors across geographical boundaries to resolve architectural and implementation challenges with a focus on schedule.
- Strong verbal and written communication skills.
- Good understanding of Verilog and SystemVerilog, and of synthesizable RTL.
- Knowledgeable in modern design techniques, energy-efficient/low-power logic design, and power analysis.

Requirements listed would be obtained through a combination of industry-relevant job experience, internship experiences and/or schoolwork/classes/research.

Inside this Business Group: The Client Computing Group (CCG) is responsible for driving business strategy and product development for Intel's PC products and platforms, spanning form factors such as notebooks, desktops, 2 in 1s, and all in ones. Working with our partners across the industry, we intend to deliver purposeful computing experiences that unlock people's potential, allowing each person to use our products to focus, create and connect in ways that matter most to them. As the largest business unit at Intel, CCG is investing more heavily in the PC, ramping its capabilities even more aggressively, and designing the PC experience even more deliberately, including delivering a predictable cadence of leadership products. As a result, we are able to fuel innovation across Intel, providing an important source of IP and scale, as well as help the company deliver on its purpose of enriching the lives of every person on earth.

Posting Statement: All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Posted 3 months ago

Apply
Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies