4.0 - 9.0 years
30 - 45 Lacs
hyderabad
Work from Office
Role Overview: The Staff Engineer will be responsible for architecting and implementing advanced quantization algorithms for edge AI applications. You will lead technical initiatives, mentor junior team members, and drive continuous improvement in model compression and optimization techniques for LLMs and other deep learning models.
Key Responsibilities:
Architectural Leadership:
o Design and develop robust quantization strategies and algorithms for AI inference on edge devices.
o Lead system-level design discussions and collaborate closely with hardware and research teams.
Mentorship & Code Review:
o Mentor mid-level and junior engineers, providing technical guidance and best practices.
o Conduct thorough code reviews and ensure high standards of quality and performance.
Innovation & Optimization:
o Stay abreast of the latest research in model quantization and compression, and drive the adoption of innovative techniques.
o Develop and maintain performance benchmarks, and continuously optimize algorithms for low latency and high energy efficiency.
Cross-Functional Collaboration:
o Work with the Quantizer Group Manager and Tech Lead to align technical roadmaps with product objectives.
o Participate in regular strategy sessions to set technical direction and priorities.
Qualifications: Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field (Ph.D. is a plus). 5-8+ years of industry experience in deep learning, model optimization, or related areas. Demonstrated experience with quantization techniques, LLM optimization, and software development using Python/C++. Strong problem-solving skills and a passion for innovation in edge AI technologies.
What We Offer: An opportunity to work on pioneering edge AI technologies that redefine the future of real-time inference. A collaborative environment where innovation is at the core of our culture. Competitive compensation, comprehensive benefits, and significant opportunities for professional growth.
Posted 6 days ago
3.0 - 8.0 years
3 - 8 Lacs
hyderabad, telangana, india
On-site
We are looking for an AI Compiler Engineer to join this high-impact team working in the growing field of on-device AI inference acceleration, as an individual contributor or as a technical lead in the AI group (AIG). As an AI Compiler Engineer, you will design and optimize the AI compiler stack and tools that enable efficient execution of state-of-the-art open-source and proprietary AI models, such as LLMs and transformer models, on AMD NPUs for on-device AI inference use cases. You will work on transforming high-level AI models into efficient, low-level code that can run on the NPU. Your work will directly impact the performance, efficiency, and scalability of our AI solutions.
SCOPE OF WORK:
Operator Fusion: Identify and implement performance optimization opportunities by reducing memory traffic through operator fusion at different memory hierarchy levels, e.g., the attention block.
Problem Partitioning and Dataflow Orchestration: Design algorithms to optimally map a given AI operation to the NPU, which comprises an interconnected array of AI engines. Design and implement algorithms to orchestrate dataflow through the multi-level memory hierarchy.
Kernel Design and Development: Design and implement highly optimized C++/intrinsic-based kernels for AI-related operators. Develop vectorized code that leverages SIMD (Single Instruction, Multiple Data) and VLIW (Very Long Instruction Word) for optimal performance. Evaluate tradeoffs among performance, program memory, and accuracy.
Testing and Validation: Develop CPU models for the ML operators in C++/Python to validate accuracy. Write unit tests and integration tests to ensure correctness and reliability.
Performance Profiling and Tuning: Profile and analyze the performance of model layers. Identify performance and accuracy bottlenecks and alleviate them.
Documentation and Collaboration: Communicate day-to-day technical work effectively and document design specs. Follow good coding practices, using a version control system. Collaborate with cross-functional teams spanning AI research, core architecture, and software engineering.
REQUIRED SKILLS: Excellent C/C++ and Python coding skills. Good understanding of SIMD and VLIW processor architecture. Experience with vectorized programming (SIMD). Thorough understanding of fixed-point and floating-point arithmetic. Good understanding of various operators in state-of-the-art AI models. Knowledge of low-level hardware details (cache hierarchy, DMA programming). Excellent problem-solving skills, especially in debugging, and a passion for on-device AI. Candidates with prior experience in AI compiler design are preferred.
Posted 1 week ago
4.0 - 8.0 years
0 Lacs
karnataka
On-site
You are a skilled C++ developer with over 4 years of experience, specializing in high-performance, low-latency systems. Your expertise lies in modern C++ (C++14/17/20), multithreading, and concurrency. You have a strong background in Qt development, particularly in building real-time, high-performance trading user interfaces. Your experience includes creating ultra-fast order execution engines, market data feeds, and real-time risk management tools. In addition to your technical skills, you possess a deep understanding of networking protocols such as TCP/IP, UDP, and FIX, as well as interprocess communication methods like IPC, shared memory, and message queues. You have hands-on experience in latency optimization, performance tuning, and utilizing tools like perf, Valgrind, and gprof. Proficiency in memory management, lock-free programming, and CPU cache optimization is also part of your skill set. You have a hacker mentality and enjoy tackling challenging problems. Your responsibilities will include architecting, developing, and optimizing ultra-low-latency C++ trading applications capable of handling millions of transactions per second. You will build high-performance market data processing solutions with microsecond-level latencies and develop real-time, intuitive, and high-speed trading interfaces using Qt. Your work will involve exchange connectivity, FIX protocol integrations, and risk management systems. You will be expected to profile and optimize code to achieve maximum throughput and minimal latency, working alongside an elite team to solve some of the hardest engineering problems in the fintech industry. Experimenting with new technologies to stay ahead of the competition and owning your work end-to-end, from concept to deployment, are also key aspects of this role. Ideally, you have experience in high-frequency trading (HFT), market-making, or ultra-low-latency environments. 
Knowledge of exchange matching algorithms, order routing strategies, and market microstructure will be beneficial. Contributions to open-source C++ and Qt projects or performance-critical software, as well as expertise in hardware acceleration (FPGA, SIMD, AVX, GPU computing), are highly valued. Familiarity with cloud-based trading infrastructure and hybrid on-prem/cloud systems is a plus. As part of a high-energy startup with significant growth potential, you will work with visionary fintech leaders and top-tier engineers to build industry-defining products that will shape the future of trading. The culture values bold ideas, rapid execution, and relentless optimization. If you are passionate about performance, enjoy pushing speed barriers, and aspire to contribute to something significant, this is an opportunity to be part of a team that is reshaping the future of trading. Join us in disrupting the industry together. Apply now for this full-time position with a day shift schedule and an in-person work location.
Posted 2 weeks ago
3.0 - 7.0 years
0 Lacs
maharashtra
On-site
You will be working within a team to develop high-quality video game software at Ubisoft. Your responsibilities will include developing and implementing independent modules, implementing audio systems, ensuring proper integration of all sound effects and music, and troubleshooting audio-related issues. You will collaborate effectively with diverse teams, including audio designers, gameplay programmers, and other developers. Analyzing audio-related issues, implementing effective solutions, and having a strong understanding of audio flow, programming, and game engine technologies are key aspects of this role. You will also be responsible for dealing with audio performance issues and optimization. To excel in this role, you must have strong C/C++ skills with object-oriented/data-oriented programming skills, experience working with large-scale game engines such as Unreal, Anvil, and Snowdrop, and knowledge of low-level audio programming. Having shipped multiple games on consoles or mobile platforms, high aptitude, strong analytical skills, and familiarity with performance bottlenecks, multi-threading, OS concepts, and system programming are essential. Strong debugging and troubleshooting abilities, self-motivation, curiosity, and adaptability to new technologies are also required. Ideally, you should possess a master's or bachelor's degree in computer science from a reputable institute or relevant work experience. Candidates will have an additional advantage if they have gameplay programming and game development experience, knowledge of video game development and engines, an understanding of low-level audio technologies like DSP and SIMD, familiarity with audio plugins and their implementation, and previous work on game consoles. Your application and all information provided will be kept confidential in accordance with EEO guidelines.
Posted 2 weeks ago
15.0 - 19.0 years
0 Lacs
karnataka
On-site
MIPS is looking for a highly experienced and motivated Engineering Manager to lead the CPU team. In this critical role, you will manage a team of skilled engineers dedicated to developing high-performance CPU cores based on the RISC-V architecture. Your responsibilities will include overseeing the entire RTL design lifecycle, ensuring technical excellence, innovation, and timely delivery. You will lead and mentor a team of microarchitecture and RTL design engineers focused on high-performance CPU core development. Your role will involve driving the design execution of CPU subsystems, meeting performance, power, area, and quality goals. You will be responsible for creating and refining microarchitecture specifications and RTL designs for CPU core components, collaborating with various teams to ensure efficient implementation, optimizing design methodologies for scalability and quality, and fostering a culture of innovation and technical excellence. The ideal candidate should have a Master's or PhD in Electrical Engineering, Computer Engineering, or a related field, with at least 15 years of experience in silicon or CPU design, including 5+ years in a management role. A proven track record of successfully delivering complex CPU blocks or subsystems, deep understanding of microprocessor design principles, strong technical knowledge in RTL design, and experience with project management and organizational skills are required. Preferred qualifications include experience with RISC-V, MIPS, or ARM CPU cores, familiarity with advanced CPU design concepts, working in a globally distributed environment, proficiency with EDA tools and scripting languages, and modern design methodologies. Joining MIPS offers you the opportunity to work with a dynamic and fast-growing team that is shaping the future of high-performance RISC-V processors. With small, agile teams and a flat organizational structure, you will have a direct impact on cutting-edge technology and product direction. 
MIPS provides an autonomous and empowering work culture, the chance to collaborate with industry-leading engineers, competitive compensation and benefits, and a platform for career growth in the exciting RISC-V ecosystem.
Posted 1 month ago
3.0 - 8.0 years
0 Lacs
hyderabad, telangana
On-site
The responsibilities of this position may include: Designing computer vision/image processing solutions for embedded devices. Developing and evaluating algorithms for implementation in hardware or software prototypes. Optimizing image processing and computer vision algorithms for hardware acceleration. Supporting product teams in the commercialization process, including solution optimization, performance profiling, and benchmarking. Providing test regression and release support.
Join Qualcomm India and be a part of the growing multimedia systems team, working on innovative solutions to enhance mobile multimedia capabilities with high performance, low power consumption, and cost efficiency, while maintaining strong feature differentiation. We are looking for system engineers to contribute to our cutting-edge projects in the field of vision/image processing.
The ideal candidate should have: 3-8 years of experience in multimedia and embedded technologies. Proficiency in programming languages such as C/C++ and Python. Experience working with real-time and embedded systems is a plus. Familiarity with Jenkins and CI/CD frameworks.
Preferred qualifications include: Exposure or working experience with Vision or Multimedia accelerators. Hands-on experience with image processing algorithms. Knowledge or experience in computer vision algorithms. Strong understanding of data structures and proficiency in C/C++ programming. Experience in software optimization using various SIMD and multi-threading techniques.
Posted 1 month ago
1.0 - 8.0 years
0 Lacs
hyderabad, telangana
On-site
Qualcomm India Private Limited is seeking system engineers to join their growing multimedia systems team. As a part of this team, you will be involved in innovating to enhance mobile multimedia capabilities with a focus on higher performance, lower power consumption, and device cost optimization while providing strong feature differentiation. The company is looking for individuals with 3-8 years of experience in multimedia/embedded technologies, proficient in C/C++ and Python programming, and preferably with experience in real-time/embedded systems. Knowledge of Jenkins and CI/CD frameworks is also desirable. Key responsibilities of this role include designing computer vision/image processing for embedded devices, evaluating algorithms for implementation in hardware or software prototypes, developing or optimizing image processing and computer vision algorithms for hardware acceleration, supporting product teams for commercialization through solution optimization and benchmarking, as well as test regression and release support. Preferred qualifications for this position include exposure or working experience in vision or multimedia accelerators, working experience with image processing algorithms, knowledge or working experience in computer vision algorithms, strong knowledge in data structures, experience with C/C++ programming, and software optimizations in various SIMD and multi-threading technologies. Minimum qualifications for the role include a Bachelor's degree in Engineering, Information Systems, Computer Science, or related field with 3+ years of Systems Engineering or related work experience, or a Master's degree with 2+ years of experience, or a PhD with 1+ year of experience in Systems Engineering or related field. Qualcomm is an equal opportunity employer and is committed to providing accessible processes for individuals with disabilities. If you require accommodations during the application/hiring process, you may contact Qualcomm for support. 
The company expects its employees to adhere to all applicable policies and procedures, including those related to security and protection of confidential information. Please note that Qualcomm's Careers Site is intended for individuals seeking employment directly with Qualcomm. Staffing and recruiting agencies or individuals represented by agencies are not authorized to use this site for submissions. Unsolicited applications or resumes from agencies will not be accepted. For further information about this role, you can reach out to Qualcomm Careers directly.
Posted 2 months ago
5.0 - 9.0 years
25 - 40 Lacs
Chennai
Hybrid
Key Skills: Cluster, C++, Java, SIMD, AVX, OpenCV, OpenMP, OpenCL.
Roles and Responsibilities: Understand project specifications and performance requirements and drive adherence to project timelines, ensuring program milestones are achieved on schedule. Lead and expand a team of software engineers while ensuring effective collaboration within the team. Manage time and organizational tasks efficiently to meet development objectives. Adapt swiftly to changes within the fast-paced high-performance computing industry. Identify and address bottlenecks in data movement, code execution, and job scheduling. Leverage in-depth knowledge of Linux systems and internals to support HPC infrastructure. Apply proficiency in Shell and Python scripting to support testing and automation. Ensure optimized performance of parallel and distributed computing applications.
Experience Requirement: 3 to 5 years of validated experience in HPC or a related engineering domain. Strong object-oriented programming skills in Java and/or C++. Proven proficiency in parallel programming and distributed computing techniques. Strong understanding of HPC hardware including servers, GPUs, networking, storage, BIOS, and BMC. Experience with performance profilers such as Intel VTune, NVIDIA Nsight Compute, AMD uProf, and perf is preferred. Familiarity with libraries and technologies such as SIMD, AVX, IPP, MKL, OpenCV, OpenMP, OpenCL, MPI, and CUDA is advantageous. Experience with observability tools for distributed systems like Prometheus and Grafana is a plus. Demonstrated ability to work effectively in a team, manage tasks efficiently, and adapt to evolving priorities.
Education: B.Tech M.Tech (Dual), B.E., B.Tech.
Posted 2 months ago
5.0 - 8.0 years
6 - 10 Lacs
Bengaluru
Work from Office
What We Expect: 4+ years of experience in C++ development, specializing in high-performance, low-latency systems. Deep expertise in modern C++ (C++14/17/20), multithreading, and concurrency. Strong Qt development experience for building real-time, high-performance trading UIs. Experience building ultra-fast order execution engines, market data feeds, and real-time risk management tools. Strong understanding of networking protocols (TCP/IP, UDP, FIX) and interprocess communication (IPC, shared memory, message queues). Hands-on experience with latency optimization, performance tuning, and profiling tools (perf, Valgrind, gprof, etc.). Proficiency in memory management, lock-free programming, and CPU cache optimization. A deep understanding of exchange connectivity, order matching engines, and algorithmic trading systems. A hacker mentality: you love solving problems that seem impossible.
What You Will Do: Architect, develop, and optimize ultra-low-latency C++ trading applications that handle millions of transactions per second. Build high-performance market data processing solutions with microsecond-level latencies. Develop real-time, intuitive, and high-speed trading interfaces using Qt. Work on exchange connectivity, FIX protocol integrations, and risk management systems. Profile and optimize code to achieve maximum throughput and minimal latency. Solve some of the hardest engineering problems in fintech alongside an elite team. Experiment with new technologies to stay ahead of the competition. Own your work end-to-end, from concept to deployment, pushing the limits of what's possible.
Must-Have Skills: 4+ years of experience in C++ development, specializing in high-performance, low-latency systems. Deep expertise in modern C++ (C++14/17/20), multithreading, and concurrency. Strong Qt development experience for building real-time, high-performance trading UIs. Experience building ultra-fast order execution engines, market data feeds, and real-time risk management tools. Strong understanding of networking protocols (TCP/IP, UDP, FIX) and interprocess communication (IPC, shared memory, message queues). Hands-on experience with latency optimization, performance tuning, and profiling tools (perf, Valgrind, gprof, etc.).
Nice-to-Have Skills: Experience in high-frequency trading (HFT), market-making, or ultra-low-latency environments. Knowledge of exchange matching algorithms, order routing strategies, and market microstructure. Contributions to open-source C++ and Qt projects or performance-critical software. Expertise in hardware acceleration (FPGA, SIMD, AVX, GPU computing). Familiarity with cloud-based trading infrastructure and hybrid on-prem/cloud systems.
Posted 3 months ago
5.0 - 10.0 years
80 - 150 Lacs
Hyderabad
Hybrid
Staff IP/RTL Design Engineer (AI Accelerator), Hyderabad
Founded by highly respected Silicon Valley veterans, with design centers in Santa Clara (California), Hyderabad, and Bangalore.
Principal / Staff IP/RTL Design Engineer (AI Accelerator): multiple positions, Hyderabad
A well-funded, US-based product startup is looking for RTL Design Engineers to contribute to the development of novel high-performance AI accelerators from scratch. In this role you will collaborate with cross-functional teams, including architecture, software, verification, physical design, and systems engineers, to define and implement next-generation AI architectures. We are seeking highly experienced individuals who have a passion for innovation and are excited about the opportunity to create world-class products from India. The key responsibilities for this role include, but are not limited to:
Key Responsibilities: Design and implement high-performance TPUs/MPUs and other related AI blocks using RTL. Own IP/block-level RTL from spec to GDS, including design, synthesis, and timing closure. Optimize design for power, performance, and area (PPA). Interface with physical design and DFT (Design for Test) engineers for seamless integration. Drive design reviews, write design documentation, and support post-silicon bring-up/debug.
Minimum Qualifications: B.S./M.S./Ph.D. in ECE/CS from a top engineering college with 5-15 years of related experience. Previous experience in either high-performance processor design or AI accelerator design is a plus. Clear understanding of floating-point arithmetic, vector processing, SIMD, MIMD, VLIW, and EPIC concepts. Strong grasp of digital design fundamentals, computer architecture, virtual memory, and high-speed data-path design. Proficiency in Verilog/SystemVerilog and simulation tools. Experience with EDA tools (e.g., Synopsys, Cadence) for synthesis, lint, CDC, and timing analysis.
What is in it for you? A pure-play product work environment. A chance to work with a tightly knit group of exceptional engineers who come from the top companies of the semiconductor world. Our pay comprehensively beats "ALL" semiconductor product players in the Indian market. A meritocracy-first workplace where each peer is a star. A chance to be a part of an industry-shaping product in its entirety (not bits and pieces) from the initial stages. A chance to work at a startup which already has customers and investors lined up for its product pipeline (we do not have a marketing/sales team, because we do not need one). A chance to learn from industry veterans who have already launched multiple billion-dollar semiconductor firms over the last 3 decades.
Contact: Uday Mulya Technologies muday_bhaskar@yahoo.com "Mining The Knowledge Community"
Posted 3 months ago
10.0 - 20.0 years
80 - 150 Lacs
Hyderabad
Hybrid
Principal / Staff IP/RTL Design Engineer (AI Accelerator), Hyderabad
Founded by highly respected Silicon Valley veterans, with design centers in Santa Clara (California), Hyderabad, and Bangalore.
Principal / Staff IP/RTL Design Engineer (AI Accelerator): multiple positions, Hyderabad
A well-funded, US-based product startup is looking for RTL Design Engineers to contribute to the development of novel high-performance AI accelerators from scratch. In this role you will collaborate with cross-functional teams, including architecture, software, verification, physical design, and systems engineers, to define and implement next-generation AI architectures. We are seeking highly experienced individuals who have a passion for innovation and are excited about the opportunity to create world-class products from India. The key responsibilities for this role include, but are not limited to:
Key Responsibilities: Design and implement high-performance TPUs/MPUs and other related AI blocks using RTL. Own IP/block-level RTL from spec to GDS, including design, synthesis, and timing closure. Optimize design for power, performance, and area (PPA). Interface with physical design and DFT (Design for Test) engineers for seamless integration. Drive design reviews, write design documentation, and support post-silicon bring-up/debug.
Minimum Qualifications: B.S./M.S./Ph.D. in ECE/CS from a top engineering college with 5-15 years of related experience. Previous experience in either high-performance processor design or AI accelerator design is a plus. Clear understanding of floating-point arithmetic, vector processing, SIMD, MIMD, VLIW, and EPIC concepts. Strong grasp of digital design fundamentals, computer architecture, virtual memory, and high-speed data-path design. Proficiency in Verilog/SystemVerilog and simulation tools. Experience with EDA tools (e.g., Synopsys, Cadence) for synthesis, lint, CDC, and timing analysis.
What is in it for you? A pure-play product work environment. A chance to work with a tightly knit group of exceptional engineers who come from the top companies of the semiconductor world. Our pay comprehensively beats "ALL" semiconductor product players in the Indian market. A meritocracy-first workplace where each peer is a star. A chance to be a part of an industry-shaping product in its entirety (not bits and pieces) from the initial stages. A chance to work at a startup which already has customers and investors lined up for its product pipeline (we do not have a marketing/sales team, because we do not need one). A chance to learn from industry veterans who have already launched multiple billion-dollar semiconductor firms over the last 3 decades.
Contact: Uday Mulya Technologies muday_bhaskar@yahoo.com "Mining The Knowledge Community"
Posted 3 months ago
17 - 27 years
100 - 200 Lacs
Bengaluru
Work from Office
Senior Software Technical Director / Software Technical Director, Bangalore
Founded in 2023 by industry veterans, with HQ in California, US. We are revolutionizing sustainable AI compute through intuitive software with composable silicon.
We are looking for a Software Technical Director with a strong technical foundation in systems software, Linux platforms, or machine learning compiler stacks to lead and grow a high-impact engineering team in Bangalore. You will be responsible for shaping the architecture, contributing to codebases, and managing execution across projects that sit at the intersection of systems programming, AI runtimes, and performance-critical software.
Key Responsibilities:
Technical Leadership: Lead the design and development of Linux platform software, firmware, or ML compilers and runtimes. Drive architecture decisions across compiler, runtime, or low-level platform components. Write production-grade C++ code and perform detailed code reviews. Guide performance analysis and debugging across the full stack, from firmware and drivers to user-level runtime libraries. Collaborate with architects, silicon teams, and ML researchers to build future-proof software stacks.
Team & Project Management: Mentor and coach junior and senior engineers to grow technical depth and autonomy. Own end-to-end project planning, execution, and delivery, ensuring high-quality output across sprints/releases. Facilitate strong cross-functional communication with hardware, product, and other software teams globally. Recruit and grow a top-tier engineering team in Bangalore, contributing to the hiring strategy and team culture.
Required Qualifications: Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field. 18+ years of experience in systems software development with significant time spent in C++, including architectural and hands-on roles. Proven experience in either: Linux kernel, bootloaders, firmware, or low-level platform software, or Machine Learning compilers (e.g., MLIR, TVM, Glow) or runtimes (e.g., ONNX Runtime, TensorRT, vLLM). Excellent communication skills, both written and verbal. Prior experience in project leadership or engineering management with direct reports.
Highly Desirable: Understanding of AI/ML compute workloads, particularly Large Language Models (LLMs). Familiarity with performance profiling, bottleneck analysis, and compiler-level optimizations. Exposure to AI accelerators, systolic arrays, or vector SIMD programming.
Why Join Us? Work at the forefront of AI systems software, shaping the future of ML compilers and runtimes. Collaborate with globally distributed teams in a fast-paced, innovation-driven environment. Build and lead a technically elite team from the ground up in a growth-stage organization.
Contact: Uday Mulya Technologies muday_bhaskar@yahoo.com "Mining The Knowledge Community"
Posted 4 months ago