Posted: 5 days ago
On-site
Full Time
Deep Learning Model Conversion: Convert and adapt deep learning network architectures (e.g., from PyTorch) for deployment on various embedded platforms.
Quantization-Aware Training (QAT): Implement and fine-tune Quantization-Aware Training techniques to optimize model performance and reduce memory footprint while maintaining accuracy.
Model Optimization: Apply model optimization techniques, including pruning, quantization (post-training and QAT), and neural architecture search, to meet latency, power, and memory targets.
Runtime Integration: Integrate optimized deep learning models with embedded runtime environments and hardware accelerators.
Performance Profiling & Tuning: Analyze and profile model performance on target embedded hardware, identifying bottlenecks and implementing solutions for real-time inference.
Number Format Conversion: Work with various number formats (e.g., FP32, FP16, INT8) and develop strategies for efficient conversion and utilization on embedded processors.
Toolchain Development & Utilization: Use and help develop custom conversion tools and optimization scripts to streamline the deployment pipeline.
Experience: 5+ years of experience in embedded software development with a strong focus on AI/Machine Learning deployment.
Programming Skills: Proficient in Python for AI development and scripting.
Deep Learning Frameworks: Hands-on experience with deep learning frameworks such as PyTorch. Experience with TensorFlow/Keras is a plus.
Embedded Systems: Strong understanding of embedded system architectures, microcontrollers, DSPs, and/or FPGAs.
Optimization Techniques: Proven experience with deep learning model optimization techniques (quantization, pruning, knowledge distillation).
Number Formats: Familiarity with different number formats (e.g., FP32, FP16, INT8) and their implications for embedded inference.
Conversion Tools: Experience with model conversion tools (e.g., ONNX, OpenVINO, TensorRT, TVM).
Problem-Solving: Excellent analytical and problem-solving skills, with a strong ability to debug and optimize complex systems.
Experience with C/C++ for embedded development.
Familiarity with hardware acceleration (e.g., NPUs, GPUs on edge devices).
 
Valeo
Chennai, Tamil Nadu, India
Salary: Not disclosed