AI Systems Engineer – GPU/ROCm/CUDA | ML Frameworks Optimization

2 years

30 Lacs

Posted: 1 week ago | Platform: Glassdoor

Work Mode: On-site

Job Type: Full Time

Job Description

Job Title: AI Systems Engineer – GPU/ROCm/CUDA | ML Frameworks Optimization

  • Location: Hyderabad
  • Experience: 3-6 years (Mid-Senior)

We are looking for a passionate, experienced AI Systems Engineer to join our team, work on next-generation machine learning technologies, and optimize performance across AMD GPU accelerators. The role involves low-level GPU programming, custom ML kernel development, and work with state-of-the-art inference engines.

Key Responsibilities:

  • Develop and optimize custom Deep Learning GPU kernels using ROCm/CUDA or shader languages
  • Support and enhance ML model deployment on Linux platforms
  • Optimize the performance of ROCm drivers and inference engines for AI/ML workloads
  • Collaborate closely with internal hardware/software teams to support next-gen GPU accelerators
  • Profile, debug, and improve performance of GPU kernels and AI model pipelines
  • Contribute to designing and implementing new AI technologies and workflows

Required Skills & Qualifications:

  • BS/MS in Computer Science, Electrical Engineering, or equivalent
  • Strong programming skills in C/C++ and Python
  • Solid experience working with Linux CLI, bash scripting, or PowerShell
  • Hands-on experience with Python ML libraries such as PyTorch, Transformers
  • Knowledge of writing high-performance ML kernels using Triton, JAX, or similar
  • Experience with debugging tools such as gdb and valgrind, and profiling tools such as nsys and rocprof
  • Familiarity with AI inference runtimes such as vLLM, Ollama, llama.cpp, or SGLang
  • Understanding of GPU and PC architecture, x86/x64 instruction sets
  • Experience developing with ROCm, CUDA, or shader programming

Nice to Have:

  • Knowledge of x86 Assembly
  • Contributions to open-source ML/DL performance libraries
  • Exposure to compiler optimization techniques for GPU code

What We Offer:

  • Work on cutting-edge GPU technologies and ML systems
  • Exposure to performance-critical AI workloads
  • Collaborative and research-oriented environment
  • Competitive compensation and career growth opportunities

Apply: If you are looking for a job change, send your updated resume to vagdevi@semi-leaf.com

Job Type: Full-time

Pay: Up to ₹3,000,000.00 per year

Experience:

  • Deep Learning GPU kernels using ROCm/CUDA: 2 years (Required)
  • Programming skills in C/C++ and Python: 1 year (Required)
  • Python ML libraries such as PyTorch, Transformers: 1 year (Required)
  • Developing with ROCm, CUDA: 1 year (Required)

Work Location: In person

Speak with the employer
+91 7483459258
