General Summary:
The Qualcomm Cloud AI team is developing hardware and software for Machine Learning solutions spanning the data center, edge, infrastructure, and automotive markets. We are seeking ambitious, bright, and innovative engineers with experience in machine learning framework development. Job activities span the whole product life cycle from early design to commercial deployment. The environment is fast-paced and requires daily cross-functional interaction, so good communication, planning, and execution skills are a must.
We are seeking a highly skilled and motivated Language Model Engineer to join our team. The primary role of the engineer will be to train Large Language Models (LLMs) from scratch and fine-tune existing LLMs on various datasets using state-of-the-art techniques.
- Model Architecture Optimization: Optimize the latest LLM and GenAI model architectures for NPUs, which involves reimplementing the basic building blocks of models for NPUs.
- Model Training and Fine-tuning: Fine-tune pre-trained models on specific tasks or datasets to improve performance. Implement state-of-the-art LLM training techniques such as Reinforcement Learning from Human Feedback (RLHF), ZeRO (Zero Redundancy Optimizer), Speculative Sampling, and other speculative techniques.
- Data Management: Handle large datasets effectively. Ensure data quality and integrity. Implement data cleaning and preprocessing techniques. Hands-on experience with EDA is a plus.
- Model Evaluation: Evaluate model performance using appropriate metrics. Understand the trade-offs between different evaluation metrics.
- LLM Metrics: Sound understanding of LLM metrics such as MMLU, ROUGE, BLEU, and perplexity.
- AWQ: Understanding of quantization is a plus. Knowledge of QAT is also a plus.
- Research and Development: Stay up to date with the latest research in NLP and LLMs. Implement state-of-the-art techniques and contribute to research efforts.
- Infrastructure Development: Develop new optimization techniques to minimize ONNX memory footprint and reduce export time.
- Collaboration: Work closely with other teams to understand requirements and implement solutions.
Required Skills and Experience:
- Optimization: Knowledge of optimization techniques for training large models.
- Neural Architecture Search (NAS): Experience with NAS techniques for optimizing model architectures is a plus.
- Hands-on experience with CUDA, cuDNN, and Triton-lang is a plus.
Minimum Qualifications:
- Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 2+ years of Software Engineering or related work experience.
- OR
- Master's degree in Engineering, Information Systems, Computer Science, or related field and 1+ year of Software Engineering or related work experience.
- OR
- PhD in Engineering, Information Systems, Computer Science, or related field.
- 2+ years of academic or work experience with a programming language such as C, C++, Java, Python, etc.