Principal AI/ML Engineer

4 - 8 years

4 - 8 Lacs

Posted: 1 month ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a Principal Machine Learning Engineer with deep expertise in Natural Language Processing (NLP), Large Language Models (LLMs), and advanced deep learning techniques. This role demands a visionary leader who can architect, scale, and deliver large-scale machine learning systems with a focus on business outcomes, high reliability, and cutting-edge innovation. You will lead the strategic development of ML systems and architect production-grade AI solutions, optimizing for both performance and business impact.

Key Responsibilities:

End-to-End LLM System Architecture: Design and lead the development of advanced LLM-based systems, using models such as LLaMA, Mistral, and Azure OpenAI. Architect these solutions to support business objectives at scale, focusing on optimizing performance, resource utilization, and reliability across diverse environments.

Scaling AI Solutions: Lead efforts to scale machine learning solutions across enterprise-wide applications. Establish methodologies to manage increasingly complex data pipelines, model lifecycle management, and deployable architectures, ensuring fault tolerance and minimal downtime.

High-Impact ML Deployment: Drive large-scale deployment of LLMs and other ML models in cloud and hybrid environments (AWS, GCP, Azure). Build highly scalable, secure, and reliable AI infrastructure, ensuring systems meet stringent SLAs and production quality standards.

Deep Learning System Optimization: Architect and optimize deep learning systems, enhancing performance using techniques such as distributed training, model pruning, quantization, and hyperparameter tuning. Drive the development of tools and frameworks that improve deployment speed and system efficiency.

Business Alignment: Lead the alignment of ML solutions with business strategies, focusing on maximizing ROI from AI investments. Translate business problems into technical solutions, ensuring that AI systems deliver measurable value.

AI Innovation Leadership: Spearhead the exploration of new ML models, techniques, and platforms, driving the next generation of AI applications. Lead advanced research in multimodal LLMs, self-supervised learning, and model interpretability to create cutting-edge business solutions.

Cross-Functional Leadership & Collaboration: Mentor and guide AI engineers and machine learning scientists, fostering a culture of collaboration and technical excellence. Build strong relationships with product teams, data engineering, and operations to deliver seamless AI solutions across the enterprise.

System Reliability and Optimization: Ensure high system availability by implementing monitoring, logging, and alerting strategies. Architect scalable and highly available systems capable of processing large volumes of data in real time.

ML Governance and Compliance: Oversee the governance of AI systems, ensuring ethical deployment and compliance with data privacy regulations. Implement processes to ensure model auditability, explainability, and bias mitigation.

Technical Expertise:

Advanced NLP & LLMs: Deep expertise in building, fine-tuning, and deploying large language models such as the LLaMA and Mistral families, including advanced knowledge of techniques like reinforcement learning, model distillation, and transfer learning.

Distributed Systems & Parallelization: Expert knowledge of distributed systems and parallelized model training, enabling large-scale deployment and training of models across multi-cloud infrastructures.

Deep Learning Frameworks: Advanced experience with PyTorch, TensorFlow, and other ML libraries. Strong software engineering and system design skills, including proficiency with FastAPI, test-driven development (TDD), and CI/CD pipelines.

Cloud and Infrastructure Expertise: Experience architecting fault-tolerant, distributed machine learning systems across multi-cloud platforms. Expertise in Kubernetes, Docker, and other containerization technologies for large-scale deployments.

Must-Have Skills:

Excellent communication skills, with the ability to articulate complex technical details to non-technical stakeholders.

Excellent team player, able to guide and nurture other members of the team.

In-depth understanding of architecting scalable and reliable solutions.

Willing to work from the office.

Nice to Haves:

Multimodal and Vision Models: Experience in deploying multimodal AI solutions that integrate text, images, video, and audio into cohesive applications for cross-domain problems.

AI in Production: Proven track record of managing large-scale ML systems in production, optimizing for uptime, performance, and cost-efficiency across global infrastructures.

Experience: 8+ years with a Bachelor's degree, 6+ years with a Master's degree, or 4+ years with a PhD, preferably in Computer Science or Data Engineering.

Smartsense Consulting Solutions

Consulting / Business Services

San Francisco
