3.0 - 7.0 years
0 Lacs
Hyderabad, Telangana
On-site
As an experienced LLM Engineer, you will lead the deployment and optimization of self-hosted Large Language Models (LLMs). Your expertise in deploying, fine-tuning, and optimizing open-source models such as LLaMA, Mistral, and Falcon on on-premise GPU servers will be central to this role.

Your key responsibilities will include deploying and optimizing self-hosted LLMs for low-latency, high-efficiency inference; setting up GPU-accelerated servers for AI workloads using CUDA, TensorRT, and vLLM; implementing model quantization for efficient memory usage; developing APIs for model inference (a minimal serving sketch follows this listing); automating model deployment; fine-tuning and training models; monitoring system performance; and working with cross-functional teams to integrate LLMs into applications.

To excel in this role, you should have at least 3 years of experience in AI/ML model deployment; strong knowledge of Python, PyTorch, and TensorFlow; hands-on experience with LLM inference frameworks; experience with NVIDIA GPU acceleration; proficiency in Linux, Docker, and Kubernetes; knowledge of security best practices for AI deployments; and experience with distributed computing and API development. Preferred qualifications include experience with on-premise AI infrastructure, familiarity with vector databases for RAG applications, and contributions to open-source AI/ML projects.

This is an exciting opportunity for a skilled professional to make a significant impact in the field of Large Language Models and AI deployment.
Posted 2 weeks ago
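The listing above describes hosting a quantized open-source model behind an inference API. Below is a minimal sketch of that stack, assuming vLLM and FastAPI are installed on a CUDA-capable server; the model checkpoint, quantization scheme, and sampling defaults are illustrative placeholders rather than requirements from the posting, and a production deployment would more likely use vLLM's OpenAI-compatible server or async engine than the blocking offline API shown here.

```python
# serve_llm.py - sketch: a quantized open-source model hosted with vLLM and
# exposed through a FastAPI inference endpoint. The checkpoint name and
# sampling values are placeholders.
from fastapi import FastAPI
from pydantic import BaseModel
from vllm import LLM, SamplingParams

# Load an AWQ-quantized checkpoint once at startup to limit GPU memory use.
llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # placeholder HF checkpoint
    quantization="awq",
    gpu_memory_utilization=0.90,
)

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 256
    temperature: float = 0.7

@app.post("/generate")
def generate(req: GenerateRequest):
    # Blocking call for simplicity; a production service would use the async
    # engine or vLLM's built-in OpenAI-compatible server for higher throughput.
    params = SamplingParams(temperature=req.temperature, max_tokens=req.max_tokens)
    outputs = llm.generate([req.prompt], params)
    return {"text": outputs[0].outputs[0].text}
```

Run with `uvicorn serve_llm:app --host 0.0.0.0 --port 8000` and POST a JSON body containing a `prompt`; switching checkpoints or quantization schemes (e.g. GPTQ) only changes the `LLM(...)` arguments.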
1.0 - 5.0 years
0 Lacs
Hyderabad, Telangana
On-site
You will work full-time at Soothsayer Analytics in Hyderabad as a Generative AI/LLM Engineer. Your primary responsibility will be to design, develop, and deploy generative AI models using technologies such as Azure OpenAI GPT-4, GPT-4 Vision, or GPT-4 Omni. You should have a strong background in building and deploying AI models, with a focus on Retrieval-Augmented Generation (RAG) and working with vector databases. Experience in fine-tuning large language models (LLMs) is beneficial but not mandatory; you are expected to have a general understanding of training or fine-tuning deep learning models and to quickly learn and implement advanced techniques.

Your key responsibilities will include designing, developing, and deploying generative AI models using GPT-4 variants; implementing and optimizing RAG techniques (a minimal RAG sketch follows this listing); building and managing AI services using Python frameworks such as LangChain or LlamaIndex; and developing APIs with FastAPI or Quart for efficient integration. You will focus on scalability, performance, and optimization of AI solutions across cloud environments, particularly Azure and AWS. Working with vector databases is mandatory; experience with graph databases is optional. You will also use Cosmos DB and SQL for data storage and management, apply MLOps or LLMOps practices, manage Azure Pipelines for continuous integration and deployment, and continuously research and adopt the latest advancements in AI technologies.

To qualify for this role, you should have a Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field, with at least 1 year of experience in Generative AI/LLM technologies and 5+ years of experience in related fields. Proficiency in Python and experience with frameworks such as LangChain, LlamaIndex, FastAPI, or Quart are required. Expertise in RAG and experience with vector databases are mandatory, as is knowledge of Cosmos DB and SQL. Experience with fine-tuning LLMs and graph databases is beneficial but not mandatory. You should have proven experience in MLOps, LLMOps, or DevOps; a strong understanding of CI/CD processes, automation, and pipeline management; and familiarity with containers, Docker, or Kubernetes. Experience with cloud platforms, particularly Azure or AWS, and cloud-native AI services is desirable, along with strong problem-solving abilities and a proactive approach to learning new AI trends and best practices.
Posted 1 month ago
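The listing above centres on RAG with LangChain and a vector database in front of Azure OpenAI. The sketch below shows the basic retrieve-then-generate flow, assuming the langchain-openai, langchain-community, and faiss-cpu packages are installed and Azure OpenAI credentials are supplied via environment variables (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY); the deployment names, API version, and sample documents are placeholders, and FAISS stands in for whichever vector database is used in practice.

```python
# rag_sketch.py - minimal retrieve-then-generate flow: embed documents, fetch the
# passages most similar to the question, and ground a GPT-4 answer in them.
# Deployment names and the API version below are placeholders.
from langchain_community.vectorstores import FAISS
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings

# 1. Index a small document set into an in-memory FAISS vector store.
docs = [
    "RAG augments the prompt with passages retrieved from a knowledge base.",
    "Vector databases store embeddings and support similarity search.",
]
embeddings = AzureOpenAIEmbeddings(
    azure_deployment="text-embedding-3-small", api_version="2024-02-01"
)
store = FAISS.from_texts(docs, embeddings)

# 2. Retrieve the chunks closest to the user question.
question = "How does RAG reduce hallucinations?"
context = "\n".join(d.page_content for d in store.similarity_search(question, k=2))

# 3. Ask the chat model to answer using only the retrieved context.
llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-02-01", temperature=0)
answer = llm.invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```

Swapping FAISS for Pinecone, Azure AI Search, or another store only changes step 1; the retrieval-and-prompting pattern stays the same.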
6.0 - 9.0 years
10 - 20 Lacs
Hyderabad
Work from Office
Note: candidates should be on an immediate to 30-day notice period and available for face-to-face and video interviews. Profiles are being collected for an LLM Engineer weekend drive; the mandatory skills the delivery team is looking for are:
- 5+ years of relevant experience in Python, AI, and machine learning
- 2+ years of relevant experience in Generative AI / LLMs
- Hands-on experience with at least one end-to-end GenAI project
- Worked with LLMs such as GPT, Gemini, Claude, LLaMA, etc.
- LLM skills: RAG, LangChain, Transformers, TensorFlow, PyTorch, spaCy
- Experience with REST API integration (e.g. FastAPI, Flask)
- Proficient in prompt types: zero-shot, few-shot, chain-of-thought (a prompt-style sketch follows this listing)
- Knowledge of model training, fine-tuning, and deployment workflows
- LLMOps: at least one cloud (Azure/AWS), GitHub, Docker/Kubernetes, CI/CD pipelines
- Comfortable with embedding models and vector databases (e.g. FAISS, Pinecone)
Posted 1 month ago
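The prompt-type requirement in the listing above (zero-shot, few-shot, chain-of-thought) comes down to how the input is constructed. Below is a small illustrative sketch of the three patterns as plain strings that could be sent to any chat-completion API; the classification task and example reviews are invented for illustration only.

```python
# prompt_styles.py - sketch of the three prompting patterns named in the listing,
# built as plain strings so they can be sent to any chat-completion endpoint.

TASK = "Classify the sentiment of this review as positive or negative."
REVIEW = "The battery dies within two hours."

def zero_shot(review: str) -> str:
    # No examples: rely entirely on the instruction.
    return f"{TASK}\n\nReview: {review}\nSentiment:"

def few_shot(review: str) -> str:
    # A handful of labelled examples precede the real input.
    examples = (
        "Review: Sound quality is superb.\nSentiment: positive\n"
        "Review: Stopped working after a week.\nSentiment: negative\n"
    )
    return f"{TASK}\n\n{examples}Review: {review}\nSentiment:"

def chain_of_thought(review: str) -> str:
    # Ask the model to reason step by step before committing to a label.
    return (
        f"{TASK}\n\nReview: {review}\n"
        "Think step by step about the wording, then give the final label."
    )

if __name__ == "__main__":
    for build in (zero_shot, few_shot, chain_of_thought):
        print(f"--- {build.__name__} ---\n{build(REVIEW)}\n")
```

Few-shot prompting trades extra tokens for more reliable output formatting, while chain-of-thought asks the model to expose intermediate reasoning before the final answer.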