7.0 - 12.0 years
8 - 14 Lacs
Bengaluru, Karnataka, India
On-site
We are seeking a skilled and experienced Platform Engineer/Architect to lead the setup, advancement, and maintenance of a robust on-premises environment for hosting open-source large language models. This role involves designing and implementing scalable, secure, and efficient infrastructure solutions that cater to the specific needs of large-scale AI models.

HOW YOU WILL CONTRIBUTE AND WHAT YOU WILL LEARN
Design and architect a scalable and secure on-premises hosting environment for large language models.
Develop and implement infrastructure automation tools for efficient management and deployment.
Ensure high availability and disaster recovery capabilities.
Optimize the hosting environment for maximum performance and efficiency.
Implement monitoring tools to track system performance and resource utilization.
Regularly update the infrastructure to incorporate the latest technological advancements.
Establish robust security protocols to protect sensitive data and model integrity.
Ensure compliance with data protection regulations and industry standards.
Conduct regular security audits and vulnerability assessments.
Work closely with AI/ML teams to understand their requirements and provide suitable infrastructure solutions.
Provide technical guidance and support to internal teams and stakeholders.
Stay abreast of emerging trends in AI infrastructure and large language model hosting.
Manage physical and virtual resources to ensure optimal allocation and utilization.
Forecast resource needs and plan for future expansion and upgrades.

KEY SKILLS AND EXPERIENCE
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field, with 7-12 years of experience.
Proven experience in infrastructure architecture, with exposure to AI/ML environments.
Experience with inference frameworks such as TGI, TEI, LoRAX, and S-LoRA.
Experience with training frameworks such as PyTorch and TensorFlow.
Proven experience with on-premises open-source models such as Llama 3 and Mistral.
Strong knowledge of networking, storage, and computing technologies.
Experience working with container orchestration tools (e.g., Kubernetes - Red Hat OS).
Proficient programming skills in Python.
Familiarity with open-source large language models and their hosting requirements.
Excellent problem-solving and analytical skills.
Strong communication and collaboration abilities.
Posted 6 days ago