1.0 - 5.0 years
0 Lacs
Coimbatore, All India
On-site
As a Vision-Language Model Developer, you will develop, fine-tune, and evaluate vision-language models such as CLIP, Flamingo, BLIP, GPT-4V, and LLaVA. You will design and build multimodal pipelines that integrate image/video input with natural language understanding or generation. Working with large-scale image-text datasets such as LAION, COCO, and Visual Genome for training and validation will be a key part of your role. You will also implement zero-shot/few-shot multimodal inference, retrieval, captioning, VQA (Visual Question Answering), and grounding. Collaboration with product teams, ML engineers, and data scientists to deliver...
Posted 2 days ago
1.0 - 5.0 years
0 Lacs
Coimbatore, Tamil Nadu
On-site
As a Vision-Language Model Developer, your role involves developing, fine-tuning, and evaluating vision-language models such as CLIP, Flamingo, BLIP, GPT-4V, and LLaVA. You will design and build multimodal pipelines that integrate image/video input with natural language understanding or generation. Working with large-scale image-text datasets such as LAION, COCO, and Visual Genome for training and validation will be part of your responsibilities. You will also implement zero-shot/few-shot multimodal inference, retrieval, captioning, VQA (Visual Question Answering), and grounding. Collaboration with product teams, ML engineers, and data scientists is essential to deliver real-world multimodal appl...
Posted 2 months ago
1.0 - 5.0 years
0 Lacs
Coimbatore, Tamil Nadu
On-site
You will be responsible for developing, fine-tuning, and evaluating vision-language models such as CLIP, Flamingo, BLIP, GPT-4V, and LLaVA, among others. Your role will involve designing and constructing multimodal pipelines that fuse image/video inputs with natural language comprehension or generation. Working with extensive image-text datasets such as LAION, COCO, and Visual Genome for training and validation will be a key part of your job. You will also implement zero-shot/few-shot multimodal inference, retrieval, captioning, VQA (Visual Question Answering), and grounding. It is essential to collaborate closely with product teams, ML engineers, and data scientists to deliver practical multimo...
Posted 3 months ago