Vision-Language Models and Generative AI (GenAI)

8 - 10 years

0 Lacs

Posted: 1 week ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Roles & Responsibilities:

Conduct deep research in:

  • Vision-Language and Multimodal AI for perception and semantic grounding
  • Cross-modal representation learning for real-world sensor fusion (camera, lidar, radar, text)
  • Multimodal generative models for scene prediction, intent inference, or simulation
  • Efficient model architectures for edge deployment in automotive and factory systems
  • Evaluation methods for explainability, alignment, and safety of VLMs in mission-critical applications
  • Initiate new research directions and drive AI research programs for autonomous driving, ADAS, and Industry 4.0 applications.
  • Create new collaborations within and outside of Bosch in relevant domains.
  • Contribute to Bosch's internal knowledge base, open research assets, and patent portfolio.
  • Lead internal research clusters or thematic initiatives across autonomous systems or industrial AI.
  • Mentor and guide research associates, interns, and young scientists.

Qualifications

Educational qualification:

Ph.D. in Computer Science / Machine Learning / AI / Computer Vision or equivalent

Experience:

8+ years (post-Ph.D.) of AI experience related to vision and language modalities, with strong exposure to and hands-on research in GenAI, VLMs, multimodal AI, or applied AI research.

Mandatory/Required Skills:

Deep expertise in:

  • Vision-Language Models (CLIP, Flamingo, Kosmos, BLIP, GIT) and multimodal transformers
  • Open- and closed-source LLMs (e.g., LLaMA, GPT, Claude, Gemini) with visual grounding extensions
  • Contrastive learning, cross-modal fusion, and structured generative outputs (e.g., scene graphs)
  • PyTorch, HuggingFace, OpenCLIP, and deep learning stack for computer vision
  • Evaluation on ADAS/mobility benchmarks (e.g., nuScenes, BDD100K) and industrial datasets
  • Strong track record of publications in relevant AI/ML/vision venues
  • Demonstrated capability to lead independent research programs
  • Familiarity with multi-agent architectures, RLHF, and goal-conditioned VLMs for autonomous agents

Preferred Skills:

Hands-on experience with:

  • Perception stacks for ADAS, SLAM, or autonomous robots
  • Vision pipeline tools (MMDetection, Detectron2, YOLOv8) and video understanding models
  • Semantic segmentation, depth estimation, 3D vision, and temporal models
  • Industrial datasets and tasks: defect detection, visual inspection, operator assistance
  • Lightweight or compressed VLMs for embedded hardware (e.g., in-vehicle ECUs or factory edge devices)
  • Knowledge of reinforcement learning or planning in an embodied AI context
  • Strong academic or industry research collaborations
  • Understanding of Bosch domains and workflows in mobility and manufacturing

Bosch Global Software Technologies

Software Development

Bangalore, Karnataka
