Data Scientist

0 years

Up to 17 Lacs

Posted: 10 hours ago | Platform: SimplyHired

Work Mode: On-site

Job Type: Full Time

Job Description

DS (Vector Search + GCP) - Bangalore

Bangalore

Data/Applied scientist (Search)

  • Strong in Python, with experience in Jupyter notebooks and Python packages like polars, pandas, numpy, scikit-learn, matplotlib, etc.
  • Must have: Experience with the machine learning lifecycle, including data preparation, training, evaluation, and deployment
  • Must have: Hands-on experience with GCP services for ML & data science
  • Must have: Experience with Vector Search, Hybrid Search techniques, Query preprocessing
  • Must have: Experience with embedding generation using models like BERT, Sentence Transformers, or custom models
  • Must have: Experience in embedding indexing and retrieval (e.g., Elasticsearch, FAISS, ScaNN, Annoy)
  • Must have: Experience with LLMs and use cases like RAG (Retrieval-Augmented Generation)
  • Must have: Understanding of semantic vs lexical search paradigms
  • Must have: Experience with Learning to Rank (LTR) techniques and libraries (e.g., XGBoost, LightGBM with LTR support)
  • Should be proficient in SQL and BigQuery for analytics and feature generation
  • Should have experience with Dataproc clusters for distributed data processing using Apache Spark or PySpark
  • Should have experience deploying models and services using Vertex AI, Cloud Run, or Cloud Functions
  • Should be comfortable working with BM25 ranking (via Elasticsearch or OpenSearch) and blending it with vector-based approaches
  • Good to have: Familiarity with Vertex AI Matching Engine for scalable vector retrieval
  • Good to have: Familiarity with TensorFlow Hub, Hugging Face, or other model repositories
  • Good to have: Experience with prompt engineering, context windowing, and embedding optimization for LLM-based systems
  • Should understand how to build end-to-end ML pipelines for search and ranking applications
  • Must have: Awareness of evaluation metrics for search relevance (e.g., precision@k, recall, nDCG, MRR)
  • Should have exposure to CI/CD pipelines and model versioning practices

GCP Tools Experience:

ML & AI: Vertex AI, Vertex AI Matching Engine, AutoML, AI Platform

Storage: BigQuery, Cloud Storage, Firestore

Ingestion: Pub/Sub, Cloud Functions, Cloud Run

Search: Vector Databases (e.g., Matching Engine, Qdrant on GKE), Elasticsearch/OpenSearch

Compute: Cloud Run, Cloud Functions, Vertex Pipelines, Cloud Dataproc (Spark/PySpark)

CI/CD & IaC: GitLab/GitHub Actions

Job Type: Full-time

Pay: Up to ₹1,700,000.00 per year

Work Location: In person
