Data Engineer : GenAI and Vector Platforms

3 - 7 years

8 - 18 Lacs

Posted: 5 days ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Work Location : Chennai

Work Mode : Hybrid

Notice: Immediate to 30 days

Experience : 3 - 7 years

If you are interested, please send your resume to manjula.balaraman@orioninc.com.

Role Summary

The GenAI Data Engineer builds and maintains the data pipelines, vector stores, and integrations that power our Generative AI applications. You will ingest and transform structured and unstructured data, implement embeddings and vector indexes, and ensure high data quality and governance for LLM workflows.

Key Responsibilities

Data Ingestion, Cleaning & Transformation

  • Ingest, clean, and transform structured and unstructured data for model training, fine-tuning, and retrieval.
  • Develop ETL/ELT pipelines for text, image, and audio data.
  • Implement document chunking, metadata enrichment, and normalization for RAG and semantic search.

Vector Embeddings & Databases

  • Implement and manage vector embeddings for documents, passages, and entities.
  • Work with vector databases (Pinecone, FAISS, Chroma, Qdrant, Azure AI Search, etc.).
  • Optimize index structures and queries for low-latency similarity search and hybrid search (keyword + vector).

Data Source Integrations

  • Integrate data from Dataverse, Azure Blob/Data Lake, SQL/NoSQL databases, APIs, SharePoint, and other SaaS sources into GenAI systems.
  • Build reusable connectors and ingestion frameworks with proper security and monitoring.
  • Collaborate with GenAI Engineers to design data interfaces that are easy to consume (semantic models, APIs, curated datasets).
  • Implement logging, dashboards, and alerting for data workflows and vector indices.
  • Contribute to CI/CD and Infrastructure as Code for data pipelines.

Preferred candidate profile

  • Bachelor's / Master's degree in Computer Science, Information Systems, Data Engineering, or a similar field.
  • 3+ years of experience in Data Engineering or related roles.
  • Strong proficiency in SQL (complex queries, optimization) and Python (or Scala) for data processing.
  • Experience with at least one major cloud platform (Azure / AWS / GCP); Azure preferred.
  • Experience building ETL/ELT with tools like Databricks / Spark, Azure Data Factory, Fabric Pipelines, Airflow, or similar.
  • Familiarity with handling semi-structured and unstructured data (JSON, Parquet, PDFs, images, audio) and with the basic concepts of RAG and vector search.

ORION SYSTEMS

Information Technology

Tech City
