Data & AI Engineer (Data Pipelines & RAG)

4 - 7 years

Posted: 4 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking a versatile Data & AI Engineer with 4-7 years of experience to build, deploy, and maintain end-to-end data pipelines for downstream Gen AI applications. You will design data models and transformations and build scalable ETL/ELT workflows, while learning fast and working in the AI agent space.


Key Responsibilities


Data Modeling & Pipeline Development


  • Automate data ingestion from diverse sources (databases, APIs, files, SharePoint/document management tools, URLs). Most files are expected to be unstructured documents in different file formats, containing tables, charts, process flows, schedules, construction layouts/drawings, etc.
  • Own the chunking strategy, embedding, and indexing of all unstructured and structured data for efficient retrieval by downstream RAG/agent systems
  • Build, test, and maintain robust ETL/ELT workflows using Spark (batch & streaming)
  • Define and implement logical/physical data models and schemas. Develop schema mapping and data dictionary artifacts for cross-system consistency


Gen AI Integration

  • Instrument data pipelines to surface real-time context into LLM prompts
  • Implement prompt engineering and RAG for varied workflows within the RE/Construction industry vertical


Observability & Governance

  • Implement monitoring, alerting, and logging (data quality, latency, errors)
  • Apply access controls and data privacy safeguards (e.g., Unity Catalog, IAM)


CI/CD & Automation

  • Develop automated testing, versioning, and deployment (Azure DevOps, GitHub Actions, Prefect/Airflow)
  • Maintain reproducible environments with infrastructure as code (Terraform, ARM templates)


Required Skills & Experience

  • 5 years in Data Engineering or a similar role, with at least 12-24 months of exposure to building pipelines for unstructured data extraction, including document processing with OCR, cloud-native solutions, and chunking, indexing, etc. for downstream consumption by RAG/Gen AI applications.
  • Proficiency in Python, dlt for ETL/ELT pipelines, DuckDB or equivalent tools for in-process analytical queries, and DVC for managing large files efficiently.
  • Solid SQL skills and experience designing and scaling relational databases. Familiarity with non-relational, column-based databases is preferred.
  • Familiarity with Prefect (preferred) or other orchestration tools (e.g., Azure Data Factory).
  • Proficiency with the Azure ecosystem. Candidates should have worked with Azure services in production.
  • Familiarity with RAG indexing, chunking, and storage across file types for efficient retrieval.
  • Strong DevOps/Git workflows and CI/CD experience (CircleCI / Azure DevOps)
  • Experience deploying ML artifacts using MLflow, Docker, or Kubernetes is good to have.

  • Bonus skillsets:


    • Experience with computer-vision-based extraction or in building ML models for production

    • Knowledge of agentic AI system design - memory, tools, context, orchestration
    • Knowledge of data governance, privacy laws (GDPR) and enterprise security patterns


We are an early-stage startup, so you are expected to wear many hats and work outside your comfort zone, with real and direct impact in production. If you think you are a good fit for this fast-paced environment, please apply.
