Data Engineer

3 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Roles and Responsibilities

  • Build and maintain scalable, fault-tolerant data pipelines to support GenAI and analytics workloads across OCR, documents, and case data.
  • Manage ingestion and transformation of semi-structured legal documents (PDF, Word, Excel) into structured formats.
  • Enable RAG workflows by processing data into chunked, vectorized formats with metadata.
  • Handle large-scale ingestion from multiple sources into cloud-native data lakes (S3, GCS), data warehouses (BigQuery, Snowflake), and PostgreSQL.
  • Automate pipelines using orchestration tools like Airflow/Prefect, including retry logic, alerting, and metadata tracking.
  • Collaborate with ML Engineers to ensure data availability, traceability, and performance for inference and training pipelines.
  • Implement data validation and testing frameworks using Great Expectations or dbt.
  • Integrate OCR pipelines and post-processing outputs for embedding and document search.
  • Design infrastructure for streaming vs. batch data needs and optimize for cost, latency, and reliability.


Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or equivalent.
  • 3+ years of experience in building distributed data pipelines and managing multi-source ingestion.
  • Proficiency with Python, SQL, and data tools such as Pandas and PySpark.
  • Experience with data orchestration tools (Airflow, Prefect) and file formats such as Parquet, Avro, and JSON.
  • Hands-on experience with cloud storage/data warehouse systems (S3, GCS, BigQuery, Redshift).
  • Understanding of GenAI and vector database ingestion pipelines is a strong plus.
  • Bonus: Experience with OCR tools (Tesseract, Google Document AI), PDF parsing libraries (PyMuPDF), and API-based document processors.
