3 years

0 Lacs

Posted: 2 days ago | Platform: LinkedIn

Work Mode: Remote

Job Type: Full Time

Job Description


Role Summary

AI Data Engineer

This role requires strong technical skills across Python, automation, ML tooling, and analytical reporting.


Key Responsibilities (Technical)

1. Data Acquisition & Automation

  • Build automated data collection workflows using tools such as Firecrawl, Playwright, Scrapy, or similar frameworks
  • Extract multi-format documents (PDFs, HTML, text, images)
  • Handle large-scale crawling, rate limits, error handling, and scheduling


2. Document Processing & Transformation

  • Clean and process unstructured documents
  • Apply OCR (Tesseract, PaddleOCR) for scanned files
  • Convert and structure data using PyPDF2, pymupdf, BeautifulSoup, etc.
  • Prepare data in formats such as JSON, JSONL, or CSV
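As an illustration of the output formats named above, here is a minimal sketch of writing processed documents as JSONL (one JSON object per line) using only the Python standard library. The record fields (`source`, `page`, `text`) and the helper names are assumptions made for this example, not a schema specified by the role.

```python
import json
from typing import Iterable


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def from_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Hypothetical records; the field names are illustrative only.
records = [
    {"source": "report.pdf", "page": 1, "text": "First page text."},
    {"source": "index.html", "page": None, "text": "Body text."},
]
jsonl_text = to_jsonl(records)
```

Because each line is an independent JSON object, a file in this format can be appended to and streamed line by line without loading the whole dataset into memory.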


3. Dataset Preparation

  • Segment and structure text for ML training
  • Create Q&A datasets, summaries, instruction-response pairs, and labeled text
  • Build high-quality datasets compatible with fine-tuning frameworks


4. Retrieval & Indexing Pipelines

  • Implement document chunking strategies
  • Generate embeddings and manage vector databases (Qdrant, Pinecone, Weaviate)
  • Build retrieval workflows using LangChain or LlamaIndex
  • Optimize retrieval accuracy and latency
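As a rough illustration of one possible chunking strategy, the sketch below splits text into fixed-size character windows with overlap, so that content cut at a boundary also appears at the start of the next chunk. The function name `chunk_text` and the default sizes are assumptions for this example. Real pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small sizes chosen only to make the overlap visible.
example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

A larger overlap tends to preserve more context across chunk boundaries, at the cost of storing and embedding more duplicated text.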


5. Model Training & Fine-Tuning

  • Run fine-tuning jobs using HuggingFace Transformers, LoRA/QLoRA, or similar methods
  • Monitor training performance and refine datasets
  • Package and deploy fine-tuned models


6. Data Visualization & Analytics

  • Create analytical charts, trends, and insights using Pandas, Matplotlib, Seaborn, and Plotly
  • Build simple internal dashboards or visual summaries for reports
  • Transform raw datasets into meaningful visual insights


7. Automation & Infrastructure

  • Write modular, maintainable Python scripts
  • Containerize workflows with Docker
  • Maintain version control with Git
  • Ensure reproducibility and pipeline stability


Required Technical Skills

  • Strong proficiency in Python
  • Experience with Firecrawl, Playwright, Scrapy, or similar tools
  • Strong background in document parsing, text processing, and OCR
  • Familiarity with LangChain or LlamaIndex
  • Experience with vector databases
  • Hands-on experience with HuggingFace, Transformer models, and fine-tuning
  • Ability to write clean, efficient data pipelines
  • Experience with Matplotlib, Seaborn, Plotly, or other visualization tools
  • Comfort using Docker and Git


Nice to Have

  • Experience serving models or building small APIs (FastAPI)
  • Exposure to GPU training environments
  • Background in large-scale unstructured data work
  • Ability to create lightweight dashboards (Plotly Dash, Streamlit)


Ideal Candidate

  • Comfortable owning full pipelines independently
  • Detail-oriented and analytical
  • Strong problem-solving ability
  • Can work with minimal supervision
  • Enjoys building structured systems from scratch
