Data Engineer

4 - 7 years

Posted: 1 week ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Contractual

Job Description

We are seeking a versatile Data Engineer with 4-7 years of experience to own end-to-end data pipelines for downstream AI agent applications. You’ll design data models and transformations, build scalable ETL/ELT workflows, and collaborate on ML model deployment, while learning fast and working in the AI agent space.


Key Responsibilities


Data Modeling & Pipeline Development

  • Define and implement logical/physical data models and schemas
  • Develop schema mapping and data dictionary artifacts for cross-system consistency
  • Build, test, and maintain robust ETL/ELT workflows using Spark (batch & streaming)
  • Automate data ingestion from diverse sources (databases, APIs, files, SharePoint/document management tools, URLs)


Gen AI Integration

  • Collaborate with AI engineers to enrich data for agentic workflows
  • Instrument pipelines to surface real-time context into LLM prompts


ML Model Deployment Support (Secondary role)

  • Package and deploy ML models (e.g., via MLflow, Docker, or Kubernetes)
  • Integrate inference endpoints into data pipelines for feature serving


Observability & Governance

  • Implement monitoring, alerting, and logging (data quality, latency, errors)
  • Apply access controls and data privacy safeguards (e.g., Unity Catalog, IAM)


CI/CD & Automation

  • Develop automated testing, versioning, and deployment (Azure DevOps, GitHub Actions, Airflow)
  • Maintain reproducible environments with infrastructure as code (Terraform, ARM templates)


Required Skills & Experience


  • 5 years in Data Engineering or a similar role, with exposure to ML modeling pipelines
  • Proficiency in Python, dlt for ETL/ELT pipelines, DuckDB for in-process analytical queries, and DVC for managing large files efficiently.
  • Solid SQL skills and experience designing and scaling relational databases.
  • Familiarity with non-relational, column-oriented databases.
  • Familiarity with a workflow orchestrator: Prefect preferred, or similar tools (e.g., Azure Data Factory).
  • Proficiency with the Azure ecosystem, including hands-on experience running Azure services in production.
  • Familiarity with RAG indexing, chunking and storage across file types for efficient retrieval.
  • Experience deploying ML artifacts using MLflow, Docker, or Kubernetes
  • Strong DevOps/Git workflows and CI/CD experience (CircleCI and Azure DevOps)

  • Bonus skillsets:

    • Prompt Engineering
    • Agent Workflows
    • Experience with Machine Learning and/or Computer Vision
    • Knowledge of data governance (GDPR, CCPA) and enterprise security patterns


    Note: We are an early-stage startup, so you are expected to wear many hats and work outside your comfort zone, with real and direct impact in production. If you think you are a good fit for this fast-paced environment, please apply. Direct messages and e-mails will not be considered.


    Why us?


    Fast-growing, revenue-generating proptech startup

    Steep learning opportunities in real-world enterprise production use cases

    Remote work with quarterly meet-ups

    Multi-market client exposure
