Senior Data Engineer – Knowledge Graph & AI Platform

Experience: 5 years

Posted: 6 days ago | Platform: LinkedIn

Work Mode: Remote

Job Type: Full Time

Job Description

Location: Remote / Hybrid (India)
Employment Type: Full-Time
Reporting To: Platform Architect

Role Overview

The Senior Data Engineer will build and maintain the core data infrastructure for an enterprise AI platform. This role focuses on designing scalable data pipelines, developing knowledge graphs, and preparing structured and unstructured data for AI and LLM-based applications.

Roles & Responsibilities

Data Pipeline Development
- Design and build scalable data ingestion pipelines from enterprise systems (ERP, documentation tools, version control, and project management tools)
- Develop connectors for structured, semi-structured, and unstructured data
- Implement incremental data loads, change data capture (CDC), and real-time sync
- Ensure data quality through validation, deduplication, and lineage tracking

Knowledge Graph Engineering
- Design ontologies and graph schemas for complex enterprise relationships
- Implement entity resolution and relationship inference across data sources
- Build APIs and query interfaces for graph traversal
- Optimize graph storage and query performance for large-scale usage

Enterprise Data Integration
- Extract and model enterprise metadata such as business rules and data dictionaries
- Parse and semantically index documents and code artifacts
- Build integrations with enterprise APIs and internal platforms

AI & LLM Data Infrastructure
- Prepare structured and contextual data for LLM consumption
- Design embedding strategies and manage vector databases for semantic search (a minimal indexing sketch follows this list)
- Build memory and context management systems for stateful AI applications
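To illustrate the AI & LLM data-infrastructure responsibilities above, here is a minimal sketch that chunks a document, embeds the chunks, and upserts them into a vector database for semantic search. It is not part of the posting: Qdrant as the store, the all-MiniLM-L6-v2 embedding model, the collection name, and the chunk sizes are illustrative assumptions.

```python
# Minimal sketch only: chunk a document, embed the chunks, and upsert them into a
# vector database for semantic search. Qdrant, the embedding model, the collection
# name, and the chunk sizes are illustrative assumptions, not details from the posting.
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

COLLECTION = "enterprise_docs"                      # hypothetical collection name
model = SentenceTransformer("all-MiniLM-L6-v2")     # 384-dimensional embeddings


def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Fixed-size character chunking with overlap; a production pipeline would
    split on document structure and attach richer metadata per chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)] or [""]


def index_document(client: QdrantClient, doc_id: str, text: str) -> None:
    """Embed each chunk and upsert it with payload metadata for later filtering."""
    chunks = chunk(text)
    vectors = model.encode(chunks)
    points = [
        PointStruct(
            id=str(uuid.uuid4()),
            vector=vector.tolist(),
            payload={"doc_id": doc_id, "chunk_index": i, "text": chunk_text},
        )
        for i, (chunk_text, vector) in enumerate(zip(chunks, vectors))
    ]
    client.upsert(collection_name=COLLECTION, points=points)


if __name__ == "__main__":
    client = QdrantClient(":memory:")               # local in-memory instance for testing
    client.create_collection(
        collection_name=COLLECTION,
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )
    index_document(client, "design-doc-001", "Example document text goes here.")
    hits = client.search(
        collection_name=COLLECTION,
        query_vector=model.encode("how are incremental loads handled?").tolist(),
        limit=3,
    )
    for hit in hits:
        print(hit.score, hit.payload["doc_id"], hit.payload["chunk_index"])
```

The payload fields (doc_id, chunk_index) are the kind of metadata a hybrid-search setup would later filter on alongside vector similarity.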

Required Skills

Core Requirements
- 5+ years of data engineering experience with production-grade pipelines
- Strong Python skills (clean, testable, maintainable code)
- MongoDB expertise (schema design, aggregation pipelines, indexing, performance tuning)
- Vector database experience (Qdrant, Pinecone, Weaviate, pgvector)
- Document processing experience (chunking, metadata extraction, PDFs/Word/HTML; LangChain or similar)

- Strong SQL skills (complex queries, joins, window functions, optimization)

- ETL/ELT at scale (incremental loads, CDC, idempotent pipelines); a minimal incremental-load sketch follows this section
- Pipeline orchestration tools (Airflow, Dagster, Prefect, or similar)

Good to Have / Strong Plus
- Experience building production RAG pipelines
- Deep understanding of embedding models and dimensionality
- Graph databases (Neo4j) and Cypher query expertise
- LLM application development using LangChain or LangGraph
- Streaming systems (Kafka, Flink) for real-time pipelines
- Hybrid search (vector + keyword/metadata filtering)
- Apache Spark for large-scale transformations

What We Offer
- Work on cutting-edge AI and knowledge graph technologies
- Build foundational infrastructure for an enterprise AI platform
- Competitive compensation with equity options
- Flexible remote/hybrid work setup
- Learning budget and conference support
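As a companion to the ETL/ELT requirement above, here is a minimal sketch of an idempotent incremental load into MongoDB driven by an updated_at watermark. The database, collection, and field names and the extract_batch() connector are hypothetical placeholders, not details taken from the posting.

```python
# Minimal sketch only: an idempotent incremental load into MongoDB keyed on an
# updated_at watermark. Database, collection, and field names and the
# extract_batch() connector are hypothetical placeholders.
from datetime import datetime, timezone

from pymongo import MongoClient, UpdateOne

client = MongoClient("mongodb://localhost:27017", tz_aware=True)  # assumed local instance
db = client["platform"]
target = db["erp_orders"]                           # hypothetical target collection
state = db["pipeline_state"]                        # per-pipeline checkpoints


def load_incremental(extract_batch) -> None:
    """extract_batch(since) is a hypothetical source connector yielding records
    with stable `source_id` and timezone-aware `updated_at` fields."""
    checkpoint = state.find_one({"_id": "erp_orders"})
    since = checkpoint["watermark"] if checkpoint else datetime(1970, 1, 1, tzinfo=timezone.utc)

    ops, max_seen = [], since
    for record in extract_batch(since):
        # Upsert on the natural key so re-running the same batch is a no-op.
        ops.append(UpdateOne({"source_id": record["source_id"]},
                             {"$set": record}, upsert=True))
        max_seen = max(max_seen, record["updated_at"])

    if ops:
        target.bulk_write(ops, ordered=False)
        # Advance the watermark only after the batch has landed successfully.
        state.update_one({"_id": "erp_orders"},
                         {"$set": {"watermark": max_seen}}, upsert=True)
```

Re-running the same batch is safe because records are upserted on their natural key and the watermark only advances after a successful bulk write.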
