On-site
Full Time
We're seeking an experienced Data Engineer to join our team and play a critical role in building
and scaling our next-generation AI-powered marketing personalization platform (V2.0). You'll
architect and implement a sophisticated multi-database infrastructure supporting real-time
personalization, vector search, graph analytics, and large-scale data processing.
This is a greenfield opportunity to design data pipelines from the ground up, working with
cutting-edge technologies including vector databases, graph databases, and large language
models (LLMs). You'll be instrumental in migrating our existing platform while building robust,
scalable data infrastructure that powers AI agents serving thousands of marketing campaigns.
● 5+ years of data engineering experience with production systems
● Expert-level SQL and database design skills
● Strong Python programming (async/await, type hints, testing)
● Experience with at least 3 different database technologies (SQL, NoSQL, Vector, Graph)
● Proven track record building high-scale data pipelines (>1M records/day)
● Deep understanding of data modeling (dimensional, normalized, denormalized)
● Experience with cloud data warehouses (BigQuery, Redshift, or Snowflake)
● Strong knowledge of data quality, validation, and governance
● Excellent debugging and optimization skills
● Experience with vector databases (Milvus, Pinecone, Weaviate, Qdrant)
● Experience with graph databases (Neo4j, ArangoDB, Neptune)
● Knowledge of embedding models and semantic search
● Experience with ML data pipelines (feature stores, model training data)
● Understanding of A/B testing and experimental design
● Experience with real-time streaming (Kafka, Pub/Sub, Kinesis)
● Knowledge of LLMs and conversational AI systems
● Experience with data migration projects (especially large-scale)
● Background in marketing technology or customer data platforms
● Experience with PyTorch Geometric or graph neural networks
● Knowledge of marketing analytics (attribution, segmentation, personalization)
● Familiarity with LangChain, LangGraph, or agent frameworks
● Experience with cost optimization in cloud environments
● Contributions to open-source data engineering projects
● Experience with data compliance (GDPR, CCPA)
● Design and implement a multi-database architecture (MongoDB, Redis, Milvus, Neo4j, BigQuery)
● Build scalable data pipelines for real-time conversation processing and personalization
● Architect ETL/ELT workflows for data migration from legacy systems
● Implement data partitioning, sharding, and optimization strategies for high-throughput systems
● Design and optimize Milvus vector collections for semantic search (1024-dim embeddings)
● Build graph schemas in Neo4j for customer journey mapping and persona relationships
● Implement HNSW indexing strategies and similarity search optimization
● Create hybrid search systems combining vector, full-text, and graph queries
● Monitor and tune database performance (query latency, throughput, resource utilization)
● Build data collection pipelines for LLM fine-tuning (conversation logs, tool executions)
● Create feature stores for GNN training (customer interactions, engagement signals)
● Implement data versioning and lineage tracking for ML experiments
● Design A/B testing data infrastructure with CUPED variance reduction
● Build real-time feature computation pipelines for contextual bandits
● Design BigQuery schemas for marketing analytics and performance tracking
● Create materialized views and aggregation pipelines for real-time dashboards
● Implement data quality monitoring and anomaly detection
● Build observability infrastructure (Prometheus metrics, Grafana dashboards)
● Develop cost optimization strategies for cloud data warehousing
OWOW