Data Scientist – NLP & Conversational AI - 3+ Years
Gurugram, Haryana, India | 3 years | Not disclosed | On-site | Full Time

Company Description
NovaIA offers an AI-powered voice assistant tool designed to support human agents in real time. Particularly tailored for real estate agencies, the assistant can make calls, follow up with leads, filter prospects, and schedule appointments. Key features include real-time agent support and appointment management automation. The assistant listens in on conversations, providing live guidance, data, or suggestions, and seamlessly handles follow-ups and meeting setups through voice interactions.

We're looking for a versatile, hands-on Data Scientist who can bridge the gap between traditional machine learning and conversational AI. You'll work on predictive modeling tasks (e.g., user behavior, conversion forecasting) and also contribute to intelligent voicebots that respond in real time. This role offers a balance of experimentation, productionization, and product collaboration, ideal for someone who thrives at the intersection of models and applications.

Job Title: Data Scientist – Predictive Modeling & Conversational AI
Location: Gurugram (On Site)
Experience: 3+ years
Working Hours: Full time

Key Responsibilities
Design, build, and evaluate machine learning models for classification, regression, clustering, and ranking use cases
Analyze large datasets to extract insights, train predictive models, and improve decision-making
Lead and support analytics use cases such as behavior prediction, engagement scoring, and feature engineering
Work on NLP/NLU tasks including intent recognition, entity extraction, summarization, and semantic similarity
Contribute to conversational AI logic such as dynamic routing, fallback response logic, and session personalization
Design evaluation frameworks to assess real-time model performance in voice-based interfaces
Collaborate with engineering teams to deploy models in production environments and monitor model health

Core Skills
ML/DS toolkits: scikit-learn, XGBoost, LightGBM, CatBoost, PyCaret
Data wrangling: pandas, NumPy, Polars, SQL (PostgreSQL, BigQuery)
NLP frameworks: HuggingFace Transformers, spaCy, NLTK, fastText
MLOps understanding: model versioning, performance monitoring, feature store design
Working with structured and unstructured data (voice/text/logs)
Comfortable writing modular, reusable code in Python or notebooks with best practices

Preferred / Bonus Skills
LLM integration: prompt engineering, fine-tuning open-source models (e.g., Mistral, LLaMA)
Time series forecasting (Prophet, ARIMA, or ML-based)
Recommender systems or ranking algorithms (collaborative filtering, hybrid models)
Familiarity with RAG pipelines, embeddings, vector search, and hybrid retrieval
Experience using experiment tracking tools (MLflow, Weights & Biases, DVC)
Exposure to speech/audio data analytics

General Qualities We Value
Comfort working in fast-paced, ambiguous environments
Startup or early product-building experience with cross-functional teams
Strong problem-solving ability and interest in building user-facing intelligence
Demonstrated portfolio of work (e.g., GitHub, notebooks, blog posts, Kaggle)
Curiosity, autonomy, and eagerness to contribute across the stack when needed

Note: If a question is not applicable, write NA.

Data Engineer - 3+ Years
Gurugram, Haryana, India | 3 years | Not disclosed | On-site | Full Time

Company Description
NovaIA offers an AI-powered voice assistant tool designed to support human agents in real time. Particularly tailored for real estate agencies, the assistant can make calls, follow up with leads, filter prospects, and schedule appointments. Key features include real-time agent support and appointment management automation. The assistant listens in on conversations, providing live guidance, data, or suggestions, and seamlessly handles follow-ups and meeting setups through voice interactions.

We're hiring a Data Engineer to design, implement, and scale robust data pipelines that power our real-time voice-based AI systems. This role involves working with large volumes of structured and unstructured data, enabling low-latency processing across speech-to-text (STT), natural language processing (NLP), and text-to-speech (TTS) modules. You'll collaborate closely with machine learning engineers, product teams, and DevOps to ensure data availability, reliability, and performance in production environments.

Job Title: Data Engineer – Real-Time & ML Pipelines
Location: Gurugram (On Site)
Experience: 3+ years
Working Hours: Full time

Key Responsibilities
Design and implement data pipelines for real-time STT input, NLP processing, and TTS output
Build scalable ingestion systems for audio logs, model artifacts, and interaction metadata
Manage message queues and streaming data for efficient voice call routing and response
Optimize caching layers and prefetching logic for pre-recorded response fragments
Create ETL/ELT workflows for downstream analytics, monitoring, and feedback loops
Develop and manage session memory stores for dynamic context handling
Ensure data versioning, schema consistency, and lineage tracking
Collaborate on token usage optimization and infrastructure cost reporting

Core Skills
Data pipeline orchestration: Kubernetes
Stream processing: Kafka, Apache Flink, Redis Streams, RabbitMQ
Programming: Python, SQL; familiarity with Java/Scala is a plus
Cloud-native architecture: AWS (Kinesis, S3, Lambda), GCP (Pub/Sub, BigQuery), or Azure equivalents
Storage systems: PostgreSQL, DynamoDB, Parquet, Snowflake, Delta Lake
Data quality, schema validation, and observability tools
Experience working with audio data (transcription logs, metadata tagging, media storage)
Version control & CI/CD for data (DVC, Great Expectations, Git)

Preferred / Bonus Skills
Familiarity with ML model pipelines and experiment tracking
Real-time ETL optimization and low-latency microservices
Knowledge of vector databases (e.g., FAISS, Chroma, Pinecone)
Experience with WebRTC, SIP, or real-time audio systems
Data governance and compliance (PII masking, audit trails)

General Qualities We Value
Comfort working in fast-paced, ambiguous environments
Startup or zero-to-one product experience
A strong portfolio, GitHub contributions, or project demos
Willingness to collaborate closely with founders and cross-functional teams
Curiosity, creativity, and ability to learn quickly

Note: If a question is not applicable, write NA.
