Techfabric Digital Solutions India

2 Job openings at Techfabric Digital Solutions India
Senior Data Scientist - Information Retrieval & Generative AI | Hyderabad | 5-10 years | INR 25.0-40.0 Lacs P.A. | Remote | Full Time

Senior Data Scientist - Information Retrieval & Generative AI

Location: Remote (India) | Type: Full-time | Experience: 5-7+ years

Transform How the World Discovers Information

Join our AI revolution in transforming how billions discover and interact with information. As a Senior Data Scientist, you'll architect next-generation search and retrieval systems that power intelligent experiences across our platform, directly impacting millions of users worldwide.

The Challenge

Design and deploy state-of-the-art RAG architectures processing petabyte-scale datasets. Build hybrid dense/sparse retrieval pipelines that serve millions of daily queries with sub-second latency. Your models will directly impact product strategy and drive measurable business outcomes for our global user base.

What You'll Do

Strategic Leadership
• Drive end-to-end ML product development from research to production deployment
• Collaborate with engineering and product teams to translate business requirements into scalable data solutions
• Mentor junior data scientists and establish best practices for model development
• Lead breakthrough research in information retrieval and generative AI

Technical Execution
• Design and optimize transformer-based architectures for information retrieval and generation
• Implement advanced chunking strategies for semantic search and RAG applications
• Build and maintain real-time ML pipelines processing millions of documents
• Develop production-ready models with proper monitoring, versioning, and deployment strategies

Innovation & Research
• Research and prototype cutting-edge AI techniques in search, retrieval, and natural language processing
• Design large-scale experiments and A/B tests to validate model performance and business impact
• Stay current with the latest developments in GenAI and contribute to open-source communities
• Present findings to executive leadership and influence strategic product decisions

What We're Looking For

Essential Skills
• Advanced Python programming with expertise in pandas, scikit-learn, TensorFlow/PyTorch
• SQL & database management for complex query optimization and data pipeline design
• Machine learning & deep learning with a track record of shipping ML products to production
• Statistics & probability, including advanced statistical modeling and hypothesis testing
• 5-7+ years of data science experience with 2+ years in senior roles

Specialized Expertise
• Information retrieval systems - search algorithms, ranking, and relevance optimization
• Generative AI & LLMs - prompt engineering, fine-tuning, and deployment at scale
• Content chunking strategies - document processing and semantic segmentation for RAG systems
• Vector databases - hands-on experience with Pinecone, Weaviate, FAISS, or OpenSearch
• Transformer models - deep understanding of BERT, GPT, T5 architectures

Advanced Technical Skills
• RAG (Retrieval-Augmented Generation) implementation and optimization
• Named Entity Recognition (NER) at enterprise scale
• Cloud platforms (AWS, Azure, GCP) for ML deployment
• MLOps tools and practices (Docker, Kubernetes, model registries)
• A/B testing and experimental design methodology

Education & Experience
• Master's degree in Computer Science, Statistics, Mathematics, or a related quantitative field
• Demonstrated experience with production ML systems serving millions of users
• Strong publication record or open-source contributions (preferred)

Why Join Us?
• Cutting-edge technology: work with the latest in AI/ML, from transformer architectures to vector databases
• Global impact: your work will be used by millions of users across different continents
• Career growth: clear advancement paths with mentorship and leadership opportunities
• Innovation freedom: 20% time for personal research projects and experimentation

Our Interview Process
1. Recruiter Screen (30 min) - background, motivation, and culture fit
2. Technical Screen (60 min) - live coding in Python/SQL, ML fundamentals
3. Technical Deep Dive (90 min) - advanced ML/AI questions, system design

Ready to Shape the Future of AI?

If you're passionate about pushing the boundaries of information retrieval and generative AI, we want to hear from you. Join a team where your expertise will drive innovation and create meaningful impact on a global scale. Apply now and help us build the next generation of intelligent search and discovery systems.

Send your resume to Mr. Praveen at Praveen.kunta@techfabric.com

We are an equal opportunity employer committed to diversity and inclusion. We welcome applications from all qualified candidates regardless of race, gender, age, religion, sexual orientation, or disability status.

Application Requirements:
- Resume/CV highlighting relevant ML/AI experience
- Cover letter explaining your interest in information retrieval and generative AI
- Links to relevant projects, publications, or GitHub repositories (preferred)
- Portfolio demonstrating production ML systems or research contributions
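For candidates wondering what the content-chunking work above looks like in practice, here is a minimal sketch of one common baseline: fixed-size chunking with overlap, often used before moving to semantic segmentation. The function name and all sizes are illustrative examples, not part of our production stack.

```python
# Illustrative sketch: split a document into overlapping fixed-size chunks
# so each chunk fits an embedding model's input window while the overlap
# preserves context across chunk boundaries. Sizes are arbitrary examples.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of up to `chunk_size` characters, with
    `overlap` characters shared between consecutive chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk already reaches the end of the text
    return chunks

doc = "word " * 100  # 500-character toy document
chunks = chunk_text(doc, chunk_size=200, overlap=50)
```

Production RAG systems typically refine this baseline by splitting on sentence or section boundaries so chunks stay semantically coherent.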

Data Engineer - Databricks Specialist | Hyderabad | 5-10 years | INR 25.0-35.0 Lacs P.A. | Remote | Full Time

Role & Responsibilities

Data Pipeline Development & Management
• Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark
• Process large volumes of structured and unstructured data
• Develop and maintain data workflows using Databricks Workflows, Apache Airflow, or similar orchestration tools
• Optimize data processing jobs for performance, cost efficiency, and reliability
• Implement incremental data processing patterns and change data capture (CDC) mechanisms

Databricks Platform Engineering
• Build and maintain Delta Lake tables and implement the medallion architecture (bronze, silver, gold layers)
• Develop streaming data pipelines using Structured Streaming and Delta Live Tables
• Manage and optimize Databricks clusters for various workloads
• Implement Unity Catalog for data governance, security, and metadata management
• Configure and maintain Databricks workspace environments across development, staging, and production

Data Architecture & Modeling
• Design and implement data models optimized for analytical workloads
• Create and maintain data warehouses and data lakes on cloud platforms (Azure, AWS, or GCP)
• Implement data partitioning, indexing, and caching strategies for optimal query performance
• Collaborate with data architects to establish best practices for data storage and retrieval patterns

Performance Optimization & Monitoring
• Monitor and troubleshoot data pipeline performance issues
• Optimize Spark jobs through proper partitioning, caching, and broadcast strategies
• Implement data quality checks and automated testing frameworks
• Manage cost optimization through efficient resource utilization and cluster management
• Establish monitoring and alerting systems for data pipeline health and performance

Collaboration & Best Practices
• Work closely with data scientists, analysts, and business stakeholders to understand data requirements
• Implement version control using Git and follow CI/CD best practices for code deployment
• Document data pipelines, data flows, and technical specifications
• Mentor junior engineers on Databricks and data engineering best practices
• Participate in code reviews and contribute to establishing team standards

Preferred Candidate Profile
• 5+ years of experience in data engineering with hands-on Databricks experience
• Strong proficiency in Python and/or Scala for Spark application development
• Expert-level knowledge of Apache Spark, including Spark SQL, DataFrames, and RDDs
• Deep understanding of Delta Lake and Lakehouse architecture concepts
• Experience with SQL and database optimization techniques
• Solid understanding of distributed computing concepts and data processing frameworks
• Proficiency with cloud platforms (Azure, AWS, or GCP) and their data services
• Experience with data orchestration tools (Databricks Workflows, Apache Airflow, Azure Data Factory)
• Knowledge of data modeling concepts for both OLTP and OLAP systems
• Familiarity with data governance principles and tools like Unity Catalog
• Understanding of streaming data processing and real-time analytics
• Experience with version control systems (Git) and CI/CD pipelines

Preferred Qualifications
• Databricks Certified Data Engineer certification (Associate or Professional)
• Experience with machine learning pipelines and MLOps on Databricks
• Knowledge of data visualization tools (Power BI, Tableau, Looker)
• Experience with infrastructure as code (Terraform, CloudFormation)
• Familiarity with containerization technologies (Docker, Kubernetes)
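For candidates curious about the change-data-capture (CDC) patterns this role involves, here is a minimal plain-Python sketch of the upsert semantics that Delta Lake's MERGE INTO provides: applying a batch of insert/update/delete change records to a target table keyed by id. The record schema and function name are made-up examples, not a real Databricks API.

```python
# Illustrative sketch of CDC-style incremental processing: merge a batch of
# change records into a target table (id -> row). Each record carries an
# 'op' field: 'I' = insert, 'U' = update, 'D' = delete. In Delta Lake this
# is expressed declaratively with MERGE INTO; this stand-in just shows the
# upsert/delete semantics.

def apply_cdc(target: dict, changes: list[dict]) -> dict:
    """Apply a batch of CDC records to `target`, mutating it in place."""
    for rec in changes:
        key = rec["id"]
        if rec["op"] == "D":
            target.pop(key, None)  # delete the row if it exists
        else:
            # insert or update: upsert the row, dropping the 'op' marker
            target[key] = {k: v for k, v in rec.items() if k != "op"}
    return target

table = {1: {"id": 1, "amount": 10}, 2: {"id": 2, "amount": 20}}
batch = [
    {"op": "U", "id": 1, "amount": 15},  # update row 1
    {"op": "D", "id": 2},                # delete row 2
    {"op": "I", "id": 3, "amount": 30},  # insert row 3
]
table = apply_cdc(table, batch)
```

On Databricks the same batch would typically be applied with a single `MERGE INTO target USING changes ON target.id = changes.id ...` statement, with matched/not-matched clauses covering the update, delete, and insert cases.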