Senior Data Scientist - NLP

5 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Position Overview

Senior-level data scientist role focused on building and deploying production NLP systems on bare metal infrastructure. This position requires a research-oriented mindset with the ability to build first-in-class products by translating cutting-edge research into innovative production solutions.

Required Qualifications

Experience

  • Minimum 5 years in data science/ML engineering roles
  • Minimum 3 years' tenure at the most recent organization, in a relevant data science/ML role
  • Proven track record of deploying ML models to production
  • Experience managing bare metal server infrastructure

Technical Skills

SQL

  • Advanced query optimization and performance tuning
  • Complex joins, window functions, CTEs
  • Experience with Snowflake, BigQuery, or Redshift
  • Database performance analysis and indexing strategies

NLP Technology Stack

  • Transformer architectures
  • RAG pipeline implementation
  • LangChain, LlamaIndex, or similar frameworks
  • Vector databases: Pinecone, Weaviate, Chroma, FAISS
  • Model fine-tuning: LoRA, QLoRA
  • Embedding models and semantic search
  • Prompt engineering techniques

Programming & ML Frameworks

  • Python (advanced level, production-grade code)
  • PyTorch or TensorFlow
  • HuggingFace Transformers
  • scikit-learn, XGBoost, LightGBM

Infrastructure & DevOps

  • Linux system administration
  • Bare metal server management
  • GPU cluster setup and configuration
  • CUDA/cuDNN installation and driver management
  • Multi-GPU distributed training setup
  • Docker and Kubernetes
  • CI/CD pipelines for ML workflows

Production Deployment

  • Model serving: TensorFlow Serving, TorchServe, FastAPI, BentoML
  • MLOps: MLflow, Weights & Biases, Kubeflow
  • Model monitoring and A/B testing
  • Latency optimization and inference scaling

Cloud & Data Engineering

  • AWS, GCP, or Azure
  • Apache Spark, Airflow/Prefect
  • Understanding of hybrid on-premises and cloud architectures

Key Responsibilities

Technical Execution

  • Design and implement production NLP solutions using state-of-the-art language models
  • Build and optimize complex SQL data pipelines processing millions of records
  • Deploy ML models on bare metal GPU infrastructure
  • Configure and maintain GPU clusters for training and inference
  • Implement MLOps practices: versioning, monitoring, automated retraining
  • Optimize model inference for latency and throughput
  • Troubleshoot CUDA, driver, and hardware-level issues
  • Set up distributed training across physical servers
  • Research and prototype emerging ML techniques

Leadership & Strategy

  • Lead end-to-end ML projects from problem definition to production deployment
  • Drive innovation by researching and implementing first-in-class product features
  • Coordinate cross-functional teams including data engineers, domain experts, and full-stack developers to deliver integrated solutions
  • Define technical architecture and design decisions for ML systems
  • Drive adoption of ML best practices and engineering standards across teams
  • Collaborate with product and engineering leadership on ML roadmap and priorities
  • Present technical findings and recommendations to executive stakeholders
  • Own critical ML infrastructure decisions and vendor evaluations
  • Champion innovation by evaluating and integrating cutting-edge ML research
  • Lead cross-functional initiatives between data science, engineering, and product teams
  • Facilitate effective collaboration between technical and non-technical stakeholders
  • Translate the latest research papers into production-ready solutions

Team Development

  • Mentor and coach junior data scientists and ML engineers
  • Conduct code reviews and provide technical guidance
  • Develop training materials and knowledge-sharing sessions
  • Participate in hiring and building the data science team
  • Establish coding standards and documentation practices

Required Competencies

  • Research-oriented mindset with the ability to innovate and build first-in-class products
  • Ability to work independently with minimal supervision and drive projects autonomously
  • Strong analytical and quantitative aptitude
  • Excellent problem-solving and logical reasoning skills
  • Proven ability to collaborate with cross-functional teams (data engineers, domain experts, full-stack developers)
  • Strong communication skills to translate technical concepts for non-technical stakeholders
  • Willingness to explore uncharted territory and experiment with novel approaches
  • Self-motivated with strong ownership mentality
  • Strong understanding of hardware constraints and optimization
  • Ability to work independently with bare metal infrastructure
  • Experience with both cloud and on-premises deployments
  • Proven ability to take projects from research to production
  • Track record of staying current with ML research and innovations
  • Strong debugging and troubleshooting skills

Evaluation Process

  • SQL optimization and Python coding assessment
  • ML system design interview
  • Technical deep-dive on NLP and production ML
  • Take-home project: end-to-end ML problem

Preferred Qualifications

  • Experience with pre-training multi-modal models (vision-language, audio-text, etc.)
  • Hands-on experience with large-scale distributed training frameworks (DeepSpeed, FSDP, Megatron-LM)
  • Contributions to open source ML projects
  • Technical blog or active GitHub portfolio
  • Experience with model quantization and efficient inference
  • Publications or conference presentations
  • Knowledge of multi-modal architectures (CLIP, Flamingo, GPT-4V style models)

Meril

Medical Devices

Ahmedabad
