Norstella - NLP Data Scientist - Generative AI

3 years

Posted: 1 day ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Full Time

Job Description

About Norstella

At Norstella, our mission is simple: to help our clients bring life-saving therapies to market quicker and help patients in need. Founded in 2022, but with history going back to 1939, Norstella unites best-in-class brands to help clients navigate the complexities at each step of the drug development life cycle and get the right treatments to the right patients at the right time.

Each organization (Citeline, Evaluate, MMIT, Panalgo, The Dedham Group) delivers must-have answers for critical strategic and commercial decision-making. Together, via our market-leading brands, we help our clients:

  • Citeline: accelerate the drug development cycle
  • Evaluate: bring the right drugs to market
  • MMIT: identify barriers to patient access
  • Panalgo: turn data into insight faster
  • The Dedham Group: think strategically for specialty therapeutics

By combining the efforts of each organization under Norstella, we can offer an even wider breadth of expertise, cutting-edge data solutions and expert advisory services alongside advanced technologies such as real-world data, machine learning and predictive analytics.

As one of the largest global pharma intelligence solution providers, Norstella has a footprint across the globe with teams of experts delivering world-class solutions in the USA, UK, The Netherlands, Japan, China and India.

The Role: NLP Data Scientist, AI & Life Sciences

We are seeking a skilled NLP Data Scientist with a focus on cutting-edge language models to join our AI & Life Sciences Solutions team. Your expertise in processing and understanding natural language, paired with your experience in Electronic Health Records (EHR) and clinical data analysis, will be crucial in driving our data science initiatives. You will be instrumental in developing rich, multimodal real-world datasets that will accelerate real-world data (RWD)-driven drug development within the pharmaceutical industry.

Responsibilities

  • Lead the application of advanced NLP and Large Language Models (LLMs), including state-of-the-art open-source models (e.g., Llama3, Mixtral, Gemma) and other foundational models, to extract and interpret complex, unstructured medical data from diverse sources such as EHRs, clinical notes, and laboratory reports.
  • Architect and deploy innovative and scalable NLP solutions that leverage the latest in deep learning to solve complex healthcare challenges, working closely with clinical scientists and data scientists.
  • Design and implement robust data pipelines for cleaning, preprocessing, and validating unstructured data, ensuring the accuracy and reliability of all extracted insights.
  • Develop and optimize prompt engineering strategies for fine-tuning LLMs and enhancing their performance on specialized clinical tasks.
  • Translate complex findings into clear, actionable insights for both technical and non-technical stakeholders, driving data-informed decisions across the organization.

Qualifications

  • Advanced Degree: Master's or Ph.D. in Computer Science, Data Science, Computational Linguistics, Computational Biology, Physics, or a related analytical field.
  • Clinical Data Expertise: Proven experience (3+ years) in handling and interpreting Electronic Health Records (EHRs) and clinical laboratory data.
  • Advanced NLP & Generative AI: Deep experience (3+ years) with modern NLP techniques like semantic search, knowledge graph construction, and few-shot learning.
  • LLM Proficiency: Practical, hands-on experience (2+ years) with fine-tuning, prompt engineering, and inference optimization for LLMs.
  • Technical Stack: Expert proficiency in Python and SQL, with strong experience using Hugging Face Transformers, PyTorch, and/or TensorFlow.
  • Experience in a cloud environment, specifically AWS, with large-scale data systems.
  • MLOps & Workflow Automation: Familiarity with modern MLOps practices (e.g., Git) and a proven track record of developing automated, scalable workflows.
  • Analytical Prowess: A strong analytical mindset with excellent problem-solving skills and a detail-oriented approach to data.
  • Communication: Exceptional verbal and written communication skills with the ability to articulate complex technical findings to a diverse audience.

Preferred Qualifications

  • Healthcare Compliance: Experience managing Protected Health Information (PHI) and a working knowledge of healthcare data privacy laws such as HIPAA.
  • Medical Terminologies: Familiarity with standard healthcare codes and terminologies, including ICD-10, CPT, LOINC, and SNOMED CT.
  • Advanced Retrieval Systems: Practical experience with Retrieval-Augmented Generation (RAG) systems and vector databases for managing and querying large volumes of unstructured medical documents.

Location

Remote, India. (ref: hirist.tech)
