Data Scientist (Clinical) and Data Scientist (Bioinformatics)

2 years

0 Lacs

Posted: 2 months ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Experience Range

CTC:


Role 1: Data Scientist (Clinical)

We are seeking a highly skilled Data Scientist with strong expertise in clinical data standards and regulatory submission formats. The ideal candidate will have hands-on experience with the OMOP and SDTM data models and be able to convert raw, heterogeneous clinical datasets into the standardized formats required for regulatory submissions to agencies such as the FDA and PMDA.

Key Responsibilities
  • Convert and standardize clinical datasets originating in OMOP format into SDTM format according to CDISC regulatory submission guidelines, including data harmonization. (Must-have skill for shortlisting.)
  • Design the ETL (Extract, Transform, Load) process for clinical datasets. (Must-have skill for shortlisting.)
  • Implement the ETL process using appropriate technical tools and languages. (Must-have skill for shortlisting.)
  • Knowledge of OHDSI recommendations on ETL implementation; experience with SQL builders, Python, and R is advantageous. (Must-have skill for shortlisting.)
  • Organize and clean complex clinical trial data, including labs, adverse events, vital signs, and other trial-related datasets, into consistent, review-ready tables. (Must-have skill for shortlisting.)
  • Working experience with WhiteRabbit. (Preferred.)
  • Collaborate closely with clinical operations, statistical programming teams, and regulatory stakeholders to ensure data quality and compliance. 
  • Develop and maintain automated pipelines and workflows for efficient data transformation and validation.
  • Provide expert input on clinical data standards and advise on best practices for data curation and submission readiness.
  • Support regulatory submission processes by ensuring timely and accurate delivery of SDTM datasets.
  • Act as a bridge between raw real-world data sources and regulatory requirements, translating technical challenges into actionable solutions.
  • Participate actively in all aspects of quality control to ensure the accuracy and integrity of data throughout the ETL and conversion process.
  • Soft skills: fluency in English, a problem-solving mindset, attention to detail, good communication and interpersonal skills, and the ability to learn quickly.
  • Additional (not must-have): previous work experience in statistical programming, genomic data, NLP applications on healthcare datasets, or data visualization.



Role 2: Data Scientist (Bioinformatics)

We are seeking a highly skilled Data Scientist with strong expertise in clinical data standards, genomic data, and variant data for regulatory submission formats. The ideal candidate will have hands-on experience with variant databases and the OMOP and SDTM data models, and be able to convert raw, heterogeneous clinical datasets into the standardized formats required for regulatory submissions to agencies such as the FDA and PMDA.

Key Responsibilities:

  • Curate, harmonize, and maintain high-quality variant annotation datasets from public and proprietary sources (e.g., ClinVar, ClinGen, HGMD, CADD, RefSeq, REVEL, gnomAD, dbSNP, COSMIC). (Must have.)
  • Knowledge of translational research, precision genomics, and RWE studies. (Must have.)
  • Link variant data to OMOP and SDTM formats. (Must have.)
  • Build ETL pipelines to map VCF annotation files to OMOP genomic tables. (Must have.)
  • Experience working with OMOP CDM v6+. (Preferred.)
  • Develop and implement pipelines for harmonizing variant annotations across different formats, nomenclatures, and reference genomes.
  • Standardize variant representations using HGVS, VCF, and other relevant formats.
  • Collaborate with the technical operations team to deliver curated data into customer systems.
  • Perform quality control and validation of variant annotations to ensure data integrity.
  • Stay current with developments in variant annotation standards and tools, and understand the differences between annotation database versions.
  • Document curation processes and contribute to the documentation.

Required Qualifications:

M.S. or Ph.D. in Bioinformatics, Computational Biology, Genomics, or a related field.

  • Strong experience with variant annotation tools and databases.
  • Proficiency in ETL/workflow tools (Airflow, Nextflow) and scripting languages such as Python or R. (Must have.)
  • Experience with a version control system (Git).
  • Familiarity with genomic data formats (VCF, BED, GFF) and reference genome builds (GRCh37/38).
  • Experience with data harmonization and integration across heterogeneous sources.
  • Knowledge of ontologies and controlled vocabularies (e.g., ClinVar terms, Sequence Ontology).
  • Excellent problem-solving skills and attention to detail.

In addition, the following would be an advantage:

  • Experience with SQL (Postgres)
  • Familiarity with Kubernetes architecture
  • Familiarity with cloud platforms (AWS)

Trigent Software Inc

IT Services and IT Consulting

Southborough MA