AI Evaluation & Clinical Data Scientist

5 - 10 years

11 - 16 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Team: Product (Evaluation & Safety)

Reports to: Head of Product

Role Purpose

Define, measure, and validate the safety, accuracy, and usability of our AI scribe and clinical decision support (CDS). You bridge clinicians and engineers, owning gold standards, bias/safety evaluations, and release gates so that models are clinically trustworthy.

What You'll Do

Evaluation Design & Release Gates

  1. Design evaluation protocols for NLP/LLM, CDS, and ASR (accuracy, hallucinations, bias, robustness, latency).
  1. Define go/no-go thresholds and own the release gate tied to clinical risk; open crisp remediation tickets.

Clinical NER & Code Mapping Evaluation (core)

  1. Create gold standards for spans (problem/symptom/med/lab/order/allergy), med attributes, assertion, temporality, and code links (SNOMED, RxNorm, LOINC, ICD-10/CPT/local).
  1. Run the annotation program with clinicians; achieve inter-annotator agreement ≥ 0.80; maintain guidelines & QA.
  1. Build red-team suites (abbrev collisions, negation, rare meds, overlapping speech, code-switching).
  1. Publish scorecards: per-type F1, linking accuracy, attribute exact-match, slice metrics (accent, specialty), and safety outcomes.
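
As an illustration of the per-type F1 line on such a scorecard, here is a minimal sketch. It assumes exact-match span scoring and a hypothetical span representation of `(note_id, start, end, entity_type)` tuples; neither is specified in this posting, and a real scorecard might use partial-match or token-level scoring instead.

```python
def per_type_f1(gold, pred):
    """Exact-match precision/recall/F1 per entity type.

    gold, pred: sets of (note_id, start, end, entity_type) tuples.
    The tuple layout is a hypothetical convention for this sketch.
    """
    scores = {}
    for entity_type in sorted({span[3] for span in gold | pred}):
        g = {s for s in gold if s[3] == entity_type}
        p = {s for s in pred if s[3] == entity_type}
        tp = len(g & p)  # spans matching on note, offsets, and type
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[entity_type] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy data: one "med" span is predicted with the wrong end offset.
gold = {("n1", 0, 8, "problem"), ("n1", 20, 29, "med"), ("n2", 5, 12, "med")}
pred = {("n1", 0, 8, "problem"), ("n1", 20, 29, "med"), ("n2", 5, 10, "med")}
scores = per_type_f1(gold, pred)
```

Slice metrics (accent, specialty) could reuse the same function by filtering both sets to the notes in each slice before scoring.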

Benchmarking, Explainability & Reporting

  1. Benchmark vs baselines; deliver stakeholder-ready reports & dashboards (confidence, alt candidates, rationale/citations, span-to-code trace).
  1. Provide audit logs suitable for compliance and clinical review.

Production Monitoring

  1. Monitor drift, bias, and unsafe outputs; analyze clinician overrides and alert fatigue; recommend model/rule tuning.
  1. Partner with Compliance for PDPA/HIPAA/ISO 27799 evidence and regulatory submissions.

Collaboration

  1. Work tightly with the Lead Data & AI Engineer on fixes and regression; align evaluation with real clinical workflows.

Required Qualifications

  1. 3-5 years as a Data Scientist / AI Evaluator (healthcare preferred).
  1. Strong statistics & experimental design; skilled in error taxonomies and RCA.
  1. Python (pandas, NumPy, scikit-learn; basic PyTorch/HF a plus).
  1. Clinical ontologies (SNOMED, ICD, RxNorm, LOINC) and EHR data familiarity.
  1. Experience running clinician-in-the-loop studies; clear technical + clinical writing.
  1. Knowledge of HIPAA/PDPA/ISO 27799.

Nice to Have

  1. MD/PharmD/Clinical Informaticist background or close clinical research experience.
  1. ASR/voice evaluation; diarization/WER analysis.
  1. Power analysis, IRB/ethics, risk management frameworks.
  1. BI/observability (Metabase/Superset, Grafana, OpenTelemetry).

Success Metrics (you own)

  1. Quality/Safety: hallucination rate less than 1% on audited notes; SOAP completeness greater than 95%.
    1. NER per-type F1 ≥ 0.88 (overall ≥ 0.92); linking top-1 ≥ 0.95; med attribute exact-match ≥ 0.93.
    1. Safety: drug-allergy recall greater than 99%, precision greater than 95%; zero high-severity safety escapes at gate.
  2. Program Health: inter-annotator agreement ≥ 0.80; weekly regression + monthly red-team updates; pilot time-to-finalize reduced 30-50% vs baseline.
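
To make the release-gate idea concrete, the quality and safety targets above could be encoded as a simple go/no-go check. This is only a sketch: the metric names are hypothetical labels, and the comparison directions are assumptions (accuracy-style figures treated as floors, the hallucination rate treated as a ceiling).

```python
import operator

# (hypothetical metric name, comparator, threshold taken from the targets above)
GATE = [
    ("hallucination_rate", operator.lt, 0.01),      # assumed ceiling
    ("soap_completeness", operator.gt, 0.95),
    ("ner_f1_min_per_type", operator.ge, 0.88),
    ("ner_f1_overall", operator.ge, 0.92),
    ("linking_top1", operator.ge, 0.95),
    ("med_attr_exact_match", operator.ge, 0.93),
    ("drug_allergy_recall", operator.gt, 0.99),
    ("drug_allergy_precision", operator.gt, 0.95),
    ("high_severity_escapes", operator.eq, 0),
]


def evaluate_gate(metrics):
    """Return (passed, failures); a missing metric counts as a failure."""
    failures = []
    for name, compare, threshold in GATE:
        value = metrics.get(name)
        if value is None or not compare(value, threshold):
            failures.append((name, value, threshold))
    return (not failures), failures


# Made-up candidate release that misses only the linking target.
candidate = {
    "hallucination_rate": 0.006,
    "soap_completeness": 0.97,
    "ner_f1_min_per_type": 0.89,
    "ner_f1_overall": 0.93,
    "linking_top1": 0.94,
    "med_attr_exact_match": 0.94,
    "drug_allergy_recall": 0.995,
    "drug_allergy_precision": 0.96,
    "high_severity_escapes": 0,
}
passed, failures = evaluate_gate(candidate)
```

Each entry in `failures` carries the metric, the observed value, and the threshold, which is roughly the information a remediation ticket would need.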
