5.0 - 10.0 years
11 - 16 Lacs
Bengaluru
Work from Office
Team: Product (Evaluation & Safety)
Reports to: Head of Product

Role Purpose
Define, measure, and validate the safety, accuracy, and usability of our AI scribe and CDS. You bridge clinicians and engineers, owning gold standards, bias/safety evaluations, and release gates so models are clinically trustworthy.

What You'll Do

Evaluation Design & Release Gates
Design evaluation protocols for NLP/LLM, CDS, and ASR (accuracy, hallucinations, bias, robustness, latency).
Define go/no-go thresholds and own the release gate tied to clinical risk; open crisp remediation tickets.

Clinical NER & Code Mapping Evaluation (core)
Create gold standards for spans (problem/symptom/med/lab/order/allergy), med attributes, assertion, temporality, and code links (SNOMED, RxNorm, LOINC → ICD-10/CPT/local).
Run the annotation program with clinicians; achieve inter-annotator agreement of at least 0.80; maintain guidelines & QA.
Build red-team suites (abbreviation collisions, negation, rare meds, overlapping speech, code-switching).
Publish scorecards: per-type F1, linking accuracy, attribute exact-match, slice metrics (accent, specialty), and safety outcomes.

Benchmarking, Explainability & Reporting
Benchmark vs baselines; deliver stakeholder-ready reports & dashboards (confidence, alternative candidates, rationale/citations, span→code trace).
Provide audit logs suitable for compliance and clinical review.

Production Monitoring
Monitor drift, bias, and unsafe outputs; analyze clinician overrides and alert fatigue; recommend model/rule tuning.
Partner with Compliance for PDPA/HIPAA/ISO 27799 evidence and regulatory submissions.

Collaboration
Work tightly with the Lead Data & AI Engineer on fixes and regression; align evaluation with real clinical workflows.

Required Qualifications
3-5 years as a Data Scientist / AI Evaluator (healthcare preferred).
Strong statistics & experimental design; skilled in error taxonomies and root-cause analysis (RCA).
Python (pandas, NumPy, scikit-learn; basic PyTorch/HF a plus).
Clinical ontologies (SNOMED, ICD, RxNorm, LOINC) and EHR data familiarity.
Experience running clinician-in-the-loop studies; clear technical + clinical writing.
Knowledge of HIPAA/PDPA/ISO 27799.

Nice to Have
MD/PharmD/Clinical Informaticist background or close clinical research experience.
ASR/voice evaluation; diarization/WER analysis.
Power analysis, IRB/ethics, risk management frameworks.
BI/observability (Metabase/Superset, Grafana, OpenTelemetry).

Success Metrics (you own)
Quality/Safety: hallucination rate below 1% on audited notes; SOAP completeness greater than 95%.
NER: per-type F1 of at least 0.88 (overall at least 0.92); linking top-1 accuracy of at least 0.95; med attribute exact-match of at least 0.93.
Safety: drug-allergy recall greater than 99%, precision greater than 95%; zero high-severity safety escapes at gate.
Program Health: inter-annotator agreement of at least 0.80; weekly regression + monthly red-team updates; pilot time-to-finalize reduced by 30-50% vs baseline.
Posted 1 day ago
5.0 - 10.0 years
11 - 16 Lacs
Pune
Work from Office
Team: Product (Evaluation & Safety)
Reports to: Head of Product

Role Purpose
Define, measure, and validate the safety, accuracy, and usability of our AI scribe and CDS. You bridge clinicians and engineers, owning gold standards, bias/safety evaluations, and release gates so models are clinically trustworthy.

What You'll Do

Evaluation Design & Release Gates
Design evaluation protocols for NLP/LLM, CDS, and ASR (accuracy, hallucinations, bias, robustness, latency).
Define go/no-go thresholds and own the release gate tied to clinical risk; open crisp remediation tickets.

Clinical NER & Code Mapping Evaluation (core)
Create gold standards for spans (problem/symptom/med/lab/order/allergy), med attributes, assertion, temporality, and code links (SNOMED, RxNorm, LOINC → ICD-10/CPT/local).
Run the annotation program with clinicians; achieve inter-annotator agreement of at least 0.80; maintain guidelines & QA.
Build red-team suites (abbreviation collisions, negation, rare meds, overlapping speech, code-switching).
Publish scorecards: per-type F1, linking accuracy, attribute exact-match, slice metrics (accent, specialty), and safety outcomes.

Benchmarking, Explainability & Reporting
Benchmark vs baselines; deliver stakeholder-ready reports & dashboards (confidence, alternative candidates, rationale/citations, span→code trace).
Provide audit logs suitable for compliance and clinical review.

Production Monitoring
Monitor drift, bias, and unsafe outputs; analyze clinician overrides and alert fatigue; recommend model/rule tuning.
Partner with Compliance for PDPA/HIPAA/ISO 27799 evidence and regulatory submissions.

Collaboration
Work tightly with the Lead Data & AI Engineer on fixes and regression; align evaluation with real clinical workflows.

Required Qualifications
3-5 years as a Data Scientist / AI Evaluator (healthcare preferred).
Strong statistics & experimental design; skilled in error taxonomies and root-cause analysis (RCA).
Python (pandas, NumPy, scikit-learn; basic PyTorch/HF a plus).
Clinical ontologies (SNOMED, ICD, RxNorm, LOINC) and EHR data familiarity.
Experience running clinician-in-the-loop studies; clear technical + clinical writing.
Knowledge of HIPAA/PDPA/ISO 27799.

Nice to Have
MD/PharmD/Clinical Informaticist background or close clinical research experience.
ASR/voice evaluation; diarization/WER analysis.
Power analysis, IRB/ethics, risk management frameworks.
BI/observability (Metabase/Superset, Grafana, OpenTelemetry).
Posted 1 day ago