How Nabla uses Whisper
Martin Raison
CTO
Sam Humeau
Machine Learning Engineer
In recent days, there has been considerable discussion surrounding Whisper and its tendency to generate hallucinations. At Nabla, we employ Whisper in a way that aligns with our commitment to reliability and accuracy in clinical settings. We want to clarify how we integrate this tool within our proprietary system to uphold the highest standards in medical documentation.
The well-documented limitations of Whisper are exactly why we spent 3 years and $5M, gathered and manually annotated a unique dataset of 7,000 hours of audio from medical encounters, and leveraged that dataset to train our own model.
How Nabla ensures integrity of AI-generated content
Our model is based on Whisper, but it contains many improvements specifically developed to suppress hallucinations and make the transcription of medical terms more accurate than any off-the-shelf speech-to-text engine on the market.
In addition, the transcript is not directly included in the patient record. A second layer of processing extracts relevant data from the transcript. Each note produced is split into atomic facts, which are checked via an LLM query against the transcript and the patient context. Only facts for which we find definitive proof are considered valid.
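The verification step described above can be sketched as a small pipeline. This is an illustrative sketch only, not Nabla's implementation: it assumes one sentence per atomic fact, and it substitutes a naive token-overlap check (the hypothetical `verify_fact`) for the real LLM query, keeping only facts with support in the transcript.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    supported: bool = False

def split_into_facts(note: str) -> list[Fact]:
    # Illustrative assumption: treat each sentence as one atomic fact.
    return [Fact(s.strip()) for s in note.split(".") if s.strip()]

def verify_fact(fact: Fact, transcript: str) -> bool:
    # Stand-in for the LLM verification query: a naive check that
    # labels a fact "supported" only if every content word (here,
    # words longer than 3 characters) appears in the transcript.
    content_words = {w.lower() for w in fact.text.split() if len(w) > 3}
    transcript_words = {w.lower().strip(".,") for w in transcript.split()}
    return content_words <= transcript_words

def filter_note(note: str, transcript: str) -> list[Fact]:
    # Only facts with definitive support in the transcript are kept.
    facts = split_into_facts(note)
    for fact in facts:
        fact.supported = verify_fact(fact, transcript)
    return [f for f in facts if f.supported]
```

With this stand-in, a generated sentence mentioning a medication that never appears in the transcript would be rejected, while a sentence restating what was said would pass.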
Each new version of our generative model undergoes rigorous evaluation by professional (human) medical scribes to ensure the documentation is both comprehensive and aligned with industry standards.
Some people have suggested we store audio recordings of all encounters, to make it easier to cross-check generated suggestions after the fact. However, based on overwhelming feedback from our community of clinicians, we determined that storing full recordings of medical encounters by default would raise more ethical and privacy problems than it would solve.
Instead, we let clinicians opt into letting us store the audio for later review, on an encounter-by-encounter basis, only if authorized by their corporate policy and after obtaining patient consent.
Additional safeguards & improvement over time
On top of ensuring the safety of the AI models themselves, we also take specific steps to ensure the models are appropriately used and monitored in the wild:
- Clinician Review: our product includes clear statements reminding users to review and edit notes before exporting. The generated note is reviewed by the same clinician who conducted the medical visit, within a short time frame after it ends, while they still have the same first-hand memory of the visit they would rely on if writing the note from scratch.
- Proactive feedback tool: users can suggest edits and flag discrepancies through a feedback tool integrated into the product. The tool is visible and easy to use, making it simple for users to signal any mistakes. Our product teams continuously monitor this feedback to ensure the system functions as expected and to keep improving its safety and reliability.
- Ongoing Monitoring: Nabla detects incorrectly generated content based on manual edits to the note and plain-language feedback. This provides a precise measure of real-world performance and gives us additional inputs to improve models over time.
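One simple way to turn manual edits into a monitoring signal, as the last bullet describes, is to compare the generated note with the clinician's final version. The sketch below is a hypothetical illustration (not Nabla's actual pipeline) using Python's standard difflib: a low similarity ratio between the two versions flags the encounter for closer review, and the assumed 0.9 threshold is arbitrary.

```python
import difflib

def edit_ratio(generated: str, edited: str) -> float:
    # Similarity between the generated note and the clinician's final
    # version, in [0.0, 1.0]; heavy edits yield a low ratio.
    return difflib.SequenceMatcher(None, generated, edited).ratio()

def flag_for_review(generated: str, edited: str, threshold: float = 0.9) -> bool:
    # Flag encounters where the clinician rewrote a large share of
    # the note, a proxy for incorrectly generated content.
    return edit_ratio(generated, edited) < threshold
```

Aggregated over many encounters, a metric like this gives a continuous, real-world measure of note quality without requiring any additional effort from clinicians.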
We have processed 9 million medical encounters in the past 24 months and received direct feedback from nearly 10,000 physicians about ways to improve Nabla. While transcription errors were occasionally reported, hallucination has never been reported as a significant issue.
Why we built Nabla
Nabla is trusted by over 85 healthcare organizations, including some of the nation's most renowned institutions, with more than 45,000 clinicians relying on our ambient AI assistant to generate accurate notes and elevate patient care. As a result, 81% of clinicians report improved patient interactions, and burnout symptoms have dropped by an average of 38%, with 90% of clinicians experiencing reduced stress. We are proud to give back precious hours to physicians, allowing them more quality time with patients and loved ones, with many expressing they wouldn't return to the way they were documenting encounters before Nabla.
We believe deeply in AI’s transformative potential for healthcare and are dedicated to driving this innovation responsibly. Our commitment to protecting patient safety remains unwavering as we advance our technology to support clinicians and enhance patient care, ensuring that every development is guided by integrity, transparency, and trust.