Assessing the reliability of Nabla’s speech-to-text engine
Martin Raison
CTO
Sam Humeau
Machine Learning Engineer
Following recent discussions around our use of Whisper, we’d like to share more detailed results confirming that our model is not affected by the hallucination issues observed in Whisper in the recent study by Professors Allison Koenecke of Cornell University and Mona Sloane of the University of Virginia.
Methodology
Nabla conducted an in-depth analysis targeting the transcription hallucination issue described in the study. We reviewed the 187 audio samples from the study that led Whisper to produce inaccurate or fabricated text. By running these same samples through our proprietary Nabla speech-to-text model, we sought to verify whether the hallucinations known to occur in Whisper were also present in our model.
Conclusions
Out of the 187 audio samples from the study, our proprietary model did not produce a single hallucination. Human reviewers determined this by manually comparing the outputs of our model with the ground truth transcripts provided in the study, in three passes of examination, each performed by a different reviewer.
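For illustration only, the tallying step of a multi-pass review like this one can be sketched as follows. This is not the code used in the experiment: the `Review` record, the `flagged_samples` helper, the rule that a sample counts as flagged if any reviewer flags it, and the toy data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """One reviewer's verdict on one audio sample (hypothetical record)."""
    sample_id: str
    reviewer: str
    hallucination: bool  # True if the reviewer saw fabricated text in the output


def flagged_samples(reviews, required_passes=3):
    """Group verdicts by sample and return two sets of sample ids.

    flagged: samples that at least one reviewer marked as a hallucination
             (a deliberately conservative rule, assumed for this sketch).
    incomplete: samples reviewed by fewer than `required_passes` distinct reviewers.
    """
    by_sample = {}
    for review in reviews:
        by_sample.setdefault(review.sample_id, []).append(review)

    flagged, incomplete = set(), set()
    for sample_id, sample_reviews in by_sample.items():
        distinct_reviewers = {r.reviewer for r in sample_reviews}
        if len(distinct_reviewers) < required_passes:
            incomplete.add(sample_id)
        if any(r.hallucination for r in sample_reviews):
            flagged.add(sample_id)
    return flagged, incomplete


# Made-up verdicts, purely to exercise the function.
toy_reviews = [
    Review("sample-a", "reviewer-1", False),
    Review("sample-a", "reviewer-2", False),
    Review("sample-a", "reviewer-3", False),
    Review("sample-b", "reviewer-1", False),
    Review("sample-b", "reviewer-2", True),
    Review("sample-b", "reviewer-3", False),
    Review("sample-c", "reviewer-1", False),
    Review("sample-c", "reviewer-2", False),
]

flagged, incomplete = flagged_samples(toy_reviews)
print("flagged:", sorted(flagged))
print("incomplete:", sorted(incomplete))
```

Under this rule, a claim of zero hallucinations corresponds to an empty `flagged` set together with an empty `incomplete` set, meaning every sample received all of its review passes and no reviewer raised a flag.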
This demonstrates that one cannot use Whisper’s weaknesses to assess Nabla’s performance. Whisper is used only as a baseline model to train our own speech-to-text model, using proprietary methods and a unique dataset of 7,000 hours of medical audio. Moreover, Nabla incorporates multiple additional safeguards in the product itself. In particular, the raw output of the speech-to-text model is never incorporated into patient records.
We are working to publish the code we used for this experiment and will update this post with additional information.
Ongoing commitment to trust and transparency in AI-assisted healthcare
Nabla is trusted by more than 85 healthcare institutions, and over 45,000 clinicians rely on our ambient AI assistant to streamline documentation and improve patient care. We’re proud to be delivering on our promise: reducing clinician stress, minimizing burnout and improving patient interactions. These findings underscore our dedication to integrity, transparency, and the continuous improvement of our technology to meet clinicians’ evolving needs.
At Nabla, we believe in AI’s transformative power for healthcare. As we advance, our commitment to protecting patient safety and upholding accuracy in medical documentation remains unwavering.