Evaluation and Management (E/M) coding is one of the most familiar and most frustrating parts of clinical practice.
Most clinicians have a good intuitive sense of what level a visit deserves. A complex patient with multiple conditions, medication changes, and care coordination likely warrants a Level 4 or 5. A straightforward visit does not.
But assigning a higher code requires something specific: documentation that clearly supports it. Inaccurate coding has real downstream consequences. Under-coding undervalues the care delivered and leaves complexity unrecognized, while denied claims create rework, delay reimbursement, and add operational burden across clinical and billing teams.
And that is where the system breaks down.
Under-coding is often framed as a knowledge problem. In reality, it is more often a documentation and confidence problem. When notes are incomplete or rushed, clinicians are less certain that what they wrote fully reflects the care they delivered. So they hedge. They choose the safer, lower code.
Not because the visit was simpler, but because the note does not give them enough confidence or justification to do otherwise.
What the data shows: documentation and coding are tightly linked
This dynamic is increasingly visible in emerging research on AI-assisted documentation.
A recent study published in JAMA Network Open found that clinicians using AI scribes generated 0.04 more RVUs per encounter and 1.81 more RVUs per week, with no increase in claim denials.
That last point is critical.
If coding intensity increases without a corresponding rise in denials, it suggests something important: this is not upcoding. It is more accurate coding enabled by better documentation.
The mechanism is straightforward:
- When notes are more complete, they better reflect the complexity of the visit
- When clinicians can see that complexity clearly documented, they are more confident assigning the appropriate code
- When codes align with care delivered, both clinical and financial outcomes improve
In other words, the note does not just record the visit. It determines how confidently that visit can be coded.
What happens in practice: University of Toledo Health
This same pattern appeared in real-world data from an 8-week pilot at University of Toledo Health.
During the pilot, clinicians used Nabla’s ambient documentation. Importantly, no coding assistant was in place.
And yet, coding patterns shifted:
- Established patient Level 4 visits increased from 41% to 45%
- Established patient Level 5 visits increased from 3% to 5%
These are not marginal changes. They reflect a meaningful shift in coding intensity across a large volume of visits.
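To put the two shifts on a common footing, it can help to separate the percentage-point change from the relative change. The short sketch below does only that arithmetic on the four figures reported above; the function name `share_change` is illustrative, and nothing here models visit volumes or revenue, which the pilot summary does not give.

```python
# Shares of established patient visits reported in the pilot, in percent.
before = {"Level 4": 41, "Level 5": 3}
after = {"Level 4": 45, "Level 5": 5}


def share_change(before_pct, after_pct):
    """Return (percentage-point change, relative change in percent)."""
    point_change = after_pct - before_pct
    relative_change = 100 * point_change / before_pct
    return point_change, relative_change


for level in before:
    points, relative = share_change(before[level], after[level])
    print(f"{level}: +{points} points, about {relative:.0f}% relative increase")

# Level 4: +4 points, about 10% relative increase
# Level 5: +2 points, about 67% relative increase
```

The Level 5 change looks small in percentage points, but because it starts from a low base it is the larger of the two in relative terms.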
Just as importantly, the shift happened without changing coding rules, training, or guidance.
The only variable that changed was documentation quality.
This is the key insight: better documentation did not just make notes more complete. It made clinicians more confident that their documentation supported the care they delivered.
The mechanism: from documentation to confidence to accuracy
The relationship between documentation and coding is often treated as linear.
In practice, it is behavioral.
When documentation is burdensome:
- Notes are shorter, less structured, and less complete
- Clinicians are unsure what their documentation supports
- Coding decisions become conservative
Over time, this compounds. Clinicians begin to expect that their notes will not support higher-complexity codes, even when the visit warrants them.
Ambient AI changes that dynamic.
By reducing the effort required to produce a complete, structured note, it does two things at once:
- Improves the fidelity of the documentation
- Restores clinician confidence in what the note supports
That second point is often overlooked.
The note does not just inform the code. It gives the clinician permission and confidence to assign it.
At the visit level
The numbers tell one part of the story. The clinical experience explains the rest.
As Dr. Matt Sakumoto, Chief Clinical Product Officer at Nabla, described:
“I didn’t go to medical school and residency to memorize a set of five-digit codes.
Even something as simple as a cough visit can map to multiple codes depending on context, whether the patient is new or returning, virtual or in person. And defining ‘complexity’ is not always straightforward. The guidelines have evolved, but applying them consistently in a busy clinic is still challenging.
I remember a new patient I saw in urgent care, a young woman with joint pain and fatigue. She had no prior records, so most of the visit was spent understanding her symptoms. I didn’t order tests or referrals because I wanted her to establish with a primary care physician for continuity. The visit was about 20 minutes, so billing by time didn’t make sense.
At the time, it felt like a straightforward, low complexity visit. I likely would have coded it as a Level 3.
But when I reviewed the documentation with Nabla, it told a different story. The note captured her social determinants of health, which increased her risk, along with the diagnostic uncertainty around her symptoms. That placed the visit into moderate complexity.
Seeing that clearly documented changed how I thought about the code. It gave me the confidence to assign the level the visit actually warranted.”
This is the shift the data reflects.
Not a change in coding behavior in isolation, but a change in how clinicians interpret and trust their own documentation.
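For readers who want to see why the visit in the quote lands at moderate complexity, here is a minimal sketch of the rule in the current office visit E/M guidelines: the level of medical decision making (MDM) is the highest level met by at least two of three elements (problems addressed, data reviewed, and risk). The element ratings passed in below are assumptions made for illustration, based on how the quote describes the note, not values taken from the actual chart, and `mdm_level` is a hypothetical helper, not part of any product.

```python
# Ordered MDM levels, lowest to highest.
LEVELS = ["straightforward", "low", "moderate", "high"]


def mdm_level(problems, data, risk):
    """Overall MDM level: the highest level met by at least two of the three elements."""
    ranks = sorted(LEVELS.index(element) for element in (problems, data, risk))
    # After sorting, the middle rank is met or exceeded by two of the three elements.
    return LEVELS[ranks[1]]


# Assumed ratings for the urgent care visit described in the quote:
# - problems: undiagnosed new problem with diagnostic uncertainty (assumed moderate)
# - data: no tests ordered, no prior records reviewed (assumed straightforward)
# - risk: social determinants of health affecting care (assumed moderate)
visit = mdm_level(problems="moderate", data="straightforward", risk="moderate")
print(visit)  # moderate
```

Under these assumed ratings, two of the three elements reach moderate, so the visit supports moderate complexity even though no tests or referrals were ordered. Without the documented risk and uncertainty, the same function would return a lower level.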
The bottom line
Under-coding is not primarily a coding problem. It is a documentation problem.
When notes are incomplete, clinicians code conservatively.
When notes are complete, clinicians code accurately.
The data from both published research and real-world pilots points in the same direction:
Better documentation leads to more confident clinicians, which in turn leads to more accurate coding and improved outcomes.
Better documentation removes the guesswork from coding.




