AI Scribing, Coding and SNOMED: Why Structured Documentation Still Needs Clinical Verification


AI scribes are no longer just turning speech into text. The next generation produces summaries, letters, structured clinical fields, and — increasingly — SNOMED CT coding suggestions. When Accurx Scribe writes back into EMIS with a suggested SNOMED code, or Tortus generates a coded clinical entry in SystmOne, the scribe has moved from documentation into the structured data layer of the permanent clinical record.

That transition — from narrative text to structured code — is where the clinical safety risk shifts from "is the note readable?" to "is the code correct?" A wrong code does not just describe the consultation inaccurately; it triggers automated consequences that may persist for decades.

AI Scribes Are Moving Beyond Transcription

First-generation scribes transcribed speech into text. Second-generation scribes summarise, format, and structure — producing clinical notes with presenting complaint, history, examination, assessment, and plan sections. Third-generation scribes are entering the coding layer: suggesting SNOMED CT concepts, generating problem list entries, mapping clinical language to diagnostic codes, and potentially updating disease registers and recall systems automatically.

This evolution is clinically useful. Manual coding is time-consuming, inconsistent, and frequently deprioritised under time pressure — many GPs acknowledge that coding quality suffers during busy clinics. Automated coding suggestions could improve consistency and completeness across the clinical record. But the evolution introduces a category of risk that transcription alone does not create: structurally incorrect data entering the permanent record and triggering automated clinical actions that the coding clinician did not intend.

Why SNOMED Matters

SNOMED CT is not "just labels." NHS England describes SNOMED CT as allowing clinical thoughts or phrases to be represented as structured clinical concepts, making information understandable by both humans and software. All GP systems in England capture clinical terms using SNOMED CT.

SNOMED codes drive clinical functionality far beyond the individual consultation.

QOF (Quality and Outcomes Framework) — codes determine disease registers, quality indicators, and practice payments. A patient incorrectly coded with "asthma" joins the asthma register, triggering annual review recalls, medication prompts, and QOF reporting.

Prescribing decision support — coded allergies and diagnoses feed drug interaction warnings and contraindication alerts.

Recalls and monitoring — a coded diabetes diagnosis triggers HbA1c monitoring, retinal screening invitations, and annual review recalls.

Population health — coded data supports surveillance, outbreak response, and research.

Insurance and employment — certain codes appear on medical reports, insurance questionnaires, and regulatory notifications (DVLA, CAA). A patient incorrectly coded with a seizure disorder may lose their driving licence.

Safeguarding — coded entries related to domestic abuse, child protection, and vulnerability feed multi-agency processes.

A SNOMED code is a trigger — for recalls, registrations, warnings, payments, reports, and clinical actions that may follow the patient for life.
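The "code as trigger" idea can be sketched as a lookup from a single coded diagnosis to the automated actions a system might attach to it. Everything below is illustrative: the action names are hypothetical placeholders, not real EMIS or SystmOne behaviour.

```python
# Illustrative only: downstream actions a clinical system might attach
# to one coded diagnosis. Action names are hypothetical placeholders.
DOWNSTREAM_ACTIONS = {
    "195967001": [  # SNOMED CT: Asthma
        "add_to_asthma_register",
        "schedule_annual_review_recall",
        "enable_inhaler_prescribing_prompts",
        "include_in_qof_reporting",
    ],
}

def actions_triggered(snomed_code: str) -> list[str]:
    """Return the automated actions a coded entry would set in motion."""
    return DOWNSTREAM_ACTIONS.get(snomed_code, [])
```

The point of the sketch is that the trigger fires on the code alone: nothing in this lookup sees the narrative text, the clinician's uncertainty, or the context of the consultation.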

The Specific Risks of AI-Assisted Coding

Wrong diagnosis code. "Viral URTI" coded as "pneumonia" — triggering different follow-up, different safety-netting, and different expectations in the record.

Overly certain code when the consultation was uncertain. "Possible asthma" coded as confirmed "asthma" — creating a lifelong disease register entry, annual review recalls, and insurance implications from a single uncertain consultation. This is the most common and most consequential AI coding error — converting clinical nuance into false diagnostic certainty.

Missed negative finding. The clinician assessed and excluded a red flag, but the AI did not code the exclusion — leaving the record ambiguous about whether the assessment was performed. If the case is later reviewed, the absence of a coded negative may be interpreted as absence of assessment.

Coding suspected as confirmed. The clinician thinks "this might be asthma." The AI codes "asthma." The system treats the patient as asthmatic from this point forward — with all the register, recall, prescribing, and reporting consequences that follow.

Historical coded as active. A resolved condition coded as current — triggering monitoring and prescribing implications for a condition the patient no longer has.

Incorrect severity, laterality, or indication. Mild coded as severe. Right coded as left. A medication prescribed for one indication coded against a different diagnosis. Each creates downstream clinical confusion that may not be noticed until a future clinician makes a decision based on the incorrect code.

Incorrect safeguarding or mental health context. Sensitive codes applied or omitted inappropriately — with significant consequences for the patient and for multi-agency processes.

Why Structured Data Needs Human Judgement

Narrative and code are not equivalent representations.

"Possible asthma — trial of inhaler, review in 4 weeks" preserves clinical uncertainty. SNOMED code "195967001 — Asthma" is a definitive diagnostic entry with no uncertainty qualifier.

"Chest pain, likely musculoskeletal, no red flags" is appropriate for a low-risk presentation. Coded as "angina pectoris," it triggers cardiology pathways and cardiovascular disease registration.

"Low mood, no suicidal ideation, work stress, self-management plan" is a complete narrative. Coded as "depressive disorder," it creates insurance and employment implications the consultation did not warrant.
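The difference between preserved and collapsed uncertainty is visible in the structured entry itself. The sketch below borrows field names from the FHIR R4 Condition resource (verificationStatus "provisional" vs "confirmed"); actual scribe and GP-system payloads will differ, and this is a shape illustration, not any vendor's API.

```python
# Sketch: the same SNOMED code with and without a preserved
# uncertainty qualifier. Shapes loosely follow FHIR R4 Condition.

# Narrative: "Possible asthma - trial of inhaler, review in 4 weeks"
provisional_entry = {
    "resourceType": "Condition",
    "code": {"coding": [{
        "system": "http://snomed.info/sct",
        "code": "195967001",
        "display": "Asthma",
    }]},
    # Uncertainty survives only if the qualifier is set explicitly.
    "verificationStatus": {"coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-ver-status",
        "code": "provisional",
    }]},
}

# The same code marked "confirmed" asserts a definitive diagnosis,
# with the register, recall, and reporting consequences that follow.
confirmed_entry = {
    **provisional_entry,
    "verificationStatus": {"coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-ver-status",
        "code": "confirmed",
    }]},
}

def is_definitive(condition: dict) -> bool:
    """True if the entry asserts the diagnosis without uncertainty."""
    codings = condition.get("verificationStatus", {}).get("coding", [])
    return any(c.get("code") == "confirmed" for c in codings)
```

The two entries differ by one field, yet only one of them puts the patient on a disease register. That single qualifier is exactly what an AI scribe is most likely to get wrong and a clinician is best placed to check.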

Human judgement ensures the code reflects what was actually assessed — not what the AI inferred from conversation.

What Clinicians Should Verify

Does the code reflect what was assessed, not what was hypothetically discussed?

Is it confirmed, suspected, excluded, historical, or family history — and is the qualifier correct?

Does it create QOF, recall, or register implications — and are those appropriate?

Could it affect insurance, safeguarding, employment, DVLA, or prescribing?

Has uncertainty been preserved? Does the free text support the code?

Would another clinician reading this code in 5 years accurately understand what happened?
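Parts of that checklist could, in principle, run as a pre-save review gate that prompts the clinician rather than deciding for them. Everything in this sketch is hypothetical: the field names, the register code set, and the flag wording are placeholders, and none of it replaces clinical verification.

```python
# Hypothetical pre-save gate for an AI-suggested coded entry.
# Field names and code sets are illustrative, not a real system's API.

REGISTER_CODES = {"195967001"}  # e.g. codes that add to a disease register

VALID_QUALIFIERS = {
    "confirmed", "suspected", "excluded", "historical", "family-history",
}

def review_flags(entry: dict) -> list[str]:
    """Return human-review prompts for an AI-suggested coded entry."""
    flags = []
    qualifier = entry.get("qualifier")
    if qualifier not in VALID_QUALIFIERS:
        flags.append("No valid certainty qualifier - confirm before saving")
    if entry.get("code") in REGISTER_CODES and qualifier != "confirmed":
        flags.append("Register-triggering code on an unconfirmed diagnosis")
    if not entry.get("free_text_support"):
        flags.append("Free text does not clearly support the code")
    return flags

# A scribe suggestion for "possible asthma":
suggested = {
    "code": "195967001",
    "qualifier": "suspected",
    "free_text_support": True,
}
```

Note what the gate cannot check: whether the code reflects what was actually assessed, and whether the register implications are clinically appropriate. Those questions need the clinician.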

Where iatroX Fits

iatroX is the evidence and verification layer. After the scribe generates notes and codes, clinicians can verify management against UK guidance, check what should be documented and safety-netted, verify prescribing with calculators, and turn the question into CPD. The scribe drafts. iatroX helps verify.

Before saving an AI-generated note, use iatroX to check the clinical question behind the documentation.
