When the AI Summary Is Wrong: The Safety Risks of Patient-Facing Medical Notetakers

Patient-facing AI notetakers address a real problem: patients forgetting critical medical advice. But their harm profile differs from that of clinician-side scribes, and is in some ways more concerning, because the failure mode is a patient acting on an inaccurate AI summary with no clinical review step in between.

When a clinician-side scribe produces an inaccurate note, the clinician reviews and corrects it before it enters the medical record. The review is the safety layer. When a patient-side notetaker produces an inaccurate summary, the patient may act on it directly — booking the wrong test, taking the wrong dose, missing a red-flag warning, gaining false reassurance about a symptom that warranted urgent assessment, or sharing incorrect information with caregivers who then make decisions based on it.

Failure Modes

Omission. The summary captures the diagnosis and treatment but omits the safety-netting — the specific red-flag symptoms that should prompt urgent return. The patient leaves with a plan but without the critical "come back if" warnings that protect against deterioration. This is arguably the most dangerous failure mode because the patient feels informed (they have a summary) while missing the safety information that matters most.
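
A product could guard against this with a crude post-generation check: if the consultation transcript contains safety-netting language and the patient summary contains none, hold the summary back or flag it. The following is only a minimal sketch, assuming the transcript and summary are available as plain strings. The cue phrases and the function names are hypothetical and have not been clinically validated.

```python
import re

# Hypothetical cue phrases that often introduce safety-netting advice.
# Illustrative only; a real list would need clinical review.
SAFETY_NET_CUES = [
    r"\bcome back if\b",
    r"\breturn if\b",
    r"\bcall 999\b",
    r"\bgo to (?:a&e|the emergency department)\b",
    r"\bseek urgent\b",
]
CUE_PATTERN = re.compile("|".join(SAFETY_NET_CUES), re.IGNORECASE)


def find_safety_netting(text: str) -> list[str]:
    """Return the sentences in `text` that contain a safety-netting cue."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if CUE_PATTERN.search(s)]


def omitted_safety_netting(transcript: str, summary: str) -> list[str]:
    """Return the transcript's safety-netting sentences if the summary has none.

    An empty list means either the transcript had no detectable
    safety-netting or the summary kept at least one such sentence.
    """
    in_transcript = find_safety_netting(transcript)
    in_summary = find_safety_netting(summary)
    if in_transcript and not in_summary:
        return in_transcript
    return []


transcript = (
    "It looks like a chest infection. Take the antibiotics for five days. "
    "Come back if you cough up blood or become short of breath."
)
summary = "You have a chest infection. Take antibiotics for five days."
missing = omitted_safety_netting(transcript, summary)
```

Here `missing` holds the "Come back if" sentence that the summary dropped. Phrase matching of this kind will miss safety-netting that is worded differently, so it can only supplement clinical evaluation of the summariser, not replace it.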

Over-simplification. The summary says "Your doctor said you might have asthma" when the clinician actually said "this could be asthma, but I want to rule out a cardiac cause first, and that's why I'm ordering this test." The nuance (the differential diagnosis, the uncertainty, the investigation rationale) is lost in plain-language translation. The patient may come away believing "I have asthma" rather than "we're investigating to work out what this is."

Hallucination. The AI generates a recommendation that was never made — "your doctor recommended starting vitamin D supplements" when vitamin D was not discussed. Or "your doctor said to reduce your blood pressure medication" when no medication change was discussed. The patient acts on a recommendation that does not exist.

Wrong priority. The summary emphasises a minor point (lifestyle advice about reducing salt intake) while burying the critical action (urgent blood test needed this week). The patient focuses on the prominent summary point and misses the time-sensitive action that was the primary reason for the consultation.

Accent and audio quality. AI transcription accuracy varies with accent, speaking speed, background noise, overlapping speech, and the pronunciation of medical terminology. "Amlodipine" misheard as "amitriptyline" propagates through the entire summary chain and produces a patient summary that names the wrong medication. Drug name errors are potentially among the highest-risk errors a patient summary can contain.
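
One possible mitigation is to check every transcribed drug name against a medicines list and flag any name that has a close neighbour, so the summary can ask the patient to confirm it. The sketch below uses spelling similarity from Python's standard `difflib` module as a rough stand-in for how alike two names sound; it is not a phonetic algorithm. The four-item formulary, the function names, and the 0.55 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical mini-formulary for illustration; a real product would use a
# maintained medicines dictionary.
FORMULARY = ["amlodipine", "amitriptyline", "metformin", "atorvastatin"]


def similarity(a: str, b: str) -> float:
    """Spelling similarity between 0 and 1 (a rough proxy for sound-alike risk)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def confusable_names(drug, formulary=FORMULARY, threshold=0.55):
    """Return other formulary names whose spelling is close to `drug`.

    `threshold` is an arbitrary illustrative cut-off, not a validated value.
    """
    return [
        name
        for name in formulary
        if name.lower() != drug.lower() and similarity(drug, name) >= threshold
    ]


# If the transcript contains "amlodipine", which other names deserve a
# "please confirm this medication" prompt in the patient summary?
needs_confirmation = confusable_names("amlodipine")
```

With this toy list, "amitriptyline" is flagged as a near neighbour of "amlodipine" while "metformin" is not. A production system would more plausibly rely on a curated look-alike, sound-alike list than on a generic string metric.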

Multi-party confusion. Consultations involving interpreters, caregivers, family members, or multiple clinicians create transcription complexity. The AI may attribute statements to the wrong speaker, conflate separate conversations, or miss interpreted content entirely.

What Good Products Should Do

Show the source. Distinguish between "your doctor said" and "the AI suggests."

Flag uncertainty. If the transcription quality is poor or the clinical content is ambiguous, say so rather than generating a confident summary from uncertain data.

Include a disclaimer. Remind users that summaries may be incomplete and should be confirmed with their healthcare professional.

Allow correction. Let patients flag or edit inaccurate content.

Provide transcript access. The patient (or clinician) can then check what was actually said against what the AI summarised.
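
Showing the source and flagging uncertainty can both be built into the summary's data model instead of being bolted on afterwards. The following is a minimal sketch under stated assumptions: the `SummaryItem` fields, the `"clinician"` and `"ai"` labels, and the 0.8 confidence cut-off are hypothetical and are not taken from any product named in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummaryItem:
    text: str                     # plain-language statement shown to the patient
    source: str                   # "clinician" or "ai" (hypothetical labels)
    quote: Optional[str] = None   # supporting transcript excerpt, if any
    confidence: float = 1.0       # assumed transcription confidence, 0 to 1


def render(item: SummaryItem, min_confidence: float = 0.8) -> str:
    """Format one summary line with its provenance and any uncertainty flag."""
    # Only attribute a statement to the clinician when a transcript quote
    # backs it up; anything unsupported is labelled as an AI suggestion.
    if item.source == "clinician" and item.quote:
        prefix = "Your doctor said"
    else:
        prefix = "The AI suggests"
    line = f"{prefix}: {item.text}"
    if item.confidence < min_confidence:
        line += " [Unclear audio: please confirm with your healthcare professional]"
    return line


clear = SummaryItem(
    "Book a blood test this week.",
    source="clinician",
    quote="I'd like you to get a blood test this week",
    confidence=0.95,
)
unclear = SummaryItem(
    "Take amlodipine 5 mg once daily.",
    source="clinician",
    quote="amlodipine five milligrams once a day",
    confidence=0.5,
)
unsupported = SummaryItem("Consider vitamin D supplements.", source="clinician")
```

Calling `render(clear)` gives an attributed line with no warning, `render(unclear)` appends the confirmation prompt, and `render(unsupported)` is downgraded to "The AI suggests" because no transcript quote supports it. Keeping the quote on each item also gives the patient a direct path back to the transcript.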

Aide Mirror explicitly reminds users that AI can make mistakes and encourages them to check anything they are unsure about with healthcare professionals. Kin says its summaries pass through transcription, clinical narrative generation, and user-facing summarisation, with evaluation at different stages. Practices like these are the minimum responsible standard for the category.

What Clinicians Should Do

Communicate clearly and specifically — because the AI will summarise what it hears. Provide specific safety-netting with named red flags and timeframes — not generic "if things get worse" advice. Summarise the plan verbally at the end of every consultation. Document the plan in the clinical record clearly and completely — so that if the AI summary contradicts the record, there is an authoritative source to reference. And if a patient returns with a concern based on an AI summary, address it constructively rather than dismissively — "let me clarify what I actually recommended."

iatroX is citation-first: clinical answers show where they came from. That provenance model matters whether the output is a clinical answer, a patient summary, or a consultation note.
