From “human-in-the-loop” to “human-at-the-helm”: how to prevent automation bias and preserve clinical agency in the AI era

Introduction

It’s 3 AM on a busy shift. An AI decision support tool offers a confident recommendation for a complex patient. It sounds plausible. It matches the pattern you half-remember. The department is overwhelmed, and your cognitive bandwidth is zero. You click “accept” without fully checking the source.

The risk here isn’t that clinicians will stop thinking. It is that systems will quietly train them to rubber-stamp.

In the rush to adopt artificial intelligence in the NHS, “human-in-the-loop” has become the standard safety slogan. But there is a subtle danger in that phrase. Being “in the loop” often implies a passive role: the AI decides, and the human signs it off. To practise safely in 2025, we need to shift the paradigm to “human-at-the-helm.” This means the human leads the reasoning, and the AI supports the navigation.

1. What “automation bias” actually is (and why it’s predictable)

A plain-English definition

Automation bias is the undue deference to an automated recommendation, even when contradictory cues exist or when verification is possible. It is the psychological tendency to assume the machine is right and your intuition—or the patient in front of you—is wrong.

This isn't a new phenomenon; it has been observed for decades in safety-critical environments, from nuclear power plants to flight decks (PMC, ScienceDirect).

Why it happens

In a high-pressure clinical environment, automation bias is almost a survival mechanism.

  • Cognitive load: When bandwidth is low, the brain seeks the path of least resistance. Accepting a computer’s prompt is metabolically cheaper than deriving a plan from first principles.
  • Perceived objectivity: We are conditioned to view computers as mathematical and unbiased, leading to an assumption that their output is "truer" than human judgement.
  • Role drift: When automation takes over routine tasks, humans can drift from being "operators" to being "monitors." Vigilance drops, and the ability to spot subtle errors degrades (ResearchGate).

2. The aviation lesson: “trusting the autopilot over your own eyes”

Aviation learned the hard way that automation can erode basic handling skills. Two historic accidents offer chilling lessons for modern healthcare.

Case study 1: Asiana Airlines Flight 214

In 2013, a Boeing 777 crashed while landing in San Francisco. The NTSB investigation found that the pilots had over-relied on automated systems they did not fully understand. They believed the autothrottle would maintain safe speed, even though they had inadvertently deactivated it.

  • Clinical translation: If the operator’s mental model of the system is wrong, "oversight" becomes theatre. You cannot supervise an AI if you don't understand its logic or limitations.

Case study 2: Air France Flight 447

In 2009, an Airbus A330 crashed into the Atlantic. Ice crystals blocked the airspeed sensors, causing the autopilot to disconnect. The pilots, suddenly handed manual control in a high-stress situation, failed to recognise a stall and responded incorrectly.

  • Clinical translation: Automation masks complexity. When the AI support drops out—or is hallucinating—is the clinician still prepared to "fly the plane" and manage the patient from first principles?

The UK Civil Aviation Authority has explicitly warned that reliance on automation can erode manual handling competence. Medicine must heed the same warning.

3. The clinical analogue: “human-in-the-loop” can become rubber-stamping

Oversight vs. Agency

There is a critical distinction between oversight and agency. Oversight often just means reviewing the output—a "check-box" exercise at the end of a workflow. Agency means controlling the question, setting the constraints, verifying the evidence, and making the final decision.

Evidence in healthcare

Systematic reviews have confirmed that automation bias exists in healthcare decision support. While these tools generally improve performance, they can introduce new types of errors where clinicians accept incorrect advice because it looks authoritative (PMC, JAMA Network).

Doctors in the UK are already grappling with this. Research commissioned by the GMC highlights that while doctors see AI as assistive, they are acutely aware that professional responsibility for the final decision rests with them.

4. The core concept: “Human-at-the-Helm”

To prevent automation bias, we need to redefine the relationship.

  • The Driver (Clinician): Frames the clinical question, sets the context (frailty, social situation, red flags), chooses which source to trust, and makes the final judgement.
  • The Navigator (AI): Retrieves relevant guidance, surfaces uncertainties, offers structured alternatives, and provides provenance ("Here is where I found this").

In this model, the AI holds the map, but the clinician holds the wheel.
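To make the division of labour concrete, here is a minimal sketch of the driver/navigator contract as it might look in software. Every type and field name is invented for illustration; this is not a description of any real system.

```typescript
// Hypothetical types illustrating the driver/navigator split.
// The clinician authors the question and the decision; the AI only
// returns retrieved material with provenance attached.

interface ClinicalQuestion {
  question: string;                 // framed by the clinician
  context: string;                  // e.g. frailty, social situation
  redFlags: string[];               // what must not be missed
}

interface NavigatorAnswer {
  relevantGuidance: string[];       // retrieved, not decided
  structuredAlternatives: string[]; // options laid out for the human
  uncertainties: string[];          // surfaced explicitly
  sources: { title: string; url: string }[]; // "here is where I found this"
}

// The only decision-shaped object in the system is authored by a human.
interface ClinicalDecision {
  decidedBy: "clinician";
  rationale: string;                // the clinician's own reasoning
  sourcesChecked: string[];         // citations actually opened and read
}
```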

Where iatroX fits

This philosophy is central to iatroX. It is designed for grounded retrieval and traceable provenance. By forcing the user to engage with the source material (the citation) rather than just the answer, it keeps the clinician in command of verification. It is an anti-sedation tool, designed to prevent the passive acceptance of fluent text.

5. Friction as a feature: good clinical AI sometimes slows you down

In consumer tech, friction is bad. In safety-critical systems, friction can be a feature. Sometimes, the correct design outcome is to make the user pause.

Designed friction

Clinical decision support systems often use "hard stops" (you cannot proceed) or "soft stops" (you must acknowledge) to prevent error. This "designed friction" forces a cognitive re-engagement (PMC).
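As a hedged sketch of the distinction, assuming a hypothetical ordering workflow (the alert shape and function names are invented):

```typescript
// Hypothetical sketch of "designed friction". A hard stop blocks the
// action outright; a soft stop demands an explicit acknowledgement,
// forcing a cognitive re-engagement before the user can proceed.

type StopLevel = "hard" | "soft";

interface SafetyAlert {
  level: StopLevel;
  message: string; // e.g. "Dose exceeds recommended maximum in renal impairment"
}

function proceedWithOrder(
  alert: SafetyAlert | null,
  acknowledge: (message: string) => boolean // must be an active choice by the user
): boolean {
  if (alert === null) return true;          // no friction needed
  if (alert.level === "hard") return false; // cannot proceed at all
  return acknowledge(alert.message);        // soft stop: pause and confirm
}
```

The design point is that the soft stop cannot be bypassed silently: proceeding requires a deliberate act, not a default.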

Features that preserve agency

To keep the human at the helm, AI tools should incorporate specific UI patterns, sketched in code after this list:

  • "Show me the source": Making source verification a required step for high-risk recommendations.
  • "Agree / Override": Requiring a reason for accepting or rejecting a complex suggestion.
  • Disconfirming cues: An AI should list not just why a diagnosis fits, but what would make it wrong (e.g., “Consider pulmonary embolism, unless the D-dimer is negative in a low-risk patient”).
  • Actionable confidence: Instead of a generic disclaimer, use specific warnings like "Limited evidence in paediatric populations."
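Here is a minimal sketch of how these four patterns might combine in a single recommendation object. Every name is hypothetical, chosen to mirror the list above rather than any real product’s API.

```typescript
// Hypothetical recommendation carrying its own provenance,
// disconfirming cues, and an actionable confidence note.

interface Recommendation {
  suggestion: string;
  sources: string[];            // "show me the source"
  disconfirmingCues: string[];  // what would make this wrong
  confidenceNote: string;       // e.g. "Limited evidence in paediatric populations"
  highRisk: boolean;
}

interface ClinicianResponse {
  action: "agree" | "override";
  reason: string;               // required for accepting or rejecting
  sourcesOpened: string[];      // citations the clinician actually viewed
}

function recordDecision(rec: Recommendation, resp: ClinicianResponse): void {
  if (rec.highRisk && resp.sourcesOpened.length === 0) {
    throw new Error("Source verification is required for high-risk recommendations");
  }
  if (resp.reason.trim().length === 0) {
    throw new Error("Agree/Override requires a documented reason");
  }
  // ...persist the decision with the clinician's reasoning attached
}
```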

6. Cognitive offloading vs cognitive atrophy

The risk: offloading thinking rather than tasks

Cognitive offloading is beneficial when it reduces working memory burden (like using a checklist). It becomes dangerous when it offloads the clinical reasoning itself (PMC).

| Good Offloading (Task Support) | Bad Offloading (Agency Erosion) |
| --- | --- |
| Locating specific guidance | Outsourcing differential formation |
| Summarising treatment options | Outsourcing risk stratification judgement |
| Reminding about contraindications | Outsourcing "stop/go" escalation decisions |
| Structuring documentation | Copy-pasting conclusions without checking |

The design goal

The goal of clinical AI should be to offload the clerical burden, not the clinical conscience.

7. A clinician’s “Human-at-the-Helm” checklist

The 30-second operating rule

Use AI to retrieve and structure information, never to decide. Always verify the recommendation against the linked source, especially for high-stakes decisions like prescribing or discharge.

The 5-point agency protocol

  1. State the clinical question precisely (define population and setting).
  2. Check the provenance (where did this answer come from?).
  3. Run a red-flag scan (what must not be missed?).
  4. Seek disconfirming evidence (what would change my management?).
  5. Document your reasoning (use AI to help structure the note, but ensure the judgement is yours).
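Taken together, the five steps lend themselves to a structured record. Below is a hedged sketch of what that record might look like; the shape is invented for illustration, not a documentation standard.

```typescript
// Hypothetical record of the 5-point agency protocol. Each field maps
// to one step, so an incomplete record is visible at a glance.

interface AgencyProtocolRecord {
  clinicalQuestion: string;        // 1. population and setting defined
  provenance: string[];            // 2. where the answer came from
  redFlagsConsidered: string[];    // 3. what must not be missed
  disconfirmingEvidence: string[]; // 4. what would change management
  clinicianReasoning: string;      // 5. the judgement, in the clinician's own words
}

// AI may help structure the note, but every step must be completed
// by the clinician before the record counts as done.
function isComplete(r: AgencyProtocolRecord): boolean {
  return (
    r.clinicalQuestion.length > 0 &&
    r.provenance.length > 0 &&
    r.redFlagsConsidered.length > 0 &&
    r.disconfirmingEvidence.length > 0 &&
    r.clinicianReasoning.length > 0
  );
}
```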

Conclusion

The goal of clinical AI is not to replace judgement, but to make the clinician more present—more informed, more deliberate, and more in control.

Tools like iatroX are built to keep the clinician "at the helm" via grounded retrieval, traceable sources, and deliberate verification workflows. We must move beyond the passive safety of being "in the loop" to the active responsibility of driving the decision.

