Introduction
We are past the point of asking if doctors should use AI. You probably already are. You might use ChatGPT to draft a difficult email to a colleague or to polish a discharge summary. But the moment you type a clinical question—like "dose of gentamicin for 60kg female with eGFR 35"—you enter a safety minefield.
In 2025, the market has split into three distinct categories: the Generalist, the Academic, and the Clinical Co-Pilot. This article compares the three heavyweights—ChatGPT, OpenEvidence, and iatroX—to answer one question: which one is actually safe for making decisions about patients?
1. The "generalist" problem: ChatGPT & Grok
The Pro: Incredible fluency. ChatGPT is a master of language. If you need to rewrite a patient letter to sound "more empathetic" or draft a complaint response, it is hard to beat. General-purpose chatbots like ChatGPT and Grok tend to handle tone and context better than dedicated medical tools.
The Con: "Hallucination Roulette." ChatGPT is a probabilistic engine. It predicts the next likely word, not the next true fact. It doesn't "know" the NICE guideline for hypertension; it knows what the internet generally says about hypertension. This means it can confidently invent a drug dose or cite a guideline that doesn't exist.
The Verdict: Use for communication, not calculation.
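To make the "next likely word, not the next true fact" point concrete, here is a minimal Python sketch of weighted next-word sampling. It is a toy under stated assumptions, not how any particular chatbot is implemented: the candidate words and their probabilities are invented for illustration and carry no clinical meaning.

```python
import random

# Hypothetical probabilities a language model might assign to the next word.
# Labels and numbers are made up for illustration; they are not clinical data.
next_word_probs = {
    "plausible_dose": 0.55,
    "another_plausible_dose": 0.30,
    "invented_dose": 0.15,  # wrong, but still has a non-zero chance
}


def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Ask the same "question" 1000 times with a fixed seed for repeatability.
rng = random.Random(0)
samples = [sample_next_word(next_word_probs, rng) for _ in range(1000)]
counts = {word: samples.count(word) for word in next_word_probs}
print(counts)
```

The sampler never checks whether a word is correct. It only follows the weights, so the invented option still appears in a share of the answers, delivered with the same fluency as the plausible ones.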
2. The "academic" heavyweight: OpenEvidence
The Pro: It’s the "Google Scholar" of AI. OpenEvidence is rigorous. It draws only on peer-reviewed papers and provides answers that are deeply grounded in the medical literature. It rarely fabricates because it is constrained to high-quality data.
The Con: It can be too dense for the ward. When you ask for a first-line antibiotic, you often get a literature review of three different trials rather than a simple "Amoxicillin 500mg TDS." Crucially for UK clinicians, it is often US-centric, prioritising FDA approvals and American guidelines over NICE or the BNF.
The Verdict: Use for deep research and complex, rare cases where standard guidelines don't apply.
3. The "clinical co-pilot": iatroX
The Pro: The "Grounded" Middle Way. iatroX is designed to sit between the fluency of ChatGPT and the rigidity of OpenEvidence.
- UK-First: Unlike ChatGPT, it doesn't just "know" medicine; it retrieves specific UK guidance (NICE/CKS/SmPC).
- Ward-Ready: Unlike OpenEvidence, it answers in "Bullet Points" designed for a 2-minute corridor decision, not a 20-minute library session.

The Killer Feature: Automatic CPD. iatroX is the only tool in this list that automatically logs your query as a CPD entry. It turns your daily curiosity into evidence for your appraisal, saving you hours of admin time.
The Verdict: The daily driver for the ward.
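
The difference between "knowing" medicine and retrieving specific guidance can be sketched in a few lines of Python. This is a hedged illustration of the general retrieve-then-answer idea, not iatroX's actual implementation: the `Snippet` type, the `CORPUS` contents, and both function names are hypothetical, and the guideline text is placeholder content rather than real guidance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snippet:
    source: str  # name of the document the text came from (placeholder here)
    topic: str   # keyword used for the toy lookup
    text: str    # the passage to quote back


# Hypothetical hand-written corpus; every entry is placeholder text.
CORPUS = [
    Snippet("Guideline A (placeholder)", "hypertension", "Placeholder recommendation text."),
    Snippet("Guideline B (placeholder)", "urinary tract infection", "Placeholder recommendation text."),
]


def retrieve(question: str, corpus: list[Snippet]) -> Optional[Snippet]:
    """Return the first snippet whose topic appears in the question, else None."""
    lowered = question.lower()
    for snippet in corpus:
        if snippet.topic in lowered:
            return snippet
    return None


def grounded_answer(question: str, corpus: list[Snippet] = CORPUS) -> str:
    """Answer only from a retrieved snippet; refuse when nothing matches."""
    hit = retrieve(question, corpus)
    if hit is None:
        return "No matching source found."
    return f"{hit.text} [Source: {hit.source}]"


print(grounded_answer("First-line treatment for hypertension?"))
print(grounded_answer("Management of a gout flare?"))
```

The design choice worth noticing is the refusal branch: when no source matches, the function says so instead of generating a fluent guess, and every answer it does give carries the name of the document it came from.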
Summary table: the safety traffic light
| Task | ChatGPT / Grok | OpenEvidence | iatroX |
|---|---|---|---|
| Writing Letters / Emails | 🟢 Best | 🔴 Too Dry | 🟡 Good (but structured) |
| Checking Doses (BNF) | 🔴 Unsafe (Hallucination Risk) | 🟡 Safe (but US units possible) | 🟢 Safe (Links to BNF) |
| Researching Rare Diseases | 🟡 Good for ideas | 🟢 Best (Deep Lit Search) | 🟡 Good (Guideline focus) |
| UK Guideline Check (NICE) | 🔴 Unreliable | 🟡 Variable (US bias) | 🟢 Best (Native Integration) |
| Logging CPD | 🔴 No | 🔴 No | 🟢 Automatic |
Conclusion
If you want to write a poem about cardiology, use ChatGPT. If you want to know the latest trial data on a rare lymphoma, use OpenEvidence. If you want to know what to prescribe for a UTI in a pregnant patient at 3 AM in a UK hospital—and get CPD points for checking—use iatroX.
