OpenEvidence vs ChatGPT-5 vs Medwise AI (plus iatroX): a UK clinician’s guide to AI clinical tools under DTAC & NICE ESF


Executive summary

The landscape of AI tools aimed at clinicians is expanding at an unprecedented rate. In 2025, OpenEvidence announced a $210 million Series B funding round, citing daily use across over 10,000 US care centres. OpenAI’s powerful ChatGPT-5 has launched with stronger reasoning and multimodal capabilities. And closer to home, Medwise AI is positioning itself as a UK-centred clinical search tool that can integrate local NHS guidelines (PR Newswire, OpenAI, Medwise AI).

For UK healthcare providers, this wave of innovation offers immense potential to improve efficiency and augment decision-making. However, any tool considered for deployment must be evaluated through the rigorous lens of UK governance. This article provides a practical guide for clinicians and managers on how to assess these emerging platforms using the mandatory frameworks of the NHS DTAC (Digital Technology Assessment Criteria) and the NICE Evidence Standards Framework (ESF) before any clinical use (NHS Transformation Directorate, NICE).

Who’s who (quick profiles with UK relevance)

OpenEvidence

  • What it is: An AI-powered medical search engine that provides generative summaries of clinical evidence for clinicians, with a heavy emphasis on speed and clear links to source material.
  • Notable signal: Its recent $210 million Series B funding, co-led by GV (Google Ventures) and Kleiner Perkins, and its reported daily use by 40% of US physicians establish it as a major international benchmark for AI-assisted evidence search (PR Newswire, Fierce Healthcare, OpenEvidence).
  • Why UK readers care: It represents a powerful, well-funded US model against which UK-focused tools can be compared.

ChatGPT-5 (OpenAI)

  • What it is: The latest, most powerful version of the general-purpose model from OpenAI, featuring improved reasoning, advanced coding capabilities, and multimodal (text, image, voice) inputs and outputs.
  • Why it matters in care settings: It is exceptionally good at rapidly prototyping tools and drafting content like patient letters, checklists, and teaching prompts. However, as a generalist model, it requires significant clinical validation and a robust governance wrapper before it can be safely used with patient data or within an EHR (OpenAI).

Medwise AI

  • What it is: A clinician-facing search platform designed specifically for the UK, with the key capability of retrieving information from both national sources and a Trust’s own local guidelines and documents.
  • Evidence & traction: The platform has been highlighted in Innovate UK case studies and is undergoing pilots to validate its ability to retrieve local guidance effectively. Its enterprise tier promises EHR integration and makes a vendor-claimed "zero risk of hallucinations" guarantee by strictly grounding answers in its curated sources (Medwise AI, Innovate UK Business Connect, PMC).

iatroX (disclosure)

  • What it is: iatroX provides UK clinicians with evidence-linked Q&A (Ask iatroX) and a structured thinking tool for differential diagnosis (Brainstorm). These features are designed for educational and reference purposes, not for live, patient-specific diagnostic advice. Our team is actively exploring partnerships to broaden trusted content coverage and deepen workflow integrations.

UK buyer’s rulebook (what good looks like)

Before procuring any AI tool, UK practices and Trusts must ensure it meets two key national standards:

  1. DTAC (Digital Technology Assessment Criteria): This is the NHS baseline for all digital health tools. It provides a comprehensive assessment framework covering clinical safety, data protection, cybersecurity, interoperability, and usability. Always ask a vendor for their completed DTAC pack as a first step in procurement (NHS Transformation Directorate).
  2. NICE Evidence Standards Framework (ESF): This framework sets out the evidentiary expectations for digital health technologies. Use the ESF to gauge the credibility of a tool's claims about its clinical and economic value, and to understand whether it may be subject to a NICE Early Value Assessment (EVA) or other guidance routes (NICE).

Comparison matrix (what to evaluate)

Knowledge Provenance
  • OpenEvidence: Grounded in peer-reviewed literature with inline citations.
  • ChatGPT-5: General model; requires explicit retrieval-augmented grounding for safety.
  • Medwise AI: Retrieves from national and local guidelines; claims zero hallucination.
  • iatroX: Strictly grounded in curated UK guidelines and research, with source links.

Local Guideline Coverage (UK)
  • OpenEvidence: No native local coverage; focuses on global literature.
  • ChatGPT-5: No local coverage unless specifically fine-tuned.
  • Medwise AI: Key feature: designed to ingest and search local Trust documents.
  • iatroX: Focused on national UK guidelines (NICE, CKS, BNF).

Integration
  • OpenEvidence: Primarily a standalone web/mobile app.
  • ChatGPT-5: Highly flexible API for integration into other apps.
  • Medwise AI: Enterprise tier offers EHR, SSO, and API availability.
  • iatroX: Standalone web/mobile app; exploring partnerships for integration.

Claims & Evidence
  • OpenEvidence: Cites large-scale US adoption (10k+ centres) and funding as validation.
  • ChatGPT-5: General model; performance depends on the specific use case and implementation.
  • Medwise AI: Cites Innovate UK case studies and pilot data on time-saving.
  • iatroX: Focuses on accuracy and reliability metrics within its UK-guideline "walled garden".

Regulatory Posture
  • OpenEvidence: Positioned as an information tool.
  • ChatGPT-5: Not a medical device itself; any clinical app built on it may be.
  • Medwise AI: Provides DTAC readiness statements; positioned as an information tool.
  • iatroX: Positioned as an educational/reference tool, not a medical device.
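The comparison notes that a general model such as ChatGPT-5 "requires explicit retrieval-augmented grounding for safety". A minimal sketch of that idea is shown below; it is a toy illustration, not any vendor's actual pipeline (real systems use embedding-based search over a curated, versioned guideline corpus, and the corpus entries here are invented placeholders):

```python
# Toy retrieval-augmented grounding sketch: rank guideline passages by
# naive word overlap, then build a prompt that restricts the model to
# answering only from the numbered, attributed sources.

def retrieve(query: str, corpus: list[dict], k: int = 2) -> list[dict]:
    """Return the k passages with the greatest word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_words & set(p["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, passages: list[dict]) -> str:
    """Assemble a prompt that forces citation of the retrieved sources."""
    sources = "\n".join(
        f"[{i + 1}] ({p['source']}) {p['text']}" for i, p in enumerate(passages)
    )
    return (
        "Answer ONLY from the numbered sources below and cite them as [n]. "
        "If the sources do not answer the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

# Placeholder corpus for illustration only.
corpus = [
    {"source": "Local Trust guideline v3.1", "text": "First-line management of X is Y."},
    {"source": "National guidance 2024", "text": "Review Z annually in adults."},
]
query = "What is first-line management of X?"
prompt = build_grounded_prompt(query, retrieve(query, corpus))
```

The governance point is that the grounding and the refusal instruction live in the application wrapper, not in the model, which is why a generalist model needs this layer before clinical use.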

Role-based use cases (primary care, hospital, pharmacy)

  • GP/registrar: For a 30-second check of a management step, compare an OpenEvidence or Medwise AI output directly against the relevant NICE CKS page or local pathway before taking action. Use iatroX Brainstorm for tutorial-style differential diagnosis structuring in a learning environment.
  • Hospitalist/ACP: Draft a discharge summary using ChatGPT-5's powerful language capabilities, then use Medwise AI or OpenEvidence to verify medication doses and interactions against the BNF or Trust guidance, ensuring all sources are recorded in the final note.
  • Pharmacist: Perform rapid formulary look-ups using Medwise AI (for local documents) and cross-check safety information using its national guideline access. Log the query and citations for a clear audit trail.

Implementation checklist (for PCNs/Trusts)

  1. Define the use case: Are you looking for a search/Q&A tool, an educational aid, or a drafting assistant?
  2. Demand citations by default: Do not procure any "black box" summary tools.
  3. Collect the vendor’s DTAC pack and map their evidence claims against the NICE ESF evidence tiers.
  4. Run a 4–8-week pilot with clear KPIs: time-to-answer, concordance with guidelines, user satisfaction, and any safety flags.
  5. Maintain a human-in-the-loop sign-off process, version logging for the AI model, and a clear audit trail of its use.
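The KPIs in step 4 can be computed directly from a pilot query log. A minimal sketch follows, assuming each logged entry records the answer time, a clinician's concordance judgment against guidance, and any safety flag (the field names and sample values are illustrative, not a standard NHS schema):

```python
from statistics import mean

# Illustrative pilot log: one record per clinician query.
pilot_log = [
    {"seconds_to_answer": 28, "concordant": True,  "safety_flag": False},
    {"seconds_to_answer": 41, "concordant": True,  "safety_flag": False},
    {"seconds_to_answer": 35, "concordant": False, "safety_flag": True},
]

def pilot_kpis(log: list[dict]) -> dict:
    """Summarise the checklist KPIs: time-to-answer, guideline
    concordance, and safety flags raised during the pilot."""
    return {
        "queries": len(log),
        "mean_seconds_to_answer": round(mean(r["seconds_to_answer"] for r in log), 1),
        "concordance_rate": sum(r["concordant"] for r in log) / len(log),
        "safety_flags": sum(r["safety_flag"] for r in log),
    }

kpis = pilot_kpis(pilot_log)
```

Reviewing these figures weekly during the 4–8-week pilot gives the go/no-go evidence the checklist asks for, and the same log doubles as the audit trail in step 5.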

Risks & mitigations

  • Hallucinations/omissions: Prefer tools with inline sources and native UK guideline retrieval, like Medwise AI and iatroX. Always require clinician verification of outputs.
  • Data protection: Ensure a Data Protection Impact Assessment (DPIA) is completed. Avoid sending any patient-identifiable information to non-enterprise, public-facing endpoints. Check the vendor’s policies on handling patient data; note that "Protected Health Information (PHI)" is US HIPAA terminology, and UK processing is instead governed by UK GDPR and the common law duty of confidentiality.
  • Governance drift: Technology evolves quickly. Re-review a vendor's DTAC evidence when they release major upgrades and keep an eye on NICE ESF/EVA updates for shifting expectations on evidence standards.
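One inexpensive mitigation for the data-protection risk is an automated screen for patient identifiers before any text leaves the organisation. The sketch below flags NHS-number-like strings using the standard 10-digit format and modulus-11 check digit; a real deployment would screen far more (names, dates of birth, addresses) and is no substitute for a DPIA:

```python
import re

def is_valid_nhs_number(digits: str) -> bool:
    """Validate the NHS number modulus-11 check digit: weight the first
    nine digits 10 down to 2, take 11 minus (sum mod 11); a result of 11
    maps to 0, and 10 means the number is invalid."""
    if len(digits) != 10 or not digits.isdigit():
        return False
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    return check != 10 and check == int(digits[9])

def contains_nhs_number(text: str) -> bool:
    """Flag text containing a plausible NHS number, allowing the
    common '123 456 7890' spacing."""
    for match in re.finditer(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b", text):
        if is_valid_nhs_number(re.sub(r"[ -]", "", match.group())):
            return True
    return False
```

Text that trips the screen would be blocked from public-facing endpoints and routed for manual review rather than silently sent on.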

Conclusion & call-to-action

The new wave of AI clinical search tools offers immense potential, but for UK clinicians and healthcare organisations, the bar for safe adoption is set by the DTAC and NICE ESF frameworks, not by marketing gloss or US adoption metrics.

The most effective strategy is to start with a contained pilot of one evidence-grounded search tool (like OpenEvidence or Medwise AI) alongside an education-first LLM workflow (using ChatGPT-5 for drafting and iatroX Brainstorm/Ask for structured learning). Measure the outcomes, gather feedback, and then scale the use of these powerful new co-pilots deliberately and safely.

