MHRA AI Airlock & the value of regulatory sandboxes (2025): a playbook for UK health AI — with comparisons to the EU, US and Singapore

Executive summary

For innovators and adopters of clinical AI in the UK, 2025 is a year of regulatory momentum. The MHRA AI Airlock, the UK's regulatory sandbox for AI as a Medical Device (AIaMD), has moved from a successful pilot to an expanded second phase, creating a supervised pathway to test novel AI safely in real NHS settings. In parallel, new Post-Market Surveillance (PMS) regulations, effective from June 2025, have tightened the requirements for all device manufacturers to monitor real-world performance (GOV.UK).

For NHS leaders, this means that "sandbox" participation is becoming a credible "stamp of readiness" for new technologies. While participation is not a market approval in itself, the evidence, safety mitigations, and post-market surveillance patterns developed in the Airlock are designed to de-risk adoption and accelerate the subsequent NICE Early Value Assessment (EVA) and DTAC procurement pathways. This is happening as other major global regulators, from the EU and the US FDA to Singapore, are establishing their own frameworks for managing iterative and adaptive AI, creating a global conversation on how to innovate safely.

What the MHRA AI Airlock is (and isn’t)

The AI Airlock is a time-boxed, supervised environment where manufacturers of novel AIaMD can run controlled evaluations of their products within a real NHS context. Its purpose is to identify and address regulatory gaps and challenges before a tool is released for wide-scale use. The outputs are not just for the manufacturer; they include public product reports and cross-regulator learnings that help to shape future MHRA guidance (GOV.UK).

Crucially, the Airlock is not a shortcut to approval, nor is it a replacement for the established assurance routes of NICE or the DTAC. It is a collaborative space to generate the very evidence that those other bodies will require.

What Airlock participation delivers

  • An evidence playbook: The process generates real-world testing plans and templates for Post-Market Surveillance (PMS), including dashboards that can track model drift and AI/human agreement rates to mitigate automation bias.
  • Assurance patterns: It provides clearer lines on defining a product's intended use, its validation requirements, and how to manage changes for adaptive models, which helps to lower friction with NICE and NHS buyers later on.
  • A market signal: While not an approval, a published sandbox report is a credible “stamp of readiness” that can significantly shorten due-diligence cycles for procurement teams, especially when presented alongside DTAC and DCB artefacts.
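The PMS dashboard metrics mentioned above can be made concrete. The sketch below, in Python, shows two illustrative calculations: an AI/human agreement rate (a simple proxy for automation-bias monitoring) and a Population Stability Index (PSI) over model scores as a drift indicator. The metric choices, the 0.2 PSI alert threshold, and the function names are common-practice assumptions, not MHRA-specified requirements.

```python
from collections import Counter
import math

def agreement_rate(ai_labels, human_labels):
    """Fraction of cases where the AI output matched the clinician's final decision."""
    matches = sum(a == h for a, h in zip(ai_labels, human_labels))
    return matches / len(ai_labels)

def population_stability_index(baseline_scores, current_scores, bins=10):
    """PSI over model output scores, a common drift indicator.

    Values above ~0.2 are often treated as a prompt for investigation;
    that threshold is illustrative convention, not a regulatory rule.
    """
    lo = min(min(baseline_scores), min(current_scores))
    hi = max(max(baseline_scores), max(current_scores))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def bucket_proportions(xs):
        # Histogram the scores, with a tiny smoothing term so log() never sees zero.
        counts = Counter(min(int((x - lo) / width), bins - 1) for x in xs)
        total = len(xs) + bins * 1e-6
        return [(counts.get(i, 0) + 1e-6) / total for i in range(bins)]

    b = bucket_proportions(baseline_scores)
    c = bucket_proportions(current_scores)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))
```

In a deployment, the AI/human agreement rate would typically be tracked per review period, with the baseline score distribution frozen at the point of the sandbox evaluation.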

The UK “stamps” that actually unlock adoption

The Airlock is one part of a wider UK assurance ecosystem. The "stamps" that unlock procurement and deployment are:

  • NICE Early Value Assessment (EVA): This provides a conditional "use while evidence is generated" recommendation for promising but early-stage technologies. NICE has already issued EVAs for AI triage tools in dermatology and fracture detection.
  • DTAC (Digital Technology Assessment Criteria): The mandatory baseline for NHS procurement, covering clinical safety, data protection, security, interoperability, and usability.
  • DCB0129/0160: The legally mandated clinical risk management standards for the manufacturer (0129) and the deploying NHS organisation (0160).
  • Setting-specific guidance: For widespread use-cases like ambient scribing, NHS England has issued its own practical adoption rulebook.

How the UK compares internationally

  • European Union: The landmark EU AI Act classifies most medical AI as "high-risk" and mandates that every Member State establish at least one national AI regulatory sandbox by August 2026. The UK's Airlock is functionally similar, though it operates outside the formal EU regime.
  • United States (FDA): The FDA does not have a central sandbox. Instead, it has finalised its guidance on Predetermined Change Control Plans (PCCP). This allows a manufacturer to pre-agree with the regulator on a plan for how their AI will learn and change post-market, enabling iterative updates within a controlled and auditable framework.
  • Singapore: The Health Sciences Authority (HSA) is in the process of establishing its own AI-SaMD sandbox with a focus on public-sector providers, creating an instructive model for rapid, state-backed evaluation.

Who should apply to the Airlock—and what to bring

The Airlock is best suited for novel or adaptive AIaMD where the current regulatory pathways do not fully cover the product's learning behaviour or intended use. This includes many ambient scribing tools and multi-modal diagnostic aids. A strong application requires a crisp intended use statement, a clear validation plan, a policy for managing model updates, and draft PMS metrics, ideally designed to dovetail with a future NICE EVA.

Designing evaluations that travel

To maximise the value of a sandbox pilot, the evaluation must be designed with the next steps in mind. The endpoints should align with what NICE will require for an EVA: accuracy and sensitivity where applicable, but also operational metrics like time-to-decision, minutes saved, and robust equity analyses. The outputs—public reports, dataset descriptors, and audit trails—will raise buyer confidence and shorten subsequent DTAC cycles.
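The endpoints listed above can be sketched as simple metric functions. The Python below computes sensitivity, a per-subgroup sensitivity breakdown (the core of an equity analysis), and a median time-to-decision; the record layout and function names are illustrative assumptions, not a NICE-prescribed format.

```python
from statistics import median

def sensitivity(y_true, y_pred):
    """True positive rate: of the truly positive cases, how many did the tool flag?"""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else float("nan")

def subgroup_sensitivity(records):
    """Equity analysis: sensitivity per subgroup, so performance gaps are visible.

    Each record is (group, y_true, y_pred); the grouping field is illustrative
    (e.g. an age band or ethnicity category from the evaluation dataset descriptor).
    """
    groups = {}
    for g, t, p in records:
        ts, ps = groups.setdefault(g, ([], []))
        ts.append(t)
        ps.append(p)
    return {g: sensitivity(ts, ps) for g, (ts, ps) in groups.items()}

def median_time_to_decision(minutes):
    """Operational endpoint: median minutes from case arrival to clinical decision."""
    return median(minutes)
```

Reporting these per subgroup alongside the headline accuracy figures is what lets a subsequent EVA or procurement review check that performance is equitable rather than averaged away.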

Case signals & policy context (2025)

The selection of the Phase 2 cohort, announced in October 2025, highlights the regulatory challenges the MHRA is focused on. The inclusion of tools for AI note-taking, cancer pathology, eye disease detection, and blood-test interpretation shows a clear focus on where intended use definitions and post-market surveillance are most pivotal (Digital Health). This is happening as NHS England's ambient AI guidance sets a practical template for how to manage the clinical safety and legal compliance of these tools, even outside of a formal scribing use-case.

Risks & Misconceptions

  • "Sandbox = approval." It is not. Treat the Airlock outputs as a source of evidence and a set of design patterns, not as a marketing authorisation. You will still need to complete the usual UKCA/approval route and satisfy NICE and DTAC requirements.
  • Under-scoped PMS. The learnings from the Airlock are clear: for adaptive AI, a plan to monitor for model drift and human-AI agreement is not optional; it is a core safety requirement.
  • UK vs EU divergence. While the UK's sandbox can often move faster procedurally, the EU AI Act's sandboxes will become the standard on the continent. Companies operating in both markets should plan to generate evidence packs that can satisfy both regimes.

FAQs

  • Does the AI Airlock replace the need for UKCA/MHRA approval?
    • No. It is a process to inform regulation and evidence generation. It does not replace the standard UKCA/approval route, nor the need to satisfy NICE and DTAC.
  • What does the FDA do instead of a central sandbox?
    • The FDA's Predetermined Change Control Plan (PCCP) allows manufacturers to pre-agree on a safety-governed plan for how their AI will change and adapt post-market.
  • Will the EU AI Act's sandboxes affect UK deployments?
    • If you operate in both the UK and Europe, yes. The AI Act requires national sandboxes and sets out extensive duties for any medical AI classified as "high-risk." It is wise to plan to meet both regulatory regimes.
