Half a million people have used DxGPT. Over 6,000 doctors in Madrid's public health system have access to a localised version integrated into their electronic medical records. It has been piloted in primary care, is expanding into two major hospitals, and was unveiled personally by Microsoft's CEO. For a free tool built by a non-profit foundation, those are remarkable numbers.
Meanwhile, clinicians worldwide are quietly typing symptoms into ChatGPT, Gemini, and Perplexity for diagnostic support — often getting impressively coherent but unverifiable responses. GMC research found that over a quarter of UK doctors have used some form of AI in practice, with many turning to general-purpose tools for clinical queries.
The question that matters is not "does AI help with diagnosis?" — it clearly can. The question is whether a purpose-built diagnostic tool like DxGPT outperforms a general-purpose model like ChatGPT, and what the practical implications are for clinicians.
What DxGPT Actually Is
DxGPT is a GPT-4-based clinical decision support tool developed by Foundation 29, a non-profit founded by a Microsoft software engineer whose son was diagnosed with Dravet syndrome after a year-long diagnostic odyssey. The tool is designed specifically for differential diagnosis generation — you input a clinical description, and it returns a ranked list of five possible diagnoses with reasoning, compatible symptoms, and recommended investigations.
It is free, multilingual, GDPR-compliant with automatic anonymisation, and explicitly positioned as decision support rather than a medical device. It runs on Microsoft Azure, using GPT-4o and o1 models that have been benchmarked against both public and private healthcare datasets.
Its particular strength is rare diseases — the clinical territory where diagnostic delays average five to six years and where individual clinicians are least likely to have seen the presentation before. A medRxiv pre-print evaluation found that DxGPT's top-5 diagnostic accuracy was comparable to that of hospital clinicians on complex paediatric and rare disease cases.
How ChatGPT Compares
ChatGPT is not designed for clinical diagnosis. It is a general-purpose language model that can generate plausible differential diagnoses when prompted with clinical information — but with no grounding in a curated medical dataset, no citation to primary sources, no rare-disease optimisation, and no clinical safety architecture.
The practical differences for a clinician are significant. DxGPT is optimised for the diagnostic reasoning task: it generates structured, ranked differentials with reasoning chains. ChatGPT produces free-text responses that may contain diagnostic suggestions alongside caveats, tangential information, and potentially hallucinated content. DxGPT's performance also improves markedly with richer clinical input: structured, detailed prompts yield better results. ChatGPT is more tolerant of vague input but correspondingly less precise in its output.
DxGPT provides a focused tool for a focused job: widening the diagnostic net, particularly for rare and complex presentations. ChatGPT provides a broad tool that can be prompted to do diagnostic reasoning but without the clinical focus, validation, or safety architecture.
What UK Clinicians Should Know
DxGPT is not currently integrated into any UK clinical system. It is available for free on the web (dxgpt.app), but it has not undergone MHRA assessment, DTAC evaluation, or NHS governance review. For UK clinicians, it is best understood as a brainstorming tool — a way to generate a broader differential when you are stuck — rather than a clinical decision support system integrated into your workflow.
The most valuable UK-specific use case is the diagnostic challenge: the patient with vague, multi-system symptoms who has seen three specialists without a diagnosis. Inputting a detailed clinical description into DxGPT can surface rare conditions that the clinician might not have considered. But the output must be verified against trusted sources before any clinical action.
iatroX can serve that verification function. When DxGPT suggests a diagnosis, the clinician can check the NICE/CKS management pathway via Ask iatroX, verify whether the suggested condition fits UK referral criteria, and ground the AI's suggestion in evidence before acting. The Brainstorm tool can then help structure the clinical reasoning for the suggested differential — working through the case step by step to determine whether the DxGPT suggestion is clinically plausible.
The combination is powerful: DxGPT widens the net; iatroX verifies the catch.
Should You Use DxGPT or ChatGPT?
For diagnostic brainstorming on complex or rare presentations: DxGPT. It is purpose-built, optimised for the task, and has at least preliminary clinical validation.
For general clinical queries, guideline retrieval, and day-to-day clinical reference: neither. Use a guideline-grounded tool like iatroX that provides citation-first answers linked to NICE, CKS, SIGN, and BNF content.
For non-clinical tasks (administrative writing, teaching preparation, communication drafting): ChatGPT is fine, with the usual caveats about verification.
The key principle remains: use the right tool for the right job, verify every clinical output against trusted sources, and never outsource the judgement that your patients depend on.
Conclusion
DxGPT represents something genuinely interesting: a purpose-built diagnostic AI with real-world deployment at scale, a non-profit mission, and a focus on the hardest diagnostic challenges in medicine. With 500,000 users and 6,000 doctors in its Madrid deployment, it is arguably the largest European pilot of an AI diagnostic assistant to date.
It is not a replacement for clinical reasoning. It is a cognitive forcing function — a way to widen the differential and surface possibilities you might not have considered. For UK clinicians, it works best in combination with a guideline-grounded reference like iatroX that can verify and contextualise the diagnostic suggestions within UK clinical practice.
The era of using general-purpose chatbots for diagnosis is ending. The era of purpose-built, clinically validated diagnostic AI is beginning. DxGPT is one of the earliest credible entrants.
