Why Healthcare Models Have to Explain Themselves
Usman Ghani · January 18, 2026 · 2 min read
Healthcare · Machine Learning · Ethics
In late 2023 I co-authored a paper at ACIT on explainable AI for healthcare. The premise was simple: medical models keep posting state-of-the-art accuracy numbers on benchmark datasets, and adoption is still glacial. Why?
The short answer: clinicians are not allowed to trust black boxes, and they shouldn't be.
What "explainable" actually has to mean
In a tech-Twitter sense, "explainability" usually means a heatmap. The model classified this chest X-ray as pneumonia; here's a Grad-CAM showing which pixels mattered. Useful, but insufficient. A clinician needs three things from an explanation:
- Locality — what specifically in this image (or this patient's history) drove the call?
- Generality — does the feature the model is using correspond to something pathophysiologically meaningful, or is it leaking from artifacts (scanner type, patient positioning, scan-room demographics)?
- Counterfactuals — what would have to change for the prediction to flip?
A heatmap gives you (1). It doesn't give you (2) or (3). And without (2) and (3), the explanation is decorative.
The accuracy–interpretability frontier isn't fixed
A common framing is that there's an unavoidable trade-off: linear models are interpretable but weak; deep nets are accurate but opaque; pick your poison.
In practice, the frontier is much more flexible than that:
- Sparse, monotonic gradient-boosted trees can match deep nets on tabular medical data while remaining inspectable per-feature.
- Concept bottleneck models force a deep net to route its prediction through a layer of human-named medical concepts ("ground-glass opacity," "consolidation"), so explanations are in clinical vocabulary.
- Prototype networks classify by similarity to training examples a clinician can pull up and look at.
The trade-off exists, but you can usually buy a lot of interpretability for very little accuracy if you design for it from the start, instead of bolting on SHAP at the end.
The piece I keep coming back to
The deepest version of this problem isn't technical. It's about the locus of responsibility.
When a clinician makes a wrong call, there's a well-developed system — review boards, malpractice law, professional norms — for assigning and absorbing that responsibility. When a model makes a wrong call, that machinery doesn't exist yet. And so a clinician using a black-box model is being asked to personally underwrite a recommendation they cannot interrogate.
You can't fix that with better algorithms. You fix it by building models the clinician can argue with.
That, I think, is the real bar for healthcare AI: not "is it accurate," but "can a thoughtful doctor disagree with it for a stated reason."
We're not there yet. But we're closer than the doom narrative suggests.