Elevating AI governance out of the white space

When I wrote last year about safety in the white space (the gap between what existing clinical safety frameworks explicitly cover and what AI technologies actually do), I felt optimistic. The MHRA AI Airlock had launched, we were developing local assurance frameworks, and conversations around the regulatory and liability questions for the use of AI in healthcare felt like they were moving forward.

Whilst the subsequent national conversation has moved more slowly than many would have hoped, the last 12 months of working with amazing colleagues in Greater Manchester have proven that my positivity was not misplaced. As we have built structures to support the use of AI tools in clinical practice, the framing of conversations has moved past asking “how do we develop a model of clinical governance for AI?” to the more sophisticated question: “can’t we stop treating AI as if its use were something other than a facet of our existing clinical governance?”

This (subtle) distinction matters because every time someone, whether an enthusiastic colleague or vendor, a cautious board member, or a well-meaning regulator, describes AI as a special case requiring its own bespoke governance regime, you can hear echoes of the mistakes that the NHS has made with pretty much every other technology. If we accept the case for AI exceptionalism we’ll be led either to build parallel lightweight processes that bypass the rigour we apply to everything else (the enthusiast’s error), or to create burdensome, AI-specific pathways that mean beneficial technology cannot be adopted (the sceptic’s error). Both of these options produce bad outcomes. Both can be avoided.

This piece sets out questions that I think could help us safely avoid both of these failure modes. I’ll follow up with a shorter companion piece describing a domain-based perspective on clinical governance architecture, looking at key areas relevant to AI that already exist in our current approaches.

Question 1: What is clinical governance actually for?

Clinical governance can get a bad press. In most of the organisations I’ve worked in, “governance meeting” isn’t a term that makes people sit up, smile and clear their diary… However, having spent time in GM helping to iterate a new framework for how we will deliver meaningful clinical governance, I increasingly see disengagement from clinical governance as a professional own-goal.

The original Scally and Donaldson 1998 BMJ paper defined clinical governance as:

"A framework through which NHS organisations are accountable for continuously improving the quality of their services and safeguarding high standards of care by creating an environment in which clinical excellence will flourish."

I think the word that should do the most work in that definition, but sadly is the most overlooked, is flourish. Governance exists to make good care possible, not only to prevent bad care from happening. These two outcomes are not equivalent or reciprocal. If we make the mistake of conflating these outcomes then we set ourselves the wrong target.

In any organisation, good clinical governance should help to deliver at least six things:

  • Assurance – that services are safe, effective, equitable.
  • Accountability – with clear ownership and authority so that decisions can be made.
  • Learning – using good and bad outcomes or experiences to help improve practice.
  • Legitimacy – rigour that helps maintain the public and professional trust that gives the NHS the social licence to operate.
  • Strategic coherence – keeping individual decisions aligned with the overall organisational plan to maximise the chance of success and manage opportunity cost.
  • Flourishing – enabling the conditions in which clinical excellence can emerge and be sustained.

An AI governance framework that focuses only or mainly on assurance of a technology will therefore be, at best, half of what is needed. It may stop the worst outcomes but it won’t produce the best ones. The framing of this question, when we think about AI or any other technology, needs to be closer to: “does our approach make good, safe (AI-enabled) care more likely to happen, or only make it less likely to go wrong if it does happen?”

The Francis Report into Mid-Staffordshire remains the foundational case study here. Smarter people than me have written about this, but a key message is that the failure in governance architecture occurred because oversight and assurance had become detached from the purpose of care.

Assurance without flourishing is a trap we know how to fall into. Good clinical governance can help us to avoid this with AI.

Question 2: Do we agree that AI is (mostly) not a special case?

There is a pervasive view, shared equally by enthusiasts and sceptics, that we should treat AI as being fundamentally different from everything else the NHS does. This instinct is (mostly) wrong, and should be resisted.

AI is a technology. Like every previous technology (pharmaceuticals, devices, electronic records, telemedicine), we can steward it within our existing clinical governance frameworks, with appropriate adjustments made to account for AI’s genuinely unique characteristics.

Just because something is new or exciting doesn’t mean the key questions that our governance process should ask need to change. When we look at new models of care or take decisions to use new medications, we apply consistent tests: Is it safe? Is it effective? Is it equitable? Is it affordable? Is it ethical? Is it reliable? Is it acceptable?

All of these questions apply to AI and none are unique to AI. The critical difference for clinical leaders, given the technical complexity, is how we develop the domain knowledge that we need to confidently answer these questions. Addressing this is about training and education, not about new structures and processes. A clinician who can critically appraise a new trial, drug or care pathway can, with the right support, safely appraise an AI tool. The underlying skills of weighing evidence and considering context, supported by broad-based professional judgement, do not change.

However, we also need to acknowledge that there are respects in which AI genuinely is dissimilar to other tools and technologies. Properly understood, these differences can be addressed and given attention within our existing frameworks, not outside them.

  • Dynamism. A drug doesn’t change after it has been accepted onto a formulary. An AI system can (continuous learning, data drift, model updates and so on). However, post-market surveillance and pharmacovigilance are not new ideas. The principle translates well to AI, although how we transact it may be different; a minimal sketch of what this might look like follows this list.
  • Opacity. Many AI systems are not fully explainable to their users. This creates genuine challenges for informed consent and professional accountability. However, this can be seen as a question of degree; we’ve long used pharmacological agents whose mechanisms aren’t fully understood by the clinicians prescribing them, but whose intended outcomes, effects and monitoring requirements are.
  • Scale and speed. A failing device typically affects one patient at a time. A failing AI system can affect thousands simultaneously. This is therefore a question of risk appetite, not of redesigned governance structures.
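To make the dynamism point concrete, here is a minimal sketch (in Python, using scikit-learn) of what rolling post-market monitoring of a deployed model might look like. The class, window size and tolerance are illustrative assumptions, not a recommended standard:

```python
# A minimal sketch of post-market AI surveillance, analogous to
# pharmacovigilance: track a deployed model's discrimination (AUC) on a
# rolling window of confirmed outcomes and flag drift against the
# performance accepted at deployment. Window size and tolerance are
# illustrative assumptions, not recommendations.
from collections import deque

from sklearn.metrics import roc_auc_score


class PerformanceMonitor:
    def __init__(self, baseline_auc: float, window: int = 500, tolerance: float = 0.05):
        self.baseline_auc = baseline_auc      # AUC accepted at deployment
        self.tolerance = tolerance            # acceptable absolute drop in AUC
        self.scores = deque(maxlen=window)    # model risk scores, most recent cases
        self.outcomes = deque(maxlen=window)  # confirmed outcomes (0 or 1)

    def record(self, score: float, outcome: int) -> None:
        """Log each scored case once its real-world outcome is known."""
        self.scores.append(score)
        self.outcomes.append(outcome)

    def check(self) -> str:
        """Recompute the rolling AUC and compare it with the accepted baseline."""
        if len(set(self.outcomes)) < 2:  # AUC needs both classes present
            return "insufficient data"
        auc = roc_auc_score(list(self.outcomes), list(self.scores))
        if auc < self.baseline_auc - self.tolerance:
            return f"ALERT: rolling AUC {auc:.2f} has drifted below baseline"
        return f"OK: rolling AUC {auc:.2f}"
```

The mechanism matters less than the principle: performance is re-measured continuously against what was accepted at deployment, exactly as we track a medicine after it enters a formulary.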

The Epic Sepsis Model case from 2021 helps bring this question to life. External validation, carried out after the model had been deployed in many US hospitals, found meaningfully worse performance than expected: an AUC of 0.63 compared to the vendor’s internal figure of 0.76–0.83. The model missed a substantial proportion of sepsis cases and generated large numbers of false-positive alerts. It’s easy to conclude that the problem was an underperforming model, but that is the wrong conclusion. The real issue is that the tool was deployed at scale without the external validation that almost any other clinical tool would have been subject to. Epic, to their credit, learned from this and have published tools to help do it.
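As an illustration of how modest that missing step could be: the sketch below (Python with scikit-learn; the function name, inputs and figures are hypothetical placeholders, and it assumes both outcome classes are present in the local data) compares locally measured discrimination and alert burden against a vendor’s published claims before a tool goes live:

```python
# A hypothetical pre-deployment validation check of a vendor risk model
# on local data, of the kind the Epic Sepsis Model case argues for.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def validate_locally(y_true, risk_scores, alert_threshold, vendor_auc):
    """Compare vendor-claimed discrimination with local performance and
    report the alert burden at the vendor's recommended threshold."""
    local_auc = roc_auc_score(y_true, risk_scores)
    alerts = (np.asarray(risk_scores) >= alert_threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, alerts).ravel()
    return {
        "local_auc": round(local_auc, 2),         # what the model does here
        "vendor_auc": vendor_auc,                 # what was claimed
        "sensitivity": round(tp / (tp + fn), 2),  # true cases actually caught
        "ppv": round(tp / (tp + fp), 2),          # alerts that were real
    }
```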

Question 3: What should good governance protect clinicians from?

Clinicians still carry the professional accountability and liability for AI-influenced decisions. Good clinical governance therefore needs to help protect staff from a number of well-documented traps.

Automation bias: often presented as a new concern related to AI, this idea was originally described decades ago in Parasuraman and Riley’s classic paper on automation complacency. Despite its significance, the concept is not taught in clinical training, leaving clinical staff unprepared to recognise the tendency to over-trust algorithmic outputs (especially under time pressure). Additionally, clinicians may under-investigate low-risk AI outputs compared with the scenario where no AI output is available.

Responsibility diffusion and lack of transparency: when AI is involved, clinicians may feel that responsibility is shared with the algorithm or the vendor. Legally and professionally, in England, this is almost certainly not presently the case: the clinician who accepts an AI recommendation remains accountable. The liability scenarios described in a 2019 JAMA article remain worth considering, especially in relation to the ‘black box’ problem.

Skill atrophy: if clinicians stop needing to interpret diagnostic results, or to formulate and triage differential diagnoses, those skills will diminish. This matters for patient safety because systems and networks fail, and no digital system has 100% availability.

Technology reshaping clinical pathways: our tools inevitably define our processes. A slow drift may occur where clinical workflows change to serve AI tools. Documentation standards, diagnostic definitions and approaches to patient interactions risk bending imperceptibly (but inexorably) towards serving the tool, rather than us ensuring that the tool works for the user. The resulting dependence and lock-in will not serve the NHS well in the future.

Displacing ethical decisions into technical decisions: when a moral choice (which patient gets prioritised, whose risk is acceptable) is encoded into an algorithm, the output acquires a false perception of neutrality. A clinician implementing the recommendation may feel they are making an objective numeric decision, when in fact they are executing an ethical choice made by non-clinicians (the algorithm’s designers).

A full answer to this question requires us to consider the reciprocity of clinical teams using AI tools. Good governance can protect clinicians but only if clinicians continue to exercise the judgement that governance makes possible.

Question 4: What should shadow IT tell us?

People are already using AI, whether for recording consultations, summarising information, drafting letters, searching evidence, or any one of the myriad other use cases. However, much of this involves consumer tools handling NHS information on infrastructure outside NHS control, in ways that raise real and substantial concerns about data governance, clinical safety and information governance. This needs to be addressed.

Often the instinctive organisational response is prohibition. This doesn’t work.

  • Prohibition without alternatives incentivises shadow IT to go further underground. The clinician who currently uses a given tool openly, and can be engaged, will become a covert user.
  • Prohibition punishes the wrong people. Someone using AI to draft a routine letter faster is probably responding rationally to workload pressures. Criticising the user misses the organisational failure to provide a sanctioned, supported alternative.
  • Prohibition assumes enforceability that does not exist.
  • Prohibition signals a lack of organisational seriousness. “AI is dangerous and you may not use it” is not a credible position when its use is endemic in other industries and people know it can help with at least some of their tasks.

An unwillingness to accept and address shadow IT is a signal of failing governance. Like workarounds in core systems, shadow IT can tell you what your workforce needs. The right response to this is straightforward:

  • Acknowledge the reality. Find out what is really happening, whether through anonymous surveys, workshops, or simple honest conversations. This helps create the conditions for good clinical governance to develop.
  • Provide supported alternatives. For as many useful shadow use cases as possible, find ways to offer a governed, supported, assured equivalent. This doesn’t mean like-for-like, but it does mean understanding user need. Taking the position that “no AI use is currently sanctioned” is not governance; it’s abdication and displacement of responsibility. The national work on testing and scaling Copilot is a live example of this being addressed.
  • Differentiate by risk. Not all shadow IT carries equal risk. A clinician brainstorming teaching plans or case studies with no patient data entered is lower-risk than one pasting identifiable information into a consumer web tool. Good clinical governance allows us to focus attention where the risk actually sits.
  • Educate. A clinician who understands why pasting patient data into a consumer LLM is a problem is a better governance asset than one who has simply been told not to.

The answer to this question is not elimination. It’s about first finding ways to identify and manage the drivers, gaps and conditions that lead to the adoption of shadow IT in the first place, and then about taking action to deliver sanctioned alternatives and better education, wrapped in responsive governance.

Question 5: How should governance and process redesign interact?

The biggest benefits of technology don’t come from inserting new tools into existing processes. They come from redesigning processes around what is now possible. I’ve written before that we often assess technologies that can be deployed into existing models of care very differently from technologies that change how care is delivered, and that the latter requires us to understand the success factors for delivery and operational change as much as the evidence of benefit of the new tool.

This highlights a limitation of where our governance structures typically sit in a design timeline. Often, governance operates after a pathway has been crafted, aiming to provide assurance that what is proposed is safe and effective. This is governance assuring design, not contributing to it.

For AI-enabled transformation, we need to think differently, with governance engagement and input from the outset. What we should be aiming for is a shared approach to designing transformation (bringing together clinical and operational insight), with governance functions acting as a design partner rather than a late-stage reviewer. Used in this way, governance informs our design constraints and design opportunities in the context of the broader organisational picture. It enables an iterative process in which governance contributes to the feedback loops that shape the design as it develops, and allows a built-in evaluation mechanism to be defined and agreed at the same time. This collaborative process can close the loop, creating the long-term learning function that defines good governance.

If we treat clinical governance only as a late-stage approval function, we will capture only incremental value from AI. To answer this question, we all need to think about how we integrate clinical governance into transformation design (and long-term learning) from the outset.

Question 6: Do we appreciate that vendor relationships are governance relationships?

This is the question I think is most often missed. Governance conversations frequently focus on the technical and ethical risks of adopting AI, but give much less attention to the commercial risks, which also need to be considered.

Gary Marcus and others have highlighted that most, if not all, AI products are sold by companies whose business models are not yet clearly sustainable. Where growth is funded by venture capital, commercial models will depend on rapid user acquisition, loss-leader pricing, and the hope of eventual profits based on scale and survival. For the NHS this may be acceptable as a defined, understood phase, but not as a permanent condition. Anyone who builds a clinical service around the current, artificially low cost models, without forward planning for what may happen when these change, is going to face a very painful transition.

A case study that makes this real is the collapse of Babylon Health in 2023. At one point valued at over $4 billion, and endorsed by the then Secretary of State for Health as “the future of healthcare”, the company was sold for £500,000 when it collapsed into administration. At that point, the fact that it had lost money on every UK patient it served became clear. Fortunately for patients, GP at Hand was sold and continued, but what if it hadn’t been? What if a hundred thousand NHS patients had woken up to find their GP service gone overnight? This emphasises the real link between clinical governance and commercial relationships.

Price escalation or loss of service when investor-subsidised pricing ends is not the only potential scenario. Acquisition by a hyperscaler that discontinues or repurposes the product; uncontrollable changes to the foundational model, where your contract is with a specialist but your underlying exposure is to OpenAI or Google; regulatory or geopolitical disruption: all are plausible, foreseeable risks. Clinical governance needs to be able to stress-test these scenarios before contracts are signed, not leave them to be discovered and discussed afterwards.

To me, this means that the dominant NHS model of vendor engagement (transactional procurement followed by arm’s-length contract management) is not going to be adequate. AI vendors are not selling static products. When we enter into an ongoing relationship with live, dynamic systems, we need to recognise this in clinical governance and use its strengths to focus on issues such as:

  • Partnership, not procurement. Creating joint responsibility for outcomes to drive shared governance. Timely engagement with clinical teams will allow the upfront challenge needed to avoid entering into contracts that offload all clinical risk to the healthcare team while reserving all intellectual property to the vendor.
  • Transparency is non-negotiable. Our governance structures need to be able to understand how the system works, what data it was trained on, how it performs elsewhere and how it is monitored. We need to help vendors understand that this is not about exposing commercially confidential information, but about being clear that business models need to be compatible with clinical governance duties.
  • Exit planning. With most, if not all, of our existing digital infrastructure we have failed to protect ourselves from vendor lock-in. This was a governance failure that very much happened slowly, then all at once. In this early stage of using AI in healthcare we have to use our governance structures to ensure we don’t make the same mistakes again. This means designing our pathways and contracts with business continuity, and their end, in mind.
  • Performance management as a clinical issue. AI system performance needs to be framed as a clinical quality issue, not just a matter of contract management. Clinical governance is the vehicle through which these relationships can be jointly and meaningfully owned by senior leaders and proactively reviewed with the same rigour as any other quality domain.

A practical action might be the development of a local AI Register, maintained in the same way as our drug formularies: listing every deployed AI system, its clinical owner, its performance status, and its contractual terms, to support effective quality, contract and relationship management.
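As a minimal sketch of what one entry in such a register might capture (in Python; the fields, status values and types are illustrative assumptions, and a real register would align with local formulary and contract-management conventions):

```python
# An illustrative register entry: each deployed AI system is recorded
# with a named clinical owner, a performance status and the contractual
# detail needed for exit planning. Fields are assumptions, not a standard.
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class PerformanceStatus(Enum):
    WITHIN_EXPECTED = "within expected range"
    UNDER_REVIEW = "under review"
    SUSPENDED = "suspended"


@dataclass
class AIRegisterEntry:
    system_name: str
    vendor: str
    clinical_owner: str                    # a named, accountable clinician
    intended_use: str                      # the governed, approved use case
    performance_status: PerformanceStatus
    last_performance_review: date
    contract_end: date                     # supports exit planning
    exit_plan_agreed: bool                 # business continuity in mind
    foundation_model_dependency: Optional[str] = None  # exposure beyond the vendor


# Reviewed on the same cycle, and with the same rigour, as a drug formulary.
ai_register: list[AIRegisterEntry] = []
```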

Finally…

To conclude, these questions aren’t a model for governance. They’re prompts for the conversation that needs to happen to allow us to govern AI with confidence, using the established approaches we already use to safely manage other complex tools, pathways and medicines. The questions should help us test whether we’ve understood what governance is for, whether we’ve resisted the temptation to treat AI as exceptional, and whether we’ve been honest about the risks (clinical, professional and commercial) that come with this new technology. I’m sure there are more, and I hope that if you’ve read this far you’ll share yours.

The next piece will move from questions to architecture, mapping the governance duties that I think we need to hold in view when working with AI technologies, anchoring these to the clinical governance frameworks that we already have. What we need to do to safely govern AI in healthcare isn’t new, but what we might need to do to discharge this duty perhaps is. 
