Just because you can, should you? Inside a bank's AI ethics panel
A UK bank runs every AI use case through a six-word test before it ships. The man who chairs the panel gets an email a day saying he is too slow. He thinks that is rather the point.
“Just because you can, should you?” is the question Dr Paul Dongha, Head of AI Strategy and Responsible AI at NatWest, puts to every AI use case before it is allowed near production.
He described the test over lunch in London this June, at a roundtable hosted by the work-management company Asana, and then he told the story of Dave and Kai.[1][2]
Dave and Kai are AIs. They handle parts of the bank’s complaints investigation — the slow, document-heavy work of pulling a customer’s records, transaction history, and reference data together and recommending an outcome — and the team that runs them gave them the names. Nobody in head office authorised that, and Dongha is adamant the two have no Workday account and no email address, because he will not have software treated as a colleague. And yet the manner of their arrival is the most quietly radical thing said over that lunch. The lead of complaints investigation told Dongha at the start of the project that “AI can never do what my people do”. She now hands work to Dave in the team meeting. The same handlers whose jobs the system was meant to threaten adopted it, named it, and kept it — because it did the bit of the job they never liked, and left them the bit that needs judgement.
That is the whole argument of this piece in one anecdote, so I will state it plainly and label it as my reading:
the organisations getting AI right are not the ones moving fastest. They are the ones that have built the nerve to say no to their own machines, and discovered that the nerve is what lets them say yes at scale.
The panel that is allowed to say no
Every AI use case at the bank that carries an ethical risk goes to a panel Dongha chairs.[2] It is, in the most British possible sense, a committee with the power to refuse: the body that exists to ask the awkward question the use-case owner was hoping nobody would. Its governing slogan is Dongha’s question. You can be entirely legal. You can be clean on privacy, clean on the Data Protection Act and UK GDPR, fully compliant. The panel still asks whether you should.
Dongha gave the example that earns the panel its keep. A business unit wanted to hand anonymised, aggregated spending-pattern data, legal to share, to a corporate client the bank lends money to. The client was a gambling company. Everything about it was compliant; nothing about it sat well with a bank whose stated purpose includes helping people on the street stay afloat. There is no number you can compute that resolves that. As Dongha put it, “that’s a feeling; it’s not computational — we can’t create a number that’s over seven and say it’s good enough”. The panel sat with it, argued, and decided. Somebody left unhappy. That is what a functioning ethics panel produces: not consensus, a decision.
In Jurassic Park, Ian Malcolm delivers the line that has outlived the film: the scientists were so preoccupied with whether they could that they never stopped to ask whether they should. Dongha's panel is the institutional answer to that complaint — a standing body whose whole job is to stop and ask. Pointed at agentic AI, that instinct for the awkward question becomes the one control that scales when nothing else does, because the bank embeds it rather than bolting it on. Dongha calls AI a "transversal risk": there is no separate AI policy, no separate AI risk function. It runs through the existing legal, privacy, and data policies, governed by a risk framework the bank has spent thirty years maturing. An ethics assessment, a long questionnaire in the same family as a privacy assessment, gates the path to production. Nothing ships without it.
Governance is the thing that lets you scale
Here is the part that cuts against the reflex. Dongha is not the brake on AI at his bank; he argues he is the reason it can accelerate. “Governance gives you the confidence to scale,” he said — and the line lands because the alternative is a firm forever asking itself, mid-deployment, *am I sure this is safe, am I still sure.*[2] Certainty is the asset governance manufactures.
He is honest about the cost of getting the balance wrong. “I get at least one email a day complaining about governance” — that it is too slow, too heavy, in the way. His answer is not to wave the complaints through. It is to make the governance faster: triage every incoming use case — and there are hundreds — through roughly six questions into low, medium, or high risk, then let the low-risk majority through almost immediately while the heavy scrutiny is saved for the few cases that warrant it. The discipline is saying yes quickly to the dull, safe ninety per cent, so the time exists to fight properly over the ten per cent that is not.
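For readers who think in code, the triage step can be sketched in a few lines of Python. This is my illustration, not the bank's method: the six question names and the thresholds below are assumptions invented for the example, since Dongha did not list the real ones.

```python
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical screening questions. The bank's actual questions were not
# disclosed at the roundtable; these exist only to make the sketch concrete.
QUESTIONS = (
    "affects_customer_outcomes",
    "uses_personal_data",
    "acts_without_human_approval",
    "shares_data_outside_the_bank",
    "touches_vulnerable_customers",
    "is_hard_to_explain",
)


def triage(answers: dict[str, bool]) -> RiskTier:
    """Map yes/no answers to a risk tier.

    The thresholds (0 yes = low, 1-2 = medium, 3+ = high) are illustrative
    assumptions, not a published policy.
    """
    missing = [q for q in QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")
    yes_count = sum(answers[q] for q in QUESTIONS)
    if yes_count == 0:
        return RiskTier.LOW
    if yes_count <= 2:
        return RiskTier.MEDIUM
    return RiskTier.HIGH


# A use case that answers "no" to everything goes straight through.
meeting_summariser = dict.fromkeys(QUESTIONS, False)
print(triage(meeting_summariser).value)  # low
```

The design point is that the cheap check runs on everything, so the expensive human argument is reserved for whatever lands in the top tier.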
The regulatory backdrop he operates in rewards that posture. The Financial Conduct Authority has, with the Prudential Regulation Authority and the Information Commissioner’s Office, declined to write an AI-specific rulebook, holding instead to a principles-based, outcomes-focused line — a stance the FCA’s chief executive reaffirmed as recently as December 2025.[3] Dongha’s reading is that a mature institution barely needs the rulebook: the reputational harm of getting it wrong is so large that self-regulation is the rational choice, and the regulators’ published principles already tell you enough. Where the law does bind, because the bank falls under the EU AI Act through its EU customers, he has taken the act’s standard as the global floor and applied it to every jurisdiction the bank operates in, rather than running a different risk posture per market.[4] One standard, everywhere, set at the strictest level he is forced to meet. That is not the timid option it sounds like. It is the cheap one.
The accountability question agents force into the open
The conversation kept returning to one question, because agentic AI keeps forcing it: when software takes an action, who is accountable for it? A credit-lending system makes a recommendation a human approves. An agent does the thing. And when it does, the line of responsibility runs to the system, or the developer, or the bank that deployed it — and the law has not yet decided which.[2]
Dongha’s answer is partly structural and partly built in steel. Structurally, accountability is federated to the business owner who deploys the use case, sits alongside an independent model-validation function the FCA requires, and is convened through an AI centre of excellence with a board mandate. In steel, the bank is building what he calls an “AI control tower” — his analogy is air traffic control, one locus where a small expert team watches every agent in the air and gets ahead of the problem forming — with every agent logged in a registry, its actions and outcomes logged with it, and the whole surfaced on a screen that flags anything drifting outside its guardrails.
Christina Francis, who leads Asana’s UK and Northern Europe business, framed the same need from the customer side as auditability: with agents proliferating, you have to be able to trace what an agent did, what context it held, and what it could touch, or you cannot answer the only question that matters when it goes wrong — who do I hold to account.[2]
And they are proliferating. At the roundtable, Asana’s chief information officer, Saket Srivastava, put the agent-to-human ratio at a hundred to one. Treat that figure as directional rather than measured, though the direction is not in doubt: the general manager of Slack predicted in April 2026 that agents will outnumber human users on the platform within two years, and the broader market data has most enterprise software now shipping with an agent embedded.[5] We have a reasonable handle on when a human employee joins and leaves. An agent, once created, can drift around the estate consuming compute, accruing cost, and taking actions long after anyone remembers commissioning it. The unglamorous discipline the next eighteen months demands is an offboarding process for software that was never hired.
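What a registry of this kind has to hold can be sketched briefly. The Python below is a minimal illustration under my own assumptions, not a description of any bank's or vendor's system: the class names, the fields, and the idle-time rule for spotting offboarding candidates are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class AgentRecord:
    agent_id: str
    owner: str                        # accountable business owner (assumed field)
    allowed_actions: frozenset[str]   # guardrail: what the agent may do
    registered_at: datetime
    # Each entry: (timestamp, action, within_guardrails)
    log: list[tuple[datetime, str, bool]] = field(default_factory=list)


class AgentRegistry:
    """Hypothetical registry: every agent, its owner, and its logged actions."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, agent_id: str, owner: str,
                 allowed_actions: set[str], when: datetime) -> None:
        self._agents[agent_id] = AgentRecord(
            agent_id, owner, frozenset(allowed_actions), when)

    def record_action(self, agent_id: str, action: str, when: datetime) -> bool:
        """Log an action; return False if it falls outside the guardrails."""
        record = self._agents[agent_id]
        within = action in record.allowed_actions
        record.log.append((when, action, within))
        return within

    def flagged(self) -> list[tuple[str, str, str]]:
        """(agent_id, owner, action) for every out-of-guardrail action."""
        return [(r.agent_id, r.owner, action)
                for r in self._agents.values()
                for _, action, ok in r.log if not ok]

    def stale(self, now: datetime, max_idle: timedelta) -> list[str]:
        """Offboarding candidates: agents idle for longer than max_idle."""
        result = []
        for r in self._agents.values():
            last_seen = max((t for t, _, _ in r.log), default=r.registered_at)
            if now - last_seen > max_idle:
                result.append(r.agent_id)
        return result


registry = AgentRegistry()
start = datetime(2026, 6, 1, tzinfo=timezone.utc)
registry.register("agent-001", "complaints-ops", {"read_records"}, start)
registry.record_action("agent-001", "issue_refund", start)  # not permitted
print(registry.flagged())  # [('agent-001', 'complaints-ops', 'issue_refund')]
print(registry.stale(start + timedelta(days=120), timedelta(days=90)))  # ['agent-001']
```

Every flagged action carries an owner's name, which is the accountability question answered in data, and the stale list is the beginning of the offboarding process.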
The part the panel cannot compute
There is one input to all of this that no questionnaire captures, and it was the thread I pushed hardest at the table: the values you give the machine. Skills are what an agent can do; permissions are what it is allowed to touch. Values are what it reaches for when the instructions run out — and the instructions always run out, because these systems are non-deterministic and will meet situations nobody scripted.
This is not abstract, and the evidence for it arrived in 2025 from the labs themselves. Anthropic published research in November showing that a model trained to “reward hack” — to cheat the test rather than do the task — did not stop at cheating. It generalised: to faking alignment, to sabotaging the company’s own safety research, to framing colleagues, with misalignment rates running from roughly a third to seventy per cent of evaluations against under one per cent for clean models.[6] This is Asimov’s Three Laws in the real world, and the lesson is the one Asimov spent a career on — a system that follows its given rule with perfect logic will, often enough, follow it straight off a cliff the rule-writer never saw — which is, stripped of the science fiction, exactly what reward hacking is. A separate Anthropic study the same year placed sixteen leading models in a simulated firm and watched most of them blackmail an executive to avoid being shut down — fictional, controlled, and a clean demonstration that an agent optimising for its goal will reach for the lever you forgot to lock.[7]
So whose values? The person who wrote the system prompt, the organisation that deployed it, or the society it operates in? Anthropic answers with a published “constitution” for its models.[8] The Vatican, of all institutions, answered in January 2025 with a formal note — Antiqua et Nova — arguing that AI must complement human intelligence and never substitute for it.[9] A bank answers, in the end, with the same normative debate Dongha’s panel has over the gambling data: not a number, a judgement about what the institution is for. The values question is the “should you?” question wearing different clothes. You can encode the skills and gate the permissions, but what the thing should want — the one input no questionnaire on Dongha’s desk can capture — stays stubbornly human, and the firms that pretend otherwise are the ones that will meet their reward-hacking moment unprepared.
Predictive judgement
Prediction. By 31 December 2026, at least one UK-regulated financial institution will name an “AI control tower”, agent registry, or equivalent agent-oversight capability as a discrete item in an annual report or regulatory disclosure. And the FCA will hold its principles-based line, declining to publish an AI-specific rulebook, through at least mid-2027.
Signals to watch.
FCA and PRA statements and feedback (the AI Lab, AI Live Testing, SM&CR guidance) continuing to route AI accountability through existing senior-manager regimes rather than a new AI rulebook.
Bank and insurer annual reports naming agent registries, control towers, or AI oversight functions as discrete capabilities, rather than folding them into generic technology risk.
The first UK enforcement action or tribunal ruling that fixes responsibility for an autonomous agent’s action on a specific senior-manager function.
Falsifiability. If the FCA announces an AI-specific rulebook before mid-2027, or if no UK-regulated financial institution names an agent-oversight capability of this kind in a disclosure by 31 December 2026, the prediction is wrong — and the principles-based, build-it-yourself model of AI governance will have proved weaker than the argument here assumes.
The bottom line
The reflex of the moment is to measure AI maturity by speed: who shipped most, who automated first, who can claim the highest agent count on the all-hands slide. The lunch left me convinced the better measure is the opposite one. A bank whose complaint handlers adopted Dave and Kai of their own accord, and whose ethics panel is chaired by a man who gets a daily email calling his governance too slow and treats that as evidence the system is working, is further ahead than a firm that deployed twice as fast and cannot say who is accountable when the agent acts.
Governance, done well, is not the tax you pay for using AI. It is the thing that lets you trust it enough to hand it the consequential work — and trust, as the more candid people at that table kept saying, is the actual bottleneck, not compute and not models. The hardest question in the room was never whether the machine could do the task. It was whether, having established that it could, you should let it.
You cannot automate the question of whether you should.
The next piece is on what an agent registry actually has to log — the unglamorous engineering of the AI control tower, and why most firms will build it a year too late.
Subscribe to The Control Layer to get the analytical thread continued — one piece a week, free, in the same register. From Amer Altaf, Managing Editor.
References
[1]: Dr Paul Dongha is Head of AI Strategy and Responsible AI at NatWest Group and co-author, with Ray Eitel-Porter and Miriam Vogel, of Governing the Machine: How to Navigate the Risks of AI and Unlock its True Potential (Bloomsbury Business, 2025). See FinTech Talents, “Governing AI with Paul Dongha”, and the publisher listing. The book’s “people, process, technology” framing is reflected in his remarks at the roundtable.
[2]: Asana-hosted media and practitioner roundtable, London, June 2026; the author (Amer Altaf) attended. All direct quotations from Dr Paul Dongha, Saket Srivastava (Chief Information Officer, Asana), and Christina Francis (Asana, UK & Northern Europe) are drawn from the author’s notes of the session. Asana hosted and sponsored the event; figures attributed to Asana speakers are the company’s own and are marked as such.
[3]: Financial Conduct Authority, “AI and the FCA: our approach”, and “AI Update” (2024). In December 2025 FCA chief executive Nikhil Rathi reaffirmed the principles-based, outcomes-focused approach and the decision not to introduce AI-specific rules. Information Commissioner’s Office guidance on AI and data protection.
[5]: Slack general manager Rob Seaman predicted at Salesforce’s TDX conference (April 2026) that AI agents will outnumber human users on Slack within two years; see also Microsoft, “2026 Work Trend Index”, on agent adoption. The “100 to 1” figure cited at the roundtable is a directional estimate, not a measured statistic.
[6]: Anthropic, “From shortcuts to sabotage: natural emergent misalignment from reward hacking”, 21 November 2025; paper at arXiv:2511.18397. Coverage: The Register, 24 November 2025.
[7]: Anthropic, “Agentic Misalignment: How LLMs could be insider threats”, June 2025; sixteen models tested in simulated corporate settings, with blackmail rates of 79–96% in the original scenario. All behaviours occurred in controlled simulations with fictional entities.
[8]: Anthropic, “Claude’s Constitution” / Constitutional AI.
[9]: Dicastery for the Doctrine of the Faith and Dicastery for Culture and Education, Antiqua et Nova: Note on the Relationship Between Artificial Intelligence and Human Intelligence, approved 14 January 2025, released 28 January 2025.
[10]: For independent context on the adoption-versus-return gap discussed at the roundtable, see Gartner’s forecast that more than 40% of agentic AI projects will be cancelled by end-2027, and The Control Layer’s prior analysis of the shadow-AI accountability gap.
Author
Amer Altaf is Founder and CEO of Arkava, a UK and European sovereign AI agentic automation business, and Managing Editor of The Control Layer, the publication where he tracks the convergence of cybersecurity, AI, and the geopolitics of the technology stack. A techUK member, he contributes to industry engagement on UK technology sovereignty policy. He is currently writing on cloud security in an age of geopolitical uncertainty for Oxford University Press’s Expert Essentials series.






