AI’s “human in the loop” isn’t

A moral crumple zone, an accountability sink, but not a supervisor.

Oct 30, 2024

A cowboy astride a racing horse, lassoing a man who is bound and gagged, his face a mask of panic. The cowboy’s head has been replaced with the staring red eye of HAL 9000 from Stanley Kubrick’s ‘2001: A Space Odyssey.’ The horse is racing along a glowing platonic gridded plane as seen in the Tron movies. The background is a ‘code waterfall’ effect as seen in the credit sequences of the Wachowskis’ ‘Matrix’ movies. Image: Cryteria (modified) https://commons.wikimedia.org/wiki/File:HAL9000.svg

I’ll be in Tucson, AZ from November 8–10: I’m the Guest of Honor at the TusCon science fiction convention.

AI’s ability to make — or assist with — important decisions is fraught: on the one hand, AI can often classify things very well, at a speed and scale that outstrips the ability of any reasonably resourced group of humans. On the other hand, AI is sometimes very wrong, in ways that can be terribly harmful.

Bureaucracies and the AI pitchmen who hope to sell them algorithms are very excited about the cost-savings they could realize if algorithms could be turned loose on thorny, labor-intensive processes. Some of these are relatively low-stakes and make for an easy call: Brewster Kahle recently told me about the Internet Archive’s project to scan a ton of journals on microfiche they bought as a library discard. It’s pretty easy to have a high-res scanner auto-detect the positions of each page on the fiche and to run the text through OCR, but a human would still need to go through all those pages, marking the first and last page of each journal and identifying the table of contents and indexing it to the scanned pages. This is something AI apparently does very well, and instead of scrolling through endless pages, the Archive’s human operator now just checks whether the first/last/index pages the AI identified are the right ones. A project that could have taken years is being tackled with never-seen swiftness.

The operator checking those fiche indices is something AI people like to call a “human in the loop” — a human operator who assesses each judgment made by the AI and overrides it should the AI have made a mistake. “Humans in the loop” present a tantalizing solution to algorithmic misfires, bias, and unexpected errors, and so “we’ll put a human in the loop” is the cure-all response to any objection to putting an imperfect AI in charge of a high-stakes application.
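To make the pattern concrete, here is a minimal sketch of the review step described above: the AI proposes a judgment, and a human either accepts it or overrides it. Every name in it (`Judgment`, `review`, `override_rate`, the page labels) is hypothetical and invented for illustration; it is not the Internet Archive's tooling or any real system's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Judgment:
    item_id: str
    ai_label: str                      # what the model proposed
    final_label: Optional[str] = None  # what is acted on after review
    overridden: bool = False           # True if the human changed the label


def review(judgment: Judgment,
           human_check: Callable[[Judgment], Optional[str]]) -> Judgment:
    """Apply one human decision to one AI judgment.

    human_check returns None to accept the AI's label,
    or a replacement label to override it.
    """
    correction = human_check(judgment)
    if correction is None or correction == judgment.ai_label:
        judgment.final_label = judgment.ai_label
    else:
        judgment.final_label = correction
        judgment.overridden = True
    return judgment


def override_rate(judgments: List[Judgment]) -> float:
    """Fraction of reviewed judgments that the human actually changed."""
    reviewed = [j for j in judgments if j.final_label is not None]
    if not reviewed:
        return 0.0
    return sum(j.overridden for j in reviewed) / len(reviewed)


# Hypothetical example: the AI guesses which scanned page starts a journal.
proposals = [Judgment("fiche-1", "page 3"), Judgment("fiche-2", "page 7")]

# A reviewer who corrects the second guess and accepts the first.
corrections = {"fiche-2": "page 8"}
checked = [review(j, lambda j: corrections.get(j.item_id)) for j in proposals]
```

In this sketch, a reviewer who accepts everything produces an override rate of zero, and nothing in the code can tell whether that reflects a flawless AI or a human who has stopped checking.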

But it’s not just AIs that are imperfect. Humans are wildly imperfect, and one thing they turn out to be very bad at is supervising AIs. In a 2022 paper for Computer Law & Security Review, the mathematician and public policy expert Ben Green investigates the empirical limits on human oversight of algorithms:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3921216

Green situates public sector algorithms as the latest salvo in an age-old battle in public enforcement. Bureaucracies have two conflicting, irreconcilable imperatives: on the one hand, they want to be fair, and treat everyone the same. On the other hand, they want to exercise discretion, and take account of individual circumstances when administering justice. There’s no way to do both of these things at the same time, obviously.

But algorithmic decision tools, overseen by humans, seem to hold out the possibility of doing the impossible and having both objective fairness and subjective discretion. Because it is grounded in computable mathematics, an algorithm is said to be “objective”: given two equivalent reports of a parent who may be neglectful, the algorithm will make the same recommendation as to whether to take their children away. But because those recommendations are then reviewed by a human in the loop, there’s a chance to take account of special circumstances that the algorithm missed. Finally, a cake that can be both had, and eaten!

For the paper, Green reviewed a long list of policies — local, national, and supra-national — for putting humans in the loop and found several common ways of mandating human oversight of AI.

First, policies specify that algorithms must have human oversight. Many jurisdictions set out long lists of decisions that must be reviewed by human beings, banning “fire and forget” systems that chug along in the background, blithely making consequential decisions without anyone ever reviewing them.

Second, policies specify that humans can exercise discretion when they override the AI. They aren’t just there to catch instances in which the AI misinterprets a rule, but rather to apply human judgment to the rules’ applications.

Next, policies require human oversight to be “meaningful” — to be more than a rubber stamp. For high-stakes decisions, a human has to do a thorough review of the AI’s inputs and output before greenlighting it.

Finally, policies specify that humans can override the AI. This is key: we’ve all encountered instances in which “computer says no” and the hapless person operating the computer just shrugs their shoulders apologetically. Nothing I can do, sorry!

All of this sounds good, but unfortunately, it doesn’t work. How humans in the loop actually behave has been thoroughly studied, and the findings have been published in reputable, peer-reviewed journals and replicated by other researchers. The measures for using humans to prevent algorithmic harms represent theories, and those theories are testable, and they have been tested, and they are wrong.

For example, people (including experts) are highly susceptible to “automation bias.” They defer to automated systems, even when those systems produce outputs that conflict with their own expert experience and knowledge. A study of London cops found that they “overwhelmingly overestimated the credibility” of facial recognition and assessed its accuracy at 300% better than its actual performance.

Experts who are put in charge of overseeing an automated system get out of practice, because they no longer engage in the routine steps that lead up to the conclusion. Presented with conclusions, rather than problems to solve, experts lose the facility and familiarity with how all the factors that need to be weighed to produce a conclusion fit together. Far from being the easiest step of coming to a decision, reviewing the final step of that decision without doing the underlying work can be much harder to do reliably.

Worse: when algorithms are made “transparent” by presenting their chain of reasoning to expert reviewers, those reviewers become more deferential to the algorithm’s conclusion, not less — after all, now the expert has to review not just one final conclusion, but several sub-conclusions.

Even worse: when humans do exercise discretion to override an algorithm, it’s often to inject the very bias that the algorithm is there to prevent. Sure, the algorithm might give the same recommendation about two similar parents who are facing having their children taken away, but the judge who reviews the recommendations is more likely to override it for a white parent than for a Black one.

Humans in the loop experience “a diminished sense of control, responsibility, and moral agency.” That means that they feel less able to override an algorithm — and they feel less morally culpable when they sit by and let the algorithm do its thing.

All of these effects are persistent even when people know about them, are trained to avoid them, and are given explicit instructions to do so. Remember, the whole reason to introduce AI is human imperfection. An AI that is designed to correct human imperfection, but that only works when its human overseer is perfect, produces predictably bad outcomes.

As Green writes, putting an AI in charge of a high-stakes decision, and using humans in the loop to prevent its harms, produces a “perverse effect”: “alleviating scrutiny of government algorithms without actually addressing the underlying concerns.” The human in the loop creates “a false sense of security” that sees algorithms deployed for high-stakes domains, and it shifts the responsibility for algorithmic failures to the human, creating what Dan Davies calls an “accountability sink”:

https://profilebooks.com/work/the-unaccountability-machine/

The human in the loop is a false promise, a “salve that enables governments to obtain the benefits of algorithms without incurring the associated harms.”

So why are we still talking about how AI is going to replace government and corporate bureaucracies, making decisions at machine speed, overseen by humans in the loop?

Well, what if the accountability sink is a feature and not a bug? What if governments, under enormous pressure to cut costs, figure out how to also cut corners, at the expense of people with very little social capital, and blame it all on human operators? The operators become, in the phrase of Madeleine Clare Elish, “moral crumple zones”:

https://estsjournal.org/index.php/ests/article/view/260

As Green writes:

The emphasis on human oversight as a protective mechanism allows governments and vendors to have it both ways: they can promote an algorithm by proclaiming how its capabilities exceed those of humans, while simultaneously defending the algorithm and those responsible for it from scrutiny by pointing to the security (supposedly) provided by human oversight.

Tor Books just published two new, free “Little Brother” stories: “Vigilant,” about creepy surveillance in distance education; and “Spill,” about oil pipelines and indigenous landback.

If you’d like an essay-formatted version of this post to read or share, here’s a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:

https://pluralistic.net/2024/10/30/a-neck-in-a-noose/#is-also-a-human-in-the-loop

Image:
Cryteria (modified)
https://commons.wikimedia.org/wiki/File:HAL9000.svg

CC BY 3.0
https://creativecommons.org/licenses/by/3.0/deed.en

Written by Cory Doctorow
