The seductive, science fictional power of spreadsheets

Maybe the map IS the territory?

Cory Doctorow
10 min read · Apr 29, 2023


A Lotus 1–2–3 spreadsheet with green-on-black, low-res type; its center has an irregular vignette revealing a space station.

This week, John Scalzi was kind enough to let me write a guest-editorial for his Whatever blog about the themes in my new crime technothriller, Red Team Blues; specifically, about the ways that spreadsheets embody the power and the pitfalls of science fiction at its best and worst:

https://whatever.scalzi.com/2023/04/26/the-big-idea-cory-doctorow-2/

Yes, spreadsheets. Marty Hench (the protagonist of Red Team Blues) is a 67-year-old forensic accountant who specializes in unwinding Silicon Valley financial frauds, a field he basically invented 40 years ago, when, as a PC-struck MIT dropout, he moved from Cambridge to San Francisco to recover the stolen millions hidden in spreadsheets.

Working through this book — and its two sequels, which travel back in time to the 1980s and Marty’s first encounters with VisiCalc and Lotus 1–2–3 — I was struck by the similarities between spreadsheets and science fiction.

While many people use spreadsheets as an overgrown calculator, adding up long columns of numbers, the rise and rise of spreadsheets comes from their use in modeling. Using a spreadsheet, a complex process can be expressed as a series of mathematical operations: we put these inputs into the factory and we get these finished goods. Once the model is built, we can easily test out contrafactuals: what if I add a third shift? What if I bargain harder for discounts on a key component? If I give my workers a productivity-increasing raise, will the profits make up for the costs?

These are the questions that anyone managing a complex system asks themselves all the time. Historically, the answers have sprung from intuition, from fingerspitzengefühl — the “fingertip feeling” of how a system’s components work and what their potential and limitations are. But intuition can calcify, become a rigid set of rules that increasingly diverge from the best strategy.

By contrast, spreadsheets yield a set of crisp, instantly tallied answers to any question you put to them. Change the input and watch as that change ripples through the whole system in an eyeblink. If you’re adding three more people to your camping trip, will the amount of additional water require renting another vehicle? No need to guess: just check and see.
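The camping-trip question can be sketched as a tiny model in Python. Every number here (days, litres per person, how much water one vehicle can carry) is a made-up assumption for illustration, as are the names `TripModel`, `water_needed` and `vehicles_needed`; the point is only that changing one input updates every downstream answer.

```python
import math
from dataclasses import dataclass


@dataclass
class TripModel:
    """A toy 'spreadsheet' for a camping trip. All defaults are hypothetical."""
    people: int
    days: int = 3
    litres_per_person_per_day: float = 4.0
    vehicle_capacity_litres: float = 60.0  # water one vehicle can carry (assumed)

    def water_needed(self) -> float:
        # Equivalent to one formula cell: people * days * daily ration
        return self.people * self.days * self.litres_per_person_per_day

    def vehicles_needed(self) -> int:
        # A downstream cell that depends on the one above
        return math.ceil(self.water_needed() / self.vehicle_capacity_litres)


baseline = TripModel(people=4)
what_if = TripModel(people=7)  # the "three more people" counterfactual

print(baseline.water_needed(), baseline.vehicles_needed())
print(what_if.water_needed(), what_if.vehicles_needed())
```

Under these assumed figures, four campers fit in one vehicle and seven need a second. Note that the answer is only as good as the guessed defaults, which is exactly the weakness the rest of this essay is about.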

This has a lot in common with science fiction, a genre full of thought experiments that ask Heinlein’s famous three questions:

  • What if?
  • If only, and
  • If this goes on…

These contrafactuals are incredibly useful and important. As critical tools, science fiction’s parables about the future are the best chance we have for resisting the inevitabilism that insists that technology must be used in a certain way, or must exist at all. Science fiction doesn’t just interrogate what the gadget does, but who it does it for and who it does it to:

https://pluralistic.net/2023/03/20/love-the-machine/#hate-the-factory

One of science fiction’s key methods comes from sf grandmaster Theodore Sturgeon: “ask the next question.” Ask a question, then ask “what happens next?” Do it again, and again, and again:

https://christopher-mckitterick.com/Sturgeon-Campbell/Sturgeon-Q.htm

This technique produces excellent, critical ways of interrogating technological narratives — check out this delightful example of the possible pipeline from self-driving cars to ransomware gangs to mutual aid societies to the reinvention of the train:

https://dduane.tumblr.com/post/715940904747352064/you-can-make-your-mercedes-ev-go-faster-for-60-a

The commonalities between sf and spreadsheets don’t stop there — sf and spreadsheets share pitfalls, too. A spreadsheet is a model and a model is not the thing it models. The map is not the territory. Every time a messy, real-world process is converted to a crisp, mathematical operation, some important qualitative element is lost.

Modeling is an intrinsically lossy operation. That’s why “all models are wrong, but some models are useful.” There is no process so simple that it can be losslessly converted to a model. Even the actions of the nanoscale transistors in a microchip, which toggle between “0” and “1,” are rarely in a state of “no voltage” or “voltage.” That clean, square-wave line that’s used to describe what happens in a chip is a lie — that is to say, it is a model.

The wave isn’t square, it’s a squiggly line that hovers around zero and around one. Under normal circumstances, “zero” and “zero-ish” is a distinction without a difference. But when computers go wrong, it’s sometimes because a sufficiently ambiguous “zero-ish” acts like a “one.” That’s true all the way up the stack. On engineering diagrams, the nanoscale lines that electrons travel along inside a chip are represented as sharp paths, the kind of thing a Tron-cycle would lay down. But in the real world, we get all kinds of weird effects at that scale — electrons sometimes tunnel through those lines, performing a spooky quantum trick that reminds us that Newtonian physics are also just a model.
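Here is a minimal sketch of that gap between the square-wave model and the squiggly signal. The signal levels are normalized so that 0.0 means "zero" and 1.0 means "one", and the thresholds and sample values are invented for illustration; they are not taken from any real chip.

```python
# Hypothetical thresholds on a normalized 0.0-1.0 signal level.
LOW_MAX = 0.3   # at or below this, we are comfortable calling it "0"
HIGH_MIN = 0.7  # at or above this, we are comfortable calling it "1"


def square_wave_model(level: float) -> int:
    """The clean model: every reading is forced to exactly 0 or 1."""
    return 1 if level >= 0.5 else 0


def cautious_read(level: float) -> str:
    """Keeps the information the clean model throws away."""
    if level <= LOW_MAX:
        return "0"
    if level >= HIGH_MIN:
        return "1"
    return "ambiguous"


# Made-up readings that hover around zero and one, with two in the murky middle.
samples = [0.03, 0.12, 0.48, 0.53, 0.91]

clean = [square_wave_model(v) for v in samples]
honest = [cautious_read(v) for v in samples]

print(clean)   # [0, 0, 0, 1, 1]
print(honest)  # ['0', '0', 'ambiguous', 'ambiguous', '1']
```

The readings 0.48 and 0.53 are nearly identical, yet the clean model files one under "0" and the other under "1" with equal confidence.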

Every real-world phenomenon contains qualitative and quantitative elements, but computers can only do math on the quantitative parts. This creates a powerful temptation to incinerate the qualitative and perform operations on whatever dubious quantitative residue is left in the crucible, often with disastrous results.

Remember during lockdown, when a pair of University of Illinois at Urbana-Champaign physicists produced a model of covid spread that predicted that the campus could safely reopen, predicting no more than 500 cases over the entire semester and no more than 100 cases at any one time? The physicists were openly contemptuous of their epidemiologist peers, saying that this kind of model making lacked the “intellectual thrill” of real science.

UI was so swayed by the crisp, precise model that they invited students back to campus — only to shut down again in a matter of weeks, with 780 active cases on campus and more rolling in every day.

The model reduced qualitative factors — like the propensity of undergrads to get drunk, take off their masks, and lick each others’ eyeballs — to a quantitative probability, using the highly precise, scientific technique of taking a wild-ass guess. That guess was wrong. The campus reopening was a super-spreader event.

Any model runs the risk of hiding the irreducible complexity of qualitative factors behind a formula, turning uncertainty into certainty and humility into arrogance.

Think of how we replaced contact tracing with exposure notification. Contact tracing has a qualitative foundation: public health workers establish rapport with infected people, win their trust, and get them to fully enumerate the places they’ve been and the activities they participated in.

By contrast, exposure notification measures whether two Bluetooth radios were within range of each other for a predetermined interval. It substitutes signal strength for a person’s own understanding of their experience. Now, people can be wrong about their own experience — we lose track of time, we misremember emotionally charged events, and so on — but that doesn’t mean we can substitute Bluetooth measurements for personal experience.

That’s why, despite all the clever privacy-preserving math and interesting analysis, exposure notification was a bust, something between a distraction and a false-confidence-generating disaster. Contact tracing ended the 2014 Ebola outbreak. Exposure notification just wasted a lot of time:

https://locusmag.com/2021/05/cory-doctorow-qualia/

It’s just too easy to forget which parts of a model are based on guesses and which parts are based on ground truth. And even if you can keep track of those differences, it’s even harder to re-check the model’s ground truth to determine whether the underlying factors have changed. That’s how we got into so much trouble with collateralized debt obligations, which were supposed to be “risk-free” mortgage derivatives that could be safely insured and invested in.

The formulas behind CDO hedging were designed by some of the world’s smartest mathematicians and physicists, who simply assumed that market actors — from loan-originating bank officers to insurance underwriters — would act in reliable, predictable ways. They were so very wrong that they brought the world economy to the brink of ruin:

https://www.wired.com/2009/02/wp-quant/

This is also science fiction’s failure-mode: any science fictional “ask-the-next-question” exercise represents a series of guesses or speculations or maybe possibilities — but when you combine that guesswork with the deceptive certainty that comes from inhabiting a cracking story, it’s easy to mistake “guessing” for “prediction.”

Prediction is hard, especially about the future. The assumptions that go into a prediction are always incomplete, not least because human beings have free will and agency and can change the circumstances that go into the assumptions. The very best science fiction embodies this principle. I’m thinking here of the likes of Ada Palmer, an historian and sf writer whose deep historical knowledge informs her sf and her pedagogy at the University of Chicago:

https://pluralistic.net/2022/02/10/monopoly-begets-monopoly/#terra-ignota

Palmer is famous — even notorious — for her annual four-week undergraduate LARP in which students re-enact the election of the Medicis’ Pope. It’s four weeks of alliances, betrayal and skullduggery by the students, each of whom is enacting the agenda of a real-world Cardinal or other power-broker.

The final investiture is done in full costume at the university’s massive faux-gothic cathedral, and going into that climax, of the four candidates, two are always the same, because the great forces of history are bearing down on that moment to ensure that the champions of the two dominant power-blocs are in the running. But the other two? They’re never the same — because the agency of the actors jockeying for power changes the outcome, every single time, in absolutely unpredictable ways.

Like any other model, sf is wrong, but sometimes useful. Thinking about jetpacks and flying cars is “useful” insofar as it gets us to interrogate how we think about cities, about mobility, about privilege and geography. But it’s not a prediction. Worse, the endless tales in which flying cars are presented as a fait accompli are a gift to grifters raising money for the objectively stupid idea of flying cars. After all, we all know flying cars are inevitable, so it’s basically a risk-free investment, right? With flying cars just around the corner, wouldn’t it be irresponsible to build a city with mass-transit instead of helipads?

There’s a whole range of thought-experiments that got transformed into predictions and then certainties: self-driving cars, “general artificial intelligence,” infinite life-extension, space colonization, faster-than-light travel, cryptocurrency, etc etc.

Spreadsheets don’t just lead their users astray — they also trick their creators. The very same people who transform wild-assed guesses about hairy, unknowable outcomes into neat mathematical relationships are perfectly capable of acting as if those relationships are based on fact, rather than supposition. The Great Financial Crisis wasn’t just about people who didn’t understand the uncertainty in the hedging algorithm going all-in — the people who made those models were also fooled by them.

It’s very easy to get high on your own supply. I’ll never forget the sf convention panel I was on with Robert Silverberg about sf’s supposed predictive value, where the subject of Robert A Heinlein came up, and Silverberg sniffed, and, in that trademark bone-dry way of his, said, “Ah yes, ‘Robert A Timeline.’”

Sf isn’t just full of writers who mistake their suppositions for predictions — the canon is full of tales in which brilliant people can and do predict the future, with near-perfection. Think of Hari Seldon, the hero of Asimov’s Foundation series, who is able to forecast the future several millennia out. Or Heinlein’s first-ever story, “Life-Line,” in which a genius inventor destroys the insurance industry by creating a computer that can predict your exact date of death using statistical methods.

There’s something wild about this phenomenon, in which writers make stuff up and then assume that anything that cool must also be accurate. One tantalizing explanation for this comes from EL Doctorow’s (no relation) essay “Genesis,” from his 2007 collection “The Creationists”:

https://www.penguinrandomhouse.com/books/41520/creationists-by-e-l-doctorow/

Doctorow tells the history of the Genesis story, which the Hebrews plagiarized from the Babylonians. In Doctorow’s telling, the Babylonian mystics who made up the Genesis story assumed that it had to be true, because they considered themselves to be nowhere near imaginative enough to have come up with something as great as Genesis. An idea that amazing had to be divinely inspired.

I like this because it’s a story of being led astray by humility, rather than hubris.

Imaginative exercises — whether or not they are assisted by mathematical models and self-updating digital spreadsheets — are powerful tools for thinking about the future we want, and for guiding our attempts to make that future come true. All models are wrong but some models are useful, of course!

I’m on tour with Red Team Blues now — I’m writing this post while waiting for my flight to San Francisco, where I’m appearing at the public library with Annalee Newitz tomorrow (4/30) at 2PM:

https://sfpl.org/events/2023/04/30/author-cory-doctorow-and-annalee-newitz-conversation-red-team-blues

One especially fun stop on this tour will be on May 5, at the Books, Inc in Mountain View, where I’ll be talking about the book with Mitch Kapor, the creator of Lotus 1–2–3, who knows a thing or two about spreadsheets:

https://www.booksinc.net/event/cory-doctorow-books-inc-mountain-view

The tour is bringing me to Berkeley, Vancouver, Calgary, DC, Gaithersburg, Toronto, PDX, Nottingham, Hay, London, Manchester, Edinburgh and Berlin — I hope to see you!

https://craphound.com/novels/redteamblues/2023/04/26/the-red-team-blues-tour-burbank-sf-pdx-berkeley-yvr-edmonton-gaithersburg-dc-toronto-hay-oxford-nottingham-manchester-london-edinburgh-london-berlin/

Catch me on tour with Red Team Blues in Mountain View, Berkeley, San Francisco, Portland, Vancouver, Calgary, Toronto, DC, Gaithersburg, Oxford, Hay, Manchester, Nottingham, London, and Berlin!

If you’d like an essay-formatted version of this post to read or share, here’s a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:

https://pluralistic.net/2023/04/29/gedankenexperimentwahn/#high-on-your-own-supply

Cory Doctorow (craphound.com) is a science fiction author, activist, and blogger. He has a podcast, a newsletter, a Twitter feed, a Mastodon feed, and a Tumblr feed. He was born in Canada, became a British citizen and now lives in Burbank, California. His latest novel is Red Team Blues, a grabby anti-finance finance thriller about a cryptocurrency heist. His latest nonfiction book is Chokepoint Capitalism (with Rebecca Giblin), a book about artistic labor markets and excessive buyer power. His latest short story collection is Radicalized. His latest picture book is Poesy the Monster Slayer. His latest YA novel is Pirate Cinema. His latest graphic novel is In Real Life. His forthcoming books include The Internet Con, a nonfiction book about the swiftest, most effective way to shatter Big Tech’s grip on the internet and seize the means of computation (Verso, September 2023); and The Lost Cause, a utopian post-GND novel about truth and reconciliation with white nationalist militias (Tor, November 2023).
