Against User Error: Introducing the SUI Framework
We spend a lot of time working with open source projects, many of them in security- and privacy-conscious spaces. One of the recurring challenges in this kind of work is that you don’t have the same usage data a lot of commercial products rely on. There aren’t clean funnels, large-scale behavioural metrics, or dashboards telling you where people are dropping off or making mistakes. And even when some of that data exists, it’s often partial, limited, or opt-in.
Yes, you talk to people, read through issue trackers, and engage with the community. But you still need a way to drill down into the interface itself. To look at a screen, a flow, or even a single interaction and say, with some level of confidence, “this is where the problem is.”
One of the ways we’ve managed to do this over the years is through heuristic audits. There are different heuristic frameworks you can use for that kind of work, but the Nielsen Norman heuristics are probably the best known and have been around for decades. They don’t just give you a way to talk about usability problems, they define a standard for what good usability is supposed to look like. You take those ten heuristics, test real screens and journeys against them, and where something falls short, it gets flagged.
But when we started working on more security-sensitive systems, it began to feel like usability alone wasn’t enough. You could have something that would pass a conventional usability audit and still create situations where people behave in unsafe ways. A warning that gets ignored. An action whose consequences aren't fully clear. A flow that encourages people to rush or guess. From a usability point of view, nothing may be obviously broken. From a security point of view, the interface is still creating risk.
Instead of starting with a theory of what secure UX should look like, we started with the failures themselves — looking at real mistakes and asking what conditions made them possible. We’ve been applying that way of thinking across some of the projects we work on, and out of that a new framework has started to emerge: SUI — Secure User Interface — a set of heuristics for identifying where interfaces tend to fail at the human level.
The framework is organised around three pillars: cognitive, inclusive, and technical. These aren’t a checklist or a scoring system. They’re a way of categorising the kinds of failures we’re looking for. Each pillar contains a set of specific heuristics: named, defined failure conditions we look for during an audit.
| Pillar | Description |
| --- | --- |
| Cognitive | Examines how mental load, attention, and decision-making affect security. Identifies risks like Alarm Fatigue, where users desensitize to overused urgency signals. |
| Inclusive | Ensures the system is safe and comprehensible for users with different abilities. Audits for Semantic Integrity, ensuring that what is presented visually matches the underlying code. |
| Technical | Evaluates how system behaviors (like errors, latency, and state changes) are exposed through the interface. |
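To make that structure concrete, here's a minimal sketch of how the framework's vocabulary might be modelled in a lightweight audit tool. The pillar names are the framework's own, and the heuristic IDs follow the C-2 / I-5 / T-8 style used below; the types, fields, and example finding are our own hypothetical illustration, not an official schema.

```typescript
// A hypothetical data model for recording SUI audit findings.
// The pillars and heuristic IDs come from the framework; the shape
// of these types is invented for this sketch.

type Pillar = "cognitive" | "inclusive" | "technical";

interface Heuristic {
  id: string;               // e.g. "C-2"
  name: string;             // e.g. "Afford Criticality"
  pillar: Pillar;
  failureCondition: string; // what the auditor is looking for
}

interface Finding {
  heuristic: Heuristic;
  location: string;  // the screen, flow, or interaction audited
  evidence: string;  // what was actually observed
  severity: "low" | "medium" | "high";
}

const affordCriticality: Heuristic = {
  id: "C-2",
  name: "Afford Criticality",
  pillar: "cognitive",
  failureCondition:
    "Urgency signals are overused, so real severity no longer registers",
};

const finding: Finding = {
  heuristic: affordCriticality,
  location: "Perimeter alarm console",
  evidence: "Alarm fires daily on false positives; staff no longer respond",
  severity: "high",
};

console.log(`${finding.heuristic.id} ${finding.heuristic.name} @ ${finding.location}`);
```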
A brief aside… Enter Steven Michael Casey
While pulling this together I came across a book that turned out to be a perfect, if unlikely, companion text.
Set Phasers on Stun (not to be confused with the 50-year Star Trek retrospective) is a collection of stories about real technological disasters written in the style of, I don't know, airport fiction. Radiation overdoses, ships running aground, stock market chaos. It's kinda morbid and a bit hokey at times, but it's a lot of fun. Casey's approach is to reconstruct each incident as a short scene and then more or less leave you to draw your own conclusions, which is either a principled pedagogical choice or a way of avoiding having to explain anything, depending on how charitable you're feeling. In the prologue he's pretty explicit about it: "it is the driver of the car, not the passenger in the back seat, who learns his way."
Yeah... alright.
Look, it's not especially rigorous. But the stories are broadly true, and I find the whole enterprise rather charming. So, to spice up this dry-as-a-bone explanation of a security-critical heuristic analysis, I'm gonna use a few of Casey's examples — one from each pillar — to show how the SUI framework should be used in practice. Each story is a case study in system failure. The SUI heuristics are how we name what caused it.

The Cognitive pillar
The cognitive heuristics are concerned with how mental load, attention, and decision-making affect security outcomes — where complexity or ambiguity causes users to rush, guess, or misunderstand what the system is doing.
Casey's Never Cry Wolf chapter opens on “a pleasant mid-July morning outside the medium-security women's prison in Oregon,” where Diane Downs is standing in the recreation yard, eyeing the two fences between her and the outside world. She's already figured out her way through. The motion detectors on the second fence had been triggering constantly for weeks — birds, wind in the weeds, anything. “Hardly a day had gone by in the last week that the alarm had not sounded.” So when Downs climbed the first fence, crossed the median, and triggered the sensor on the second, the alarm went off exactly as designed. “The deafening blare from the speakers washed over the prison yard.” And the guards, conditioned by weeks of false alerts, didn't move. Casey then ends the chapter with the kind of sentence only he would write: "Diane Downs, incarcerated for shooting her three children, walked calmly across the grass to freedom."
...excuse me?
Casey constructs the entire chapter like a little escape thriller and then slots in the detail about the three children in the final line, as a subordinate clause, like it's a minor biographical note. It's the literary equivalent of The Shawshank Redemption ending with Andy and Red embracing on the beach and then a caption popping up saying that both of their names were found in the Epstein files. Roll credits.
C-2 — Afford Criticality asks whether an interface makes the real severity of a situation clear through the way it presents and escalates warnings. Put simply, when everything is urgent, nothing is. If a system sounds the alarm too often, or for things that don’t matter, users stop treating its warnings as meaningful. In a security context that matters because alarm fatigue doesn’t just create annoyance, it trains people to ignore genuine risk. What happened in that yard was the end point of a much longer failure. The alarm had been sounding for weeks, and the guards had learned exactly what the system taught them: not to trust it.
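As a side note, the same failure condition is easy to picture in software. Here's a deliberately simple sketch, entirely our own invention rather than anything from the framework text or from Casey, of a notifier that rations urgency: low-severity events are batched into a digest so the interruptive channel keeps its meaning.

```typescript
// Illustrative only: a notifier that reserves the loud channel for
// genuine emergencies. Low-severity events are counted and surfaced
// in digest form, so they never train users to tune the alarm out.

type Severity = "info" | "warning" | "critical";

class CriticalityAwareNotifier {
  private suppressed = new Map<string, number>();

  notify(key: string, severity: Severity, message: string): void {
    if (severity === "critical") {
      // The interruptive channel is reserved for real emergencies.
      console.log(`CRITICAL: ${message}`);
      return;
    }
    // Everything else is batched rather than blared.
    this.suppressed.set(key, (this.suppressed.get(key) ?? 0) + 1);
  }

  digest(): string[] {
    const lines = Array.from(this.suppressed.entries()).map(
      ([key, count]) => `${key}: ${count} event(s) since last digest`
    );
    this.suppressed.clear();
    return lines;
  }
}

const notifier = new CriticalityAwareNotifier();
notifier.notify("zone-2-motion", "warning", "Motion on fence sensor");  // batched
notifier.notify("zone-2-motion", "warning", "Motion on fence sensor");  // batched
notifier.notify("zone-2-breach", "critical", "Both fences tripped in sequence"); // loud
console.log(notifier.digest()); // ["zone-2-motion: 2 event(s) since last digest"]
```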
The Inclusive pillar
The inclusive heuristics focus on whether a system remains understandable and usable across different abilities, tools, and contexts. If users can’t perceive a warning, interpret a signal, or operate a control, they can’t act safely. What looks like an accessibility problem is, in practice, also a security problem.
Casey's Silent Warning chapter follows a doctor driving through the night to a clinic in northern Iraq, piecing together the conditions that led to a mass poisoning. A shipment of wheat had been treated with methylmercury fungicide to prevent spoilage — standard practice, genuinely dangerous. “The label contained a strong warning that the grain could not be consumed or milled into flour, but it was printed in English, not Arabic. A large image of a skull and crossbones sat above the text to emphasize the point.” The Kurdish farmers receiving the grain didn't read English. Most didn't read Arabic either. “The large skull and crossbones symbol on each tag was nothing more than a peculiar piece of art work.” The warning existed, technically. In practice it communicated nothing. By the end of the chapter Casey takes us to the clinic: “He walked quietly to one and gently opened her contorted and now useless hands. Faded traces of red dye lined the creases of her palms.”
…What?
Casey writes this like he's describing someone who got a bit of paint on them doing a craft project. He ends the chapter on a single image of a woman's hands and then just stops. No death toll, no outrage, no summary. The chapter covers a mass poisoning — thousands of people, many of them dead — and he closes on the red dye in the creases of her palms. It's a goddamned fatality screen and he delivers it like a weather report.
I-5 — Semantic Integrity asks whether the signals an interface uses to communicate actually work for the people receiving them. When they don’t, warnings fail, controls are misunderstood, and safety-critical information doesn’t reach the people who need it. The designers assumed a shared language: that the skull and crossbones would be understood, that the label would be legible. They were wrong on both counts.
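The same question scales down from grain sacks to everyday interfaces. As a hypothetical sketch (the element model and the check are invented for illustration), an I-5 audit might ask whether what an interface announces, to a screen reader or in another locale, matches what it shows:

```typescript
// A hypothetical I-5 check. The simplified element model is invented
// for this sketch; a real audit would inspect the accessibility tree.

interface UiElement {
  visualLabel: string;    // what sighted users see
  accessibleName: string; // what assistive technology announces
  localized: boolean;     // has the label been translated for this user?
}

function semanticIntegrityIssues(el: UiElement): string[] {
  const issues: string[] = [];
  if (el.accessibleName.trim() === "") {
    issues.push("No accessible name: the control is silent to a screen reader.");
  } else if (el.accessibleName !== el.visualLabel) {
    issues.push("Visual label and accessible name diverge: users see one thing and hear another.");
  }
  if (!el.localized) {
    issues.push("Label not localized: like the skull on the grain sacks, it may read as decoration.");
  }
  return issues;
}

console.log(semanticIntegrityIssues({
  visualLabel: "Delete forever",
  accessibleName: "Button", // announced, uselessly, as just "Button"
  localized: false,
}));
```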
The Technical pillar
The third pillar is technical — but not in the usual sense.
The technical heuristics aren't about whether the system is secure at the code level. They're about whether the interface accurately reflects what the system is doing — whether errors are recoverable, whether state is visible, whether the user has what they need to act correctly.
I saved the best for last: this final example is the one that gives the book its title, Set Phasers on Stun. A radiotherapy technician named Mary Beth makes a small typing error — hits “x” instead of “e” — and corrects it immediately using the edit function. Routine. Except that particular sequence of keystrokes, entered in under eight seconds, had never been tested. It retracted the metal plate that converts the beam to a safe x-ray, but left the power on maximum. “Her computer screen showed that the machine was in the necessary ‘electron beam’ mode, but it was actually now in a debased operating setting.” She fired the beam. Her screen then displayed “Malfunction 54,” which she interpreted as meaning the treatment hadn’t been administered. So she reset the machine and tried again. Ray Cox, lying on the table, was hit three times. “Before his death four months later, Ray Cox maintained his good nature and humor, often joking in his east Texas drawl that ‘Captain Kirk forgot to put the machine on stun.’”
The whole scene is almost farcical — Ray on the table, Mary Beth behind the glass, the machine firing, nobody knowing what's happening. And then Casey sneaks "before his death four months later" into the opening clause of the final paragraph, like a footnote, and moves straight on to the references. The man died tragically... Anyway.
T-8 — Reconstruct Failure States asks whether an error gives the user enough information to recover safely. If it doesn’t, people fall back on guesswork: they retry, improvise, or reach for workarounds. “Malfunction 54” told Mary Beth nothing about what had gone wrong, what the machine had actually done, or what a safe next step looked like. So she did the only thing that made sense with the information she had: she tried again.
T-9 — Communicate State Transition asks whether the system makes its own actions visible. When a system changes state, the user needs to know that it happened. Here, the machine had fired, but the interface suggested it hadn’t. The action was real, but the feedback was false. Together, those two failures removed the safer path — stopping, escalating, not retrying — and made the unsafe response the only rational one.
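To put both heuristics in implementation terms, here's a sketch of our own, not anything from the actual incident record, contrasting an opaque error code with a report that reconstructs the failure state and names the transition that actually occurred:

```typescript
// A hypothetical contrast: an opaque error code versus a report
// that satisfies T-8 (reconstruct the failure) and T-9 (make the
// state transition visible).

interface FailureReport {
  whatHappened: string;  // T-8: enough information to understand the failure
  currentState: string;  // T-9: the state transition that actually occurred
  safeNextStep: string;  // T-8: a recovery path that isn't a guess
}

// The opaque version invites retrying as the only visible "fix".
function opaqueError(code: number): string {
  return `Malfunction ${code}`;
}

// The reconstructed version makes stopping and escalating the obvious move.
function reconstructedError(): FailureReport {
  return {
    whatHappened: "Beam fired in an unverified mode",
    currentState: "One dose has already been delivered; machine state is unconfirmed",
    safeNextStep: "Do not retry. Lock out the machine and escalate.",
  };
}

console.log(opaqueError(54));
console.log(reconstructedError());
```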
Beyond user error
The SUI framework is designed to be a complement to research, security testing, and code review, not a substitute for them. What it does is give you a structured way to look at an interface and ask whether the conditions for a predictable human failure are already in place.
Because when those conditions are there and someone fails, the instinct is to call it user error and move on. It's a kind of institutional gaslighting — the design created the conditions for failure, the user failed, and then the finger points at the user. “You were warned.” “You should have been more careful.”
The reason that instinct is so persistent is that it's cheap. It closes the ticket. It explains the incident without requiring anyone to look at the interface again. The guards weren't paying attention. The farmers should have asked someone. Mary Beth should have stopped and escalated. Each of those explanations is technically available, and each of them lets the design off the hook.
What the SUI framework tries to do is make that explanation harder to reach for. When you can't say "the user should have known better," you have to ask what the interface taught them. That's a harder question, but it's one you can actually do something about. It's also the one that puts the responsibility back where it belongs: with the people who built the thing.