Designing at the Wrong Layer
AI interface design is not failing because designers lack skill. It is failing because designers are being asked to work at the wrong layer.
In 2019, a team at Microsoft did something quietly devastating. They took the field's best interaction design guidelines, the ones that had been settled practice for thirty years, and tested how well they translated to AI systems deployed in the wild.
What the study revealed were fundamental failures in how AI systems applied those guidelines: systems that couldn't explain what they could do, couldn't account for why they made a decision, and offered no recovery path when things went wrong. Nielsen's heuristics turned upside down.
The tempting diagnosis is to say "AI is new. We're still learning."
The uncomfortable truth is that the guidelines aren't unfit for purpose. They're failing because designers are being asked to work at the wrong layer of an AI system.
The assumption classical UI was built on
I'm going to use the wanky term "classical UI" to describe what most people recognise when they look at a screen on a device.
When Jakob Nielsen published his heuristics in 1994, he was describing a world that had one defining property: the designer controlled the surface being designed.
This seems obvious in retrospect: of course the designer controls the interface. What else would they control? But trace what that assumption actually enables and you start to see how much classical UI theory depends on it.
If you, the designer, control the surface, you can guarantee what happens when a user clicks here. You can specify the error state. You can inspect every branch of the interaction tree.
Thirty years of UI practice were built on what we know as the 'stack': information architecture, interaction design, usability testing. All of it depends on the designer controlling the full stack.
Three layers classical UI never needed
What AI interface design actually requires, once you stop trying to retrofit classical frameworks, is a set of design responsibilities that have no precedent in UI history. Three of them emerge from the research.
The pre-interaction layer
Caetano and colleagues identified a failure mode in conversational AI that classical UX had never needed to worry about: users arrive at the interface without the capacity to use it well. Not because they lack skill. Because natural language implies open-ended capability that the system doesn't have.
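One concrete response is to declare the system's actual capabilities and surface them before the first input, rather than letting natural language imply the system can do anything. A minimal sketch; the manifest shape and function names here are illustrative assumptions, not from any cited system:

```javascript
// Hypothetical capability manifest, authored by the design team rather
// than inferred from the model, shown before the first user input.
const capabilities = {
  supported: ['summarise documents', 'draft emails'],
  unsupported: ['place orders', 'access account data']
};

// Build the onboarding hint for an empty conversation, so users arrive
// with a calibrated sense of what the system can and cannot do.
function preInteractionHint({ supported, unsupported }) {
  return [
    `I can: ${supported.join(', ')}.`,
    `I can't: ${unsupported.join(', ')}.`
  ].join(' ');
}
```

The design decision here is that the unsupported list is as load-bearing as the supported one: it is what prevents users arriving with open-ended expectations.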
The temporal layer
Classical UI had fixed interfaces. A designer made decisions, they were implemented, the interface remained stable until someone deliberately changed it. Learning was a human activity. It happened outside the system.
AI systems learn, and this creates a tension: learn quickly from user behaviour, or adapt cautiously and stay predictable. The more aggressively a system adapts, the more disruptively the interface changes underneath its users.
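That tension can be made explicit as a gate on adaptation: a learned behaviour is only promoted into the interface once it has been observed consistently. A sketch, assuming a simple count-based threshold (the threshold value is an illustrative assumption, not a recommendation):

```javascript
// Hypothetical promotion gate: trade responsiveness for stability by
// requiring repeated evidence before the UI adapts.
const PROMOTION_THRESHOLD = 5; // assumed value; would be tuned per product

// observations: array of booleans, true each time the user repeated
// the candidate behaviour (e.g. reordered the same menu item).
function shouldAdapt(observations) {
  const consistent = observations.filter(Boolean).length;
  return consistent >= PROMOTION_THRESHOLD;
}
```

Raising the threshold makes the interface more stable but slower to personalise; lowering it does the reverse. Choosing that number is itself a design decision about the temporal layer.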
The structural layer
This is the hardest one to sit with, because it describes the boundary of design's actual authority.
Many of the most important design properties in AI interfaces—explainability, accurate intent sensing, appropriate calibration, recovery from unexpected outputs—are determined at the model layer, not the interface layer.
Code blocks and technical content
Technical essays need proper code block styling. Here's an example of how design tokens for AI confidence levels might be structured:
// Design token structure for AI confidence levels
const confidenceTokens = {
  high: {
    color: 'var(--ed-accent)',
    weight: 600,
    showIndicator: true
  },
  medium: {
    color: 'var(--ed-text-muted)',
    weight: 400,
    showIndicator: true
  },
  low: {
    color: 'var(--ed-border)',
    weight: 400,
    showIndicator: false
  }
}
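To show how tokens like these might be consumed, here is a sketch that maps a raw model confidence score to a tier. The 0.8 and 0.5 thresholds are illustrative assumptions; in practice they would come from calibration work at the model layer, not the interface layer:

```javascript
// Map a raw confidence score (0..1) to a token tier name.
// Thresholds are assumed for illustration only.
function confidenceTier(score) {
  if (score >= 0.8) return 'high';
  if (score >= 0.5) return 'medium';
  return 'low';
}

// Usage: confidenceTokens[confidenceTier(0.92)] selects the 'high' token set.
```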
Inline code needs the same care: the ed-prose class handles typography, while ed-aside creates callout boxes.
What the field is currently doing wrong
Three failure modes run through the research:
- Scope misalignment. Design is being applied at the screen level to problems that originate at the model level.
- Diversity failure. The design vocabulary for culturally sensitive AI interaction exists. In practice, it consistently isn't applied.
- Boundary ambiguity. There is currently no clear position on where design's authority ends and engineering's begins.
What design needs to do instead
None of this argues for design retreating from AI. It argues for design advancing into territory it hasn't yet claimed.
Practically, three things follow:
- Extend the remit upstream. The pre-interaction layer is a design responsibility.
- Treat the training pipeline as a design artefact. If the interface changes over time because the model is learning, then learning objectives are design decisions.
- Develop a deliberate strategy for the structural layer. Clear escalation paths for design requirements that can't be implemented at the interface level.
Sources
Amershi et al. (2019). Guidelines for Human-AI Interaction. CHI Conference on Human Factors in Computing Systems.
Caetano et al. (2025). Agentic Workflows for Conversational Human-AI Interaction Design.
Blümel & Jha (2023). Designing a Conversational AI Agent. Hawaii International Conference on System Sciences.