Journal · May 20, 2026 · 8 min read

20 AI design terms every product designer needs to know.

AI design is producing new interaction patterns faster than it is producing names for them. Here are twenty terms to close some of that gap.

Human-AI design · Glossary · Reference

Design disciplines generate vocabulary as they mature. Visual design gave us the grid, the hierarchy, the gestalt principles. UX gave us the mental model, the affordance, the friction. Service design gave us the blueprint, the touchpoint, the backstage. Each of these terms did more than label an idea; each created a shared referent that let practitioners argue, teach, and build standards around something that previously had no name.

AI design is producing new interaction patterns faster than it is producing names for them. Designers working on AI products are making dozens of decisions every week about things their field has not yet named, which makes those decisions harder to discuss in reviews, harder to critique consistently, and harder to transfer across teams. The result is what I would call terminology lag: the gap between when a new pattern becomes common enough to design against and when the field develops stable vocabulary for it.

This glossary is a starting point for closing that gap. Twenty terms, weighted toward the patterns that show up most frequently in real AI product work and that most consistently lack a shared name in the teams I have seen working on them.

A note on sourcing: some of these terms are already circulating in the industry and I have borrowed and stabilised the definitions. Others are naming conventions I have developed from practice. Where terminology is genuinely contested or emerging, I have picked the most useful framing rather than the most common one.

Terms for how users input and the AI responds.

Prompt. The input a user provides to an AI system to initiate a response or action. Prompts can be conversational ("help me write this email"), instructional ("summarise the following in three bullet points"), or contextual (a file or screenshot attached alongside a text request). Prompt design — the craft of writing and structuring prompts to get reliable, high-quality outputs — is increasingly its own discipline.

Elicitation model. A clarification interface the AI surfaces when it needs the user to make a choice before it can proceed. Elicitation models typically present multiple-choice options rather than open text fields, because the AI has already inferred the likely range of valid responses. They are distinct from open follow-up questions in that the AI is narrowing the decision space rather than asking the user to supply information from scratch.

Hint chips. Short suggestion labels that appear in the interface after an AI response, offering the user a set of follow-up actions they might want to take or questions they might want to ask. Hint chips reduce the cognitive load of knowing what to ask next and keep users moving through a workflow without having to formulate every next step from scratch. They also function as a form of interface signposting: a way of revealing what the AI can do without requiring the user to discover it through trial and error.

Streaming response. An AI output that renders incrementally as the model generates it, rather than appearing all at once when processing is complete. Streaming significantly reduces perceived latency and allows the user to begin reading and evaluating the response before it finishes generating. From a design perspective, streaming requires decisions about how to handle in-progress states: whether to allow interruption, how to signal that generation is ongoing, and how to handle cases where the user wants to act on an incomplete output.
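
The in-progress states described above can be sketched in a few lines of Python. This is a minimal illustration, not a real integration: `stream_response` is a hypothetical stand-in for a model API, and `render` and `should_stop` are names I have made up for the example.

```python
from typing import Callable, Iterable, Iterator


def stream_response(text: str) -> Iterator[str]:
    """Stand-in for a model API: yields the final output one word at a time."""
    for word in text.split():
        yield word + " "


def render(chunks: Iterable[str], should_stop: Callable[[str], bool]) -> dict:
    """Accumulate chunks as they arrive and honour a user interruption.

    Returns the text shown so far plus a state the UI can label:
    "complete" or "interrupted".
    """
    shown = ""
    for chunk in chunks:
        if should_stop(shown):
            # The user acted on an incomplete output: keep what was rendered.
            return {"text": shown.strip(), "state": "interrupted"}
        shown += chunk
    return {"text": shown.strip(), "state": "complete"}


full = render(stream_response("Streaming lets users read early"), lambda s: False)
partial = render(
    stream_response("Streaming lets users read early"),
    lambda s: len(s.split()) >= 2,  # hypothetical: user presses stop after two words
)
```

Here `full` ends in the "complete" state, while `partial` stops after two words and is marked "interrupted", which is the state the interface then has to explain to the user.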

Thought display. A real-time view of the AI's reasoning or working process before or alongside the final output. It is distinct from streaming response: streaming shows the final output as it generates, whereas thought display shows the intermediate steps the AI is taking to get there, such as which files it is reading, which sub-questions it is resolving, or which tools it is invoking. Claude's extended thinking is one of the more fully realised implementations of thought display currently available.

Direct output. An AI response mode in which only the final result is presented, with no visible working process or intermediate steps. The design trade-off relative to thought display is speed and simplicity against transparency: users see less about how the answer was produced, which reduces cognitive load but also reduces the information they have to evaluate whether to trust it.

Terms for how AI behaves and where it fails.

Hallucination. AI-generated output that is plausible in form and confident in tone but factually incorrect. Hallucinations occur because language models generate text by predicting probable continuations of patterns, not by retrieving verified facts from a fixed knowledge base. From a design perspective, hallucination is a limit state that should be designed against explicitly: the question is not whether your AI will hallucinate, but what happens to the user experience when it does.

Confidence signal. A visual or textual indicator of how certain an AI is about a given output. Confidence signals range from explicit (a percentage score or a "low confidence" label) to implicit (hedging language like "I believe" or "you may want to verify"). Designing confidence signals well is one of the highest-leverage trust interventions available: under-signalling makes AI seem more reliable than it is, while over-signalling erodes trust in every response, including the accurate ones.
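
One way to picture the under-signalling and over-signalling trade-off is as a set of thresholds. The sketch below assumes the system exposes a confidence score between 0 and 1, which many models do not provide in calibrated form; the function name and the threshold values are illustrative assumptions, not recommendations.

```python
def confidence_label(score: float) -> str:
    """Map a confidence score in [0, 1] to a user-facing signal.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.85:
        return ""  # no signal: avoid hedging responses that are likely accurate
    if score >= 0.5:
        return "You may want to verify this."
    return "Low confidence: please check this against a source."
```

Raising the top threshold pushes the product toward over-signalling; lowering it pushes toward under-signalling. Where those lines sit is a design decision as much as an engineering one.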

Grounding. The process of connecting AI outputs to verified external sources to reduce the risk of hallucination and improve the traceability of claims. A grounded response does not just make an assertion; it points to the source the assertion comes from. Retrieval-augmented generation (RAG) is the most common architectural approach to grounding: the system retrieves relevant documents before generating a response, so the output is anchored to real material rather than generated from pattern alone.

Source attribution. The surfacing of references or citations alongside AI-generated claims, so the user can verify the basis for an assertion rather than taking it on trust. Source attribution is closely related to grounding but is a UX decision as much as an architectural one: which sources to show, how much of the source to surface, and how to present it without overwhelming the response.
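
Grounding and source attribution can be sketched together. The example below is a toy under stated assumptions: the retriever ranks documents by simple word overlap, where a production system would more likely use a search index or embeddings, and `Document`, `retrieve`, and `build_grounded_prompt` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Rank documents by word overlap with the query (a toy retriever)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d.text)), d) for d in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0][:k]


def build_grounded_prompt(query: str, sources: list[Document]) -> str:
    """Place retrieved material ahead of the question, numbered for citation."""
    lines = [f"[{i}] {d.title}: {d.text}" for i, d in enumerate(sources, start=1)]
    return "Answer using only these sources:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


corpus = [
    Document("Refund policy", "Refunds are issued within 14 days of purchase."),
    Document("Shipping", "Orders ship within two business days."),
]
question = "How many days do refunds take?"
sources = retrieve(question, corpus, k=1)
prompt = build_grounded_prompt(question, sources)
attribution = [d.title for d in sources]  # what the UI could show as citations
```

The `prompt` is the architectural half (grounding); the `attribution` list is the half the designer owns, since it is what the interface chooses to show the user.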

Guardrail. A constraint built into an AI system to prevent it from producing outputs that fall outside a defined scope, whether that scope is defined by safety, legal compliance, brand standards, or product relevance. Guardrails can be implemented at the model level, the system prompt level, or as a post-processing filter on outputs. From a design perspective, guardrail activation — the moment when a user's request hits a constraint — is a limit state that almost always needs a designed response: what does the AI say when it cannot do what the user asked?
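
As a rough sketch of the post-processing variant, the example below swaps an out-of-scope draft for a designed response. It assumes the topic has already been identified by some upstream classifier, which is not shown; `BLOCKED_TOPICS`, `GuardedResponse`, and `apply_guardrail` are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Hypothetical out-of-scope topics for a customer support assistant.
BLOCKED_TOPICS = {"medical", "legal"}


@dataclass
class GuardedResponse:
    text: str
    guardrail_hit: bool


def apply_guardrail(draft: str, topic: str) -> GuardedResponse:
    """Post-processing filter: replace out-of-scope drafts with a designed response."""
    if topic in BLOCKED_TOPICS:
        return GuardedResponse(
            text=(
                f"I can't help with {topic} questions here, "
                "but I can help with your account or orders."
            ),
            guardrail_hit=True,
        )
    return GuardedResponse(text=draft, guardrail_hit=False)
```

The `guardrail_hit` flag is there so the interface can treat the moment differently from a normal reply, for example by offering an alternative route rather than a dead end.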

Terms for how AI acts in the world.

Agentic loop. A sequence of AI-initiated actions taken autonomously across multiple steps to complete a complex task, with each step informing the next. In an agentic loop, the AI is not waiting for user input between steps; it is making decisions, calling tools, evaluating results, and proceeding based on its own intermediate outputs. Designing for agentic loops requires particular attention to transparency (what is the AI doing?), interrupt affordance (can the user stop it?), and recovery (what happens when a step fails?).

Tool use. An AI's ability to invoke external functions, APIs, or systems as part of completing a task. When an AI searches the web, runs code, reads a file, or sends a calendar invite, it is using a tool. Tool use is what separates an AI that can talk about tasks from one that can actually perform them. From a design perspective, tool use introduces a category of AI action that has consequences beyond the conversation: the AI is not just generating text, it is taking actions with real-world effects that may not be reversible.

Interrupt affordance. A UI element or interaction pattern that allows the user to stop an AI mid-task before it completes. Interrupt affordance becomes critical in agentic contexts, where an AI may be taking a sequence of actions across a longer time horizon, and the user may realise partway through that the AI has misunderstood the request or is heading in the wrong direction. Designing interrupt affordance well means making it visible without being alarming, and clarifying what "stop" means: stop and wait, stop and undo, or stop and explain.
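
The three terms in this section fit together in one small loop. The sketch below is deliberately simplified: the tools are hypothetical stand-ins with no real effects, and the plan is fixed in advance so the example stays deterministic, whereas a real agent would choose each step from the previous result.

```python
from typing import Callable

# Hypothetical tools; real ones would call APIs and may have irreversible effects.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda arg: f"results for {arg}",
    "summarise": lambda arg: f"summary of {arg}",
}


def run_agent(plan: list[tuple[str, str]], should_stop: Callable[[], bool]) -> dict:
    """Run planned (tool, argument) steps, feeding each result into the next.

    An empty argument means "use the previous step's output".
    """
    log: list[str] = []  # transparency: what the agent has done so far
    previous = ""
    for tool_name, arg in plan:
        if should_stop():  # interrupt affordance: checked before every action
            return {"status": "stopped", "log": log}
        tool = TOOLS.get(tool_name)
        if tool is None:  # recovery: surface the failure instead of continuing
            return {"status": "failed", "log": log + [f"unknown tool: {tool_name}"]}
        previous = tool(arg or previous)
        log.append(f"{tool_name} -> {previous}")
    return {"status": "done", "log": log}


done = run_agent([("search", "AI glossary"), ("summarise", "")], lambda: False)
stopped = run_agent([("search", "AI glossary")], lambda: True)
```

This version implements only "stop and wait": it halts before the next action and returns the log. "Stop and undo" would need each tool to provide a way to reverse its effect, which is exactly why the meaning of "stop" has to be decided up front.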

Terms for how AI products are architected.

System prompt. The hidden instruction layer provided to an AI before any user interaction begins. The system prompt shapes the AI's persona, defines its scope, sets its tone, establishes constraints, and provides any context the AI needs to operate in its intended role. It is typically not visible to users but is the primary mechanism through which a product team customises the behaviour of an underlying model. System prompt design is closely related to AI voice design but is broader: it covers not just how the AI sounds but what it knows, what it can and cannot do, and how it should handle edge cases.

Context window. The maximum amount of text, or more precisely the maximum number of tokens, that an AI model can process at one time, counting both what it is given and what it generates. Information outside the context window is not available to the AI when generating a response. As products grow in complexity, context window management becomes a real design constraint: what gets included in the context (conversation history, user documents, system configuration, retrieved material) and what gets cut determines what the AI can and cannot know about the current task.
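
To make the "what gets cut" decision concrete, here is a minimal sketch of one common policy: always keep the system prompt, then keep the most recent messages that still fit. The word-count tokenizer is a crude assumption (real tokenizers split text differently), and `fit_to_window` is a hypothetical name.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(system_prompt: str, history: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the most recent messages that fit the budget."""
    remaining = budget - count_tokens(system_prompt)
    kept: list[str] = []
    for message in reversed(history):  # newest first
        cost = count_tokens(message)
        if cost > remaining:
            break  # everything older is cut, so the AI will not know about it
        kept.append(message)
        remaining -= cost
    return [system_prompt] + list(reversed(kept))


history = ["my name is Ada", "I prefer short answers", "what is grounding"]
context = fit_to_window("You are a helpful assistant", history, budget=12)
```

With a budget of 12, the oldest message is dropped, so the AI no longer knows the user's name. Recency is only one possible policy; pinning important facts or summarising older turns are alternatives with different trade-offs.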

Persistent memory. An AI's ability to retain information from previous conversations or sessions and apply it in future interactions. Without persistent memory, every session starts fresh and the AI has no knowledge of prior exchanges. With it, the AI can personalise responses, avoid repeating questions, and maintain continuity across a longer relationship with the user. Persistent memory introduces design considerations around transparency (what does the AI remember?), control (can the user see and delete it?), and trust (how does the user know what the AI knows about them?).
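
The transparency and control questions map directly onto operations a memory layer has to support. The sketch below uses an in-memory dictionary as a stand-in for real persistent storage; `MemoryStore` and its method names are hypothetical.

```python
class MemoryStore:
    """In-memory stand-in for persistent storage, keyed by user."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = {}

    def remember(self, user_id: str, fact: str) -> None:
        """Store one fact about the user."""
        self._facts.setdefault(user_id, []).append(fact)

    def view(self, user_id: str) -> list[str]:
        """Transparency: return a copy of everything stored about the user."""
        return list(self._facts.get(user_id, []))

    def forget(self, user_id: str, fact: str) -> bool:
        """Control: delete one fact and report whether anything was removed."""
        facts = self._facts.get(user_id, [])
        if fact in facts:
            facts.remove(fact)
            return True
        return False


store = MemoryStore()
store.remember("u1", "prefers British spelling")
store.remember("u1", "works in fintech")
removed = store.forget("u1", "works in fintech")
```

If `view` and `forget` have no counterpart in the interface, the user has no way to answer the trust question for themselves.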

Terms for how the AI interface works.

Latency affordance. A UI pattern designed to manage user perception and maintain engagement during the gap between a prompt and an AI response. Latency affordances include typing indicators, animated thinking states, skeleton screens that preview the structure of an incoming response, and partial streaming while the rest of the response loads. Well-designed latency affordances reduce perceived wait time and signal that the system is active; poorly designed ones create anxiety or give the impression that something has gone wrong.

Response rating. A binary or scaled feedback mechanism, most commonly a thumbs-up or thumbs-down control, that allows users to signal whether an AI response was helpful or not. Response rating data is used to improve model behaviour over time, but it is also a trust signal in itself: its presence communicates to users that their feedback matters and that the product is actively being improved. From a design perspective, response rating placement, visibility, and timing all affect whether users actually engage with it.

Multi-modal input. An AI interface's ability to accept more than one type of input, such as text alongside an image, a voice note alongside a document, or a screenshot alongside a question. Multi-modal input expands what a user can communicate to an AI beyond what can be expressed in text alone, and it introduces a design challenge: how to present a single unified input surface that can handle multiple input types gracefully, without making the interface feel overwhelming or unclear.

One counterargument, resolved.

The argument against a glossary like this is that premature standardisation risks locking in the wrong terms before the patterns are stable enough to name reliably. This is a real concern, and it applies to any vocabulary-building exercise in a rapidly evolving field. Some of the terms above will evolve, and a few may be superseded as the field develops better ones.

But the alternative, which is to let teams invent their own local vocabulary for each of these patterns and never compare notes, produces a worse outcome. Shared vocabulary is what makes critique portable and standards transferable, and it lets new practitioners orient themselves faster than they could if they had to rediscover every concept from scratch. The terms above are a starting point, not a closing statement. They are meant to be argued with, refined, and replaced as the practice matures.

Frequently asked questions

What is terminology lag in AI design?

Terminology lag is the gap between when a new interaction pattern becomes common enough to design against and when the field develops stable shared vocabulary for it. AI design is currently producing new patterns faster than it is naming them, which makes it harder to critique work consistently, transfer knowledge across teams, or build shared standards.

What is an elicitation model in AI design?

An elicitation model is a clarification interface the AI surfaces when it needs the user to make a choice before proceeding. It typically presents multiple-choice options rather than an open text field, because the AI has already inferred the likely range of valid responses and is narrowing the decision space rather than asking the user to supply information from scratch.

What is the difference between streaming response and thought display?

A streaming response renders the final output incrementally as the model generates it, so the user can begin reading before generation is complete. Thought display shows the AI's intermediate reasoning steps (the working process) before or alongside the final output. Both reduce perceived latency, but thought display also increases transparency about how the answer was produced.

What is an agentic loop?

An agentic loop is a sequence of AI-initiated actions taken autonomously across multiple steps to complete a complex task, with each step informing the next. The AI is not waiting for user input between steps; it is making decisions, calling tools, evaluating results, and proceeding based on intermediate outputs. Agentic loops require specific design attention to transparency, interrupt affordance, and failure recovery.

Why does shared vocabulary matter in AI design?

Shared vocabulary is what makes design critique portable, standards transferable, and new practitioners orientable without having to rediscover every concept from scratch. When teams invent their own local names for the same patterns, knowledge stays siloed and the field matures more slowly than it should. A glossary like this is a starting point, not a final word.
