Genevieve Craig

Journal · May 18, 2026 · 9 min read

Why some AI feels like a colleague and others feel like a command line.

The difference between AI that feels like thinking alongside something and AI that feels like issuing commands to a system is almost entirely a design decision — and designers are usually not the ones making it.

Human-AI design · AI voice design · UX

A friend told me recently that their company had switched LLMs, moving from one provider to another as part of an infrastructure change they had no say in. The content of what the AI did was more or less the same. The volume of requests, the kinds of tasks, the overall workflow: all essentially unchanged. But the experience of using it every day had shifted in a way my friend found difficult to pin down but could not stop noticing.

The previous LLM, they said, felt like talking to someone who was actually in the room. Collaborative. Responsive to the specific thing they had asked, not just the category of thing they had asked. When something was ambiguous, it flagged the ambiguity and offered a direction rather than simply picking one. When they pushed back on something it had produced, it engaged with the pushback rather than restating its original answer in different words. It felt, in a phrase that is easy to dismiss but is also accurate, like thinking alongside something.

The new LLM felt like issuing commands to a system. Capable, fast, technically competent. But the interaction had a different character: request in, output out, no particular sense of the AI as a participant rather than a processor. Nothing wrong with it exactly. But unmistakably different in ways that accumulated across a working day into something that genuinely affected how much my friend wanted to use it.

The reason for that difference is almost entirely a design decision. And in most organisations building AI products, it is a decision that designers are not the ones making.

What changed about UX copy when AI became the interface.

In traditional software design, UX copy was finishing work applied over the top of an already-functioning system. Buttons had labels, error states had messages, empty states had copy to explain what the user should do next. All of that mattered. But the copy was not the primary vehicle through which the product communicated; the UI was. The copy annotated the interface.

In AI-first products, that relationship inverts. The AI's language is often the interface. There is no button to push that does the thing; there is a conversation that produces the thing, and the quality of that conversation — including its vocabulary, its tone, its confidence signals, its error recovery, its proactivity, its pacing — determines in large part whether the product feels trustworthy and worth returning to, or capable but frustrating to work with over time. The copy stopped being annotation and became the primary medium of the experience.

This is what I mean by AI voice design: the discipline of crafting how an AI communicates, as a core UX responsibility, separate from the visual interface and distinct from the engineering work of prompt architecture. It is the practice of deciding, deliberately, what the AI sounds like, how it handles uncertainty, what it volunteers versus what it waits to be asked, and how it recovers when things go wrong. Not as a single decision made at the start of a project, but as an ongoing, iterated design artefact that evolves as the product evolves and as you learn more about how users actually interact with it.

Why designers usually are not the ones doing it.

By default, AI voice design tends to land with whoever writes the system prompts. In practice that is usually the engineering team or a PM working closely with engineering, because the system prompt is a technical artefact and the workflow for producing it sits naturally in the same pipeline as the rest of the product configuration. The decision gets made, the prompt gets written, and by the time a designer sees the interaction in testing or in a design review, the voice is already baked in and "this is just how the AI works" has quietly become a constraint.

The result is often a voice that is technically functional but that no one would describe as well-designed. Mechanical, over-literal, repetitive in its sentence structures, prone to hedging in ways that erode trust, occasionally over-apologetic in ways that feel obsequious rather than helpful. Not wrong exactly, but not designed.

This happens not because PMs and engineers do not care about user experience, but because voice design is a distinct craft, and the instincts required to do it well do not develop automatically from product or engineering work. Just as a PM would not typically be expected to spec component states or design a navigation hierarchy, they should not be expected to know intuitively that an AI saying "Great! I will get right on that!" to every request will, after a week of daily use, make users feel like they are talking to a customer service bot rather than a capable tool.

If we looked closely at the AI products that feel genuinely well-designed in how they communicate, I would wager that in most cases a designer was at the heart of crafting the tone and the interaction model. And if we looked at the ones that feel mechanical and cold, I suspect we would find that voice was treated as an engineering configuration rather than a UX problem. This is no criticism of the teams involved; it is a structural default, not a failure of individual judgment.

Image (pending): two side-by-side conversation panels receiving the same simple user request. The left response is warm and expressive; the right is mechanically correct but tonally flat. A designer's hand annotates the gutter between them.

Ten things that belong in your AI voice design brief.

These are the decisions that shape how an AI sounds. Each one is a design decision rather than a technical default, and each one benefits from a designer being in the room when it gets made.

Formality register. Should the AI sound like a trusted expert colleague, a neutral professional tool, or something warmer and more conversational? This decision should be grounded in who the user is and what stakes they are operating under. An enterprise compliance tool should sound different from a creative brainstorming assistant, even if both are running on the same underlying model.

Response length calibration. When should the AI answer in one sentence and when should it go long? Many models default to answering comprehensively, which can make a conversational AI feel exhausting to use daily. The calibration should be a design decision, not something inherited from the base model.

Uncertainty and error language. This is the highest-stakes moment in any AI interaction, and the most commonly under-designed. When the AI does not know something, is uncertain about its answer, or genuinely cannot help — what does it say, and how? Under-hedging produces an AI that sounds overconfident and erodes trust the first time it is wrong. Over-hedging produces an AI that qualifies everything into uselessness. The calibrated middle is a design choice.

Proactivity calibration. Does the AI volunteer relevant information the user did not ask for? Does it ask follow-up questions to clarify intent? Does it flag assumptions it is making? Each of these shapes whether the AI feels like a participant in the work or a passive executor. The right answer varies by product and by user context, but the default, which is usually to execute and wait, is rarely the optimal one.

Error recovery language. When the AI produces something wrong and the user corrects it, how does it respond? Flat acknowledgment and moving on can feel dismissive. Over-apologising can feel sycophantic. The tone of recovery is one of the strongest signals about whether an AI product was designed as a product or assembled as a system.

Pronouns and self-reference. Does the AI say "I"? Does it use "we"? Does it avoid self-reference entirely? These choices are subtle but they compound over an interaction, and across thousands of interactions, into a strong and consistent impression of whether the AI has a perspective or is simply processing requests.

Hedging density. The frequency and positioning of phrases like "I think," "it is possible that," and "you may want to verify" shapes how confident the AI sounds. A designed hedging strategy gives users accurate signals about AI certainty without undermining every response with blanket qualifications.

Structural pacing. Is the AI's default to reply in prose, in bullets, in numbered steps? Does it vary the structure based on the nature of the question? Structure is a voice decision as much as word choice, and a product that always replies in bullet lists will feel different in character from one that answers in flowing paragraphs, regardless of what the words themselves say.

Domain vocabulary alignment. Does the AI use the language the user uses, or does it translate into its own register? For vertical and enterprise software especially, matching the user's terminology rather than defaulting to generic language is one of the clearest signals that a product was designed for that user, not merely adapted for them.

Proactive disclosure of limitations. Does the AI tell you unprompted when it is operating at the edge of its capability? Or does it quietly produce output that looks confident until you check it against reality? The decision about when and how to surface limitations is a trust design decision with significant long-term voice implications, and it belongs in the design brief, not the engineering backlog.
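
One way to keep these ten decisions visible as a shared artefact, rather than buried in a system prompt, is to write the brief down as structured data that both designers and engineers can read and review. What follows is a minimal sketch in Python, not a prescribed format: the `VoiceBrief` class, its field names, the `Register` options, and the example values for a compliance tool are all hypothetical, chosen only to mirror the list above.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Register(Enum):
    # Illustrative options only, taken from the formality question above.
    EXPERT_COLLEAGUE = "a trusted expert colleague"
    NEUTRAL_TOOL = "a neutral professional tool"
    CONVERSATIONAL = "a warm, conversational partner"


@dataclass(frozen=True)
class VoiceBrief:
    """Hypothetical container: one field per decision in the brief."""
    formality_register: Register
    response_length: str
    uncertainty_language: str
    proactivity: str
    error_recovery: str
    self_reference: str
    hedging_density: str
    structural_pacing: str
    domain_vocabulary: str
    limitation_disclosure: str


def render_guidelines(brief: VoiceBrief) -> str:
    """Render the brief as plain-text guidance, one line per decision."""
    lines = []
    for f in fields(brief):
        value = getattr(brief, f.name)
        text = value.value if isinstance(value, Enum) else value
        label = f.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {text}")
    return "\n".join(lines)


# Example values are assumptions for an imagined enterprise compliance tool.
compliance_brief = VoiceBrief(
    formality_register=Register.EXPERT_COLLEAGUE,
    response_length="Answer in one or two sentences unless asked for detail.",
    uncertainty_language="Say plainly what is unknown and suggest a next step.",
    proactivity="Flag assumptions; ask one clarifying question when intent is unclear.",
    error_recovery="Acknowledge the correction once, then fix it without apologising repeatedly.",
    self_reference="Use 'I' sparingly; avoid 'we'.",
    hedging_density="Hedge only where certainty is genuinely low.",
    structural_pacing="Prose by default; numbered steps for procedures.",
    domain_vocabulary="Use the user's own regulatory terms.",
    limitation_disclosure="State unprompted when a request is outside reliable capability.",
)

print(render_guidelines(compliance_brief))
```

The rendered text could be handed to whoever maintains the system prompt, but the point of the sketch is ownership rather than tooling: each field is a decision someone has to make on purpose, and a missing field fails loudly instead of silently inheriting the model's default.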

One counterargument, resolved.

The counterargument is that what I am describing is really just prompt engineering, and that a capable PM or engineer can cover it by writing careful system prompts. This is partially true: the system prompt is the primary lever, and writing it well does require care and craft. But prompt engineering is a technical practice concerned with getting a model to behave correctly; AI voice design is a UX practice concerned with how that behaviour lands for the person on the other side of the conversation.

The distinction matters in the same way that frontend engineering and visual design are distinct disciplines even though both produce what appears on screen. An engineer can write clean, correct code that produces an interface that is technically functional and visually uncomfortable to use. The craft of making it genuinely good is something else. AI voice design is the craft of making the AI's communication work as a human experience, not just technically, and that craft belongs in the same hands as the rest of the UX work.

FAQ

Frequently asked questions

What is AI voice design?
AI voice design is the discipline of crafting how an AI communicates, as a core UX responsibility. It covers formality register, uncertainty language, error recovery, proactivity calibration, and the structural choices that together determine whether an AI feels like a capable collaborator or a mechanical processor. It is distinct from prompt engineering and from visual UI design.

Why does AI voice feel so different between products?
Because voice is the result of deliberate design decisions, and those decisions are made more carefully in some products than others. Teams where designers are involved in crafting the interaction model and tone guidelines ship AI that sounds designed. Teams where voice defaults to whatever engineering produces from the base model ship AI that sounds assembled.

Is AI voice design just prompt engineering?
They overlap but are distinct. Prompt engineering is a technical practice: getting the model to behave correctly. AI voice design is a UX practice: getting the behaviour to land well for the human on the other side of the conversation. Both matter; only one typically has a designer's hand in it, and that asymmetry shows up in the product.

Where should a designer start if their AI product currently sounds mechanical?
Start with the three highest-stakes moments: how the AI handles uncertainty, how it recovers when corrected, and how it responds when it cannot help. Those three moments do more trust work than any amount of friendly phrasing on the happy path, and they are the most commonly under-designed.

Which teams typically end up owning AI voice design by default?
Usually engineering or a PM working closely with engineering, because the system prompt is a technical artefact that sits naturally in the engineering workflow. The result is often a voice that is functional but not designed. Moving voice design into the UX remit, treated as an ongoing artefact rather than a one-time configuration, is the structural fix.
