Photo of Julien Thibeaut or Ibelick

Anatomy of AI Input

I've been building AI inputs for months now. I call this component prompt-input in prompt-kit, my library of components for AI applications.

When ChatGPT launched in November 2022, the input field was simple, just a textbox and a button to send. Three years later, the input in AI products does a lot more than just send text.

Here's how an AI input actually works, layer by layer.

The input starts with a simple field where the user types a prompt. It should grow to multiple lines and stop at a clearly defined max height, without pushing the rest of the layout or jumping the scroll. Shift+Enter adds a new line, and Enter sends (you can make this configurable in settings).
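As a rough illustration of the keyboard and growth rules above, here is a minimal sketch in TypeScript. The names `resolveKeyAction`, `nextHeight`, and the `enterSends` option are hypothetical and not part of prompt-kit; swapping the two shortcuts when the setting is off is just one possible reading of "configurable".

```typescript
// Hypothetical helper types; a real KeyboardEvent exposes these same three fields.
type KeyInput = { key: string; shiftKey: boolean; isComposing: boolean };
type KeyAction = "send" | "newline" | "none";

// Decide what a keypress should do.
// Default: Enter sends, Shift+Enter adds a line. With enterSends = false the two are swapped.
function resolveKeyAction(e: KeyInput, enterSends: boolean = true): KeyAction {
  if (e.key !== "Enter") return "none";
  // During IME composition, Enter confirms the composed text and must not send.
  if (e.isComposing) return "none";
  const plainEnter = !e.shiftKey;
  return plainEnter === enterSends ? "send" : "newline";
}

// Compute the field height from its content height, capped at maxHeight.
// Once the cap is reached, the field scrolls internally instead of growing.
function nextHeight(
  scrollHeight: number,
  maxHeight: number
): { height: number; overflowY: "hidden" | "auto" } {
  return {
    height: Math.min(scrollHeight, maxHeight),
    overflowY: scrollHeight > maxHeight ? "auto" : "hidden",
  };
}
```

In a browser, one common approach is to set the textarea's `style.height` to `"auto"`, read its `scrollHeight`, and then apply the result of `nextHeight` on every input event.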

You can guide the user with a placeholder if you have context, commands, or anything hidden inside. Small details matter, like customizing the caret color to match the brand color.

On mobile, keep the font-size at least 16px to avoid Safari zoom, or set a proper viewport meta tag to stop the auto-zoom. Keep the typing stable: no shifting lines, no auto-capitalization on mobile, and no cursor jumps (the caret stays where the user expects).

The context bar is a group of controls that change how the model will answer: model selector, mode (code, search, tools), memory, context, etc. Sometimes you don't need a context bar; in that case, don't add one just for the sake of it. Try not to overload the input.

It's usually below the input field, and it should be more discreet than the field itself. Keep only what the user needs often. Everything else should go in a menu or settings. Don't try to put everything here.

Every action in the context bar should be keyboard-accessible, and you can add shortcuts for some actions. On mobile, if space is too tight, labels+icons can become icons only.

If the context bar is too crowded, use tool triggers. They can be in the context bar or inside the input, as a dropdown or a slash command.

Pick the pattern based on how many tools you expose and how often the user needs them. Don't add too much. Keep it simple.

Triggers should be keyboard-first and appear fast. You can add a small interaction to them, but for slash commands I prefer to make them instant. And always avoid layout jumps (like the input moving or growing when a trigger opens).
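To make the slash command pattern concrete, here is a small sketch of how a trigger could be detected from the text and caret position. The function names and the `Command` shape are assumptions for illustration, and the rule that a "/" only counts at the start of the input or after whitespace is one possible convention, not the only one.

```typescript
// Hypothetical command shape for illustration.
type Command = { name: string; description: string };

// Return the partial command being typed just before the caret,
// or null when no slash trigger is active.
function activeSlashQuery(text: string, caret: number): string | null {
  const before = text.slice(0, caret);
  // "/" must start the input or follow whitespace, so URLs and "a/b" do not trigger.
  const match = /(?:^|\s)\/([\w-]*)$/.exec(before);
  return match ? match[1] : null;
}

// Filter the available commands by prefix, case-insensitively.
function filterCommands(commands: Command[], query: string): Command[] {
  const q = query.toLowerCase();
  return commands.filter((c) => c.name.toLowerCase().startsWith(q));
}
```

Because both functions are synchronous and cheap, the menu can open on the same keystroke that types the "/", which keeps the trigger feeling instant.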

In most AI products, you can add files, images, videos, or audio. It’s a fragile part because many things can go wrong if you don't handle it well. Most of the time it's a button in the context bar. It must cover the basics: uploading, uploaded, failed upload, and removing a file. You also need file constraints: size limits, file types, number of files.
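The file constraints mentioned above (size, type, count) can be checked before any upload starts. The sketch below is one way to do it; `validateFiles`, `Constraints`, and the rejection reasons are hypothetical names, and the wildcard handling for entries like `image/*` is an assumption modeled on how MIME type patterns are commonly written.

```typescript
// Hypothetical shapes; in the browser a File object already has name, size, and type.
type FileMeta = { name: string; size: number; type: string };
type Constraints = { maxFiles: number; maxBytes: number; acceptedTypes: string[] };
type Rejection = { file: FileMeta; reason: "bad-type" | "too-large" | "too-many" };

// Accept exact MIME types ("application/pdf") or family wildcards ("image/*").
function typeAllowed(type: string, accepted: string[]): boolean {
  return accepted.some((a) =>
    a.endsWith("/*") ? type.startsWith(a.slice(0, -1)) : a === type
  );
}

// Split incoming files into accepted and rejected, with a reason for each rejection
// so the UI can show a specific message near the input.
function validateFiles(
  existingCount: number,
  incoming: FileMeta[],
  c: Constraints
): { accepted: FileMeta[]; rejected: Rejection[] } {
  const accepted: FileMeta[] = [];
  const rejected: Rejection[] = [];
  for (const file of incoming) {
    if (!typeAllowed(file.type, c.acceptedTypes)) {
      rejected.push({ file, reason: "bad-type" });
    } else if (file.size > c.maxBytes) {
      rejected.push({ file, reason: "too-large" });
    } else if (existingCount + accepted.length >= c.maxFiles) {
      rejected.push({ file, reason: "too-many" });
    } else {
      accepted.push(file);
    }
  }
  return { accepted, rejected };
}
```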

Drag-and-drop matters too: show a clear drag-and-drop highlight when a file enters the input. When a user drags and drops a file, show a preview above the input field, or directly in the chat. Stack multiple files if needed, and make sure the input doesn’t jump.

If there’s an error, show it near the input. Avoid toast notifications. To make it feel fast, use optimistic updates: show the file instantly with a preview, then sync it in the background or when the user sends the message.
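One way to model the upload lifecycle with optimistic updates is a small reducer: the file appears in the list as soon as it is added, and its status is corrected later when the upload settles. This is a sketch under assumptions; the event names, the `Attachment` shape, and the `retry` event are illustrative and not taken from any specific library.

```typescript
// Hypothetical attachment model covering the basic states: uploading, uploaded, failed.
type Attachment = {
  id: string;
  name: string;
  status: "uploading" | "uploaded" | "failed";
  error?: string;
};

type AttachmentEvent =
  | { type: "add"; id: string; name: string }
  | { type: "uploaded"; id: string }
  | { type: "failed"; id: string; error: string }
  | { type: "retry"; id: string }
  | { type: "remove"; id: string };

// Apply `next` to the attachment with the given id and leave the others untouched.
function update(
  state: Attachment[],
  id: string,
  next: (a: Attachment) => Attachment
): Attachment[] {
  return state.map((a) => (a.id === id ? next(a) : a));
}

function attachmentsReducer(state: Attachment[], ev: AttachmentEvent): Attachment[] {
  switch (ev.type) {
    case "add":
      // Optimistic: show the file immediately, before the upload finishes.
      return [...state, { id: ev.id, name: ev.name, status: "uploading" }];
    case "uploaded":
      return update(state, ev.id, (a) => ({ id: a.id, name: a.name, status: "uploaded" }));
    case "failed":
      // Keep the file visible so the error can be shown next to it, near the input.
      return update(state, ev.id, (a) => ({ ...a, status: "failed", error: ev.error }));
    case "retry":
      return update(state, ev.id, (a) => ({ id: a.id, name: a.name, status: "uploading" }));
    case "remove":
      return state.filter((a) => a.id !== ev.id);
  }
}
```

Keeping failed files in the list, instead of dropping them and showing a toast, lets the error message sit right beside the preview it refers to.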

Sometimes the user needs guidance. The assist layer can take a few forms: ghost text inside the field, inline suggestions that appear while typing (like Cursor with file paths), or prompt suggestions/presets above or below the input.

It should help and guide the user but not take over the input. Make it fast, non-intrusive, easy to ignore and usable from the keyboard when needed.

The assist layer is easy to overlook, but it makes the product much better when done right. Suggest when it makes sense for your product and your users. But don't over-suggest.

Don’t disable the controls. Even if you don’t allow sending multiple messages, the user should still be able to type, correct something, or stop the response.

If you have voice mode, you can show the voice button first and switch to a send button when the user starts typing, or keep both visible.

When the user sends a message, the send button should turn into a stop button so they can interrupt the response. Make sure all actions are keyboard-accessible.
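The button logic described above can be reduced to one small function. This is a sketch of the first variant (voice button first, then send); `primaryAction` and its option names are hypothetical, and a product that keeps both buttons visible would not need the voice branch.

```typescript
type PrimaryAction = "voice" | "send" | "stop";

// Decide which primary button to show next to the input.
function primaryAction(opts: {
  isGenerating: boolean; // a response is loading or streaming
  text: string;          // current content of the input
  hasVoice: boolean;     // the product supports voice mode
}): PrimaryAction {
  // While a response is in flight, the button always lets the user interrupt it.
  if (opts.isGenerating) return "stop";
  // With an empty field and voice mode available, offer voice first.
  if (opts.hasVoice && opts.text.trim() === "") return "voice";
  return "send";
}
```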

We usually have these states in an AI product: idle, loading, streaming, error, success.

In loading or streaming, don't lock the input. The user should still be able to type or stop the response.

For the error, keep the input neutral and usable. Errors belong to the generation, not the typing, so keep them out of the input (chat container, tool, etc.). Avoid toast notifications here too. Errors should provide a way to retry the generation (like a retry button in the chat container).

The simple rule: keep the input usable in almost every state. The user should interact with the input as much as possible. Responses can be long in AI products, so let the user use the product while they wait.
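The five states and the "never lock the input" rule can be sketched as a tiny state machine. The event names (`first-token`, `stop`, `retry`, and so on) and the `viewFlags` helper are assumptions chosen for illustration; the only point carried over from the text is that the input stays editable in every state, and that stop and retry live outside the typing itself.

```typescript
type Status = "idle" | "loading" | "streaming" | "error" | "success";
type GenEvent = "send" | "first-token" | "done" | "fail" | "stop" | "retry";

// Move between generation states. Events that make no sense in the current state are ignored.
function transition(status: Status, event: GenEvent): Status {
  const busy = status === "loading" || status === "streaming";
  switch (event) {
    case "send":
      return busy ? status : "loading";
    case "first-token":
      return status === "loading" ? "streaming" : status;
    case "done":
      return busy ? "success" : status;
    case "fail":
      return busy ? "error" : status;
    case "stop":
      return busy ? "idle" : status;
    case "retry":
      return status === "error" ? "loading" : status;
  }
}

// Derive what the UI shows. The input is editable in every state by design.
function viewFlags(status: Status): {
  inputEditable: boolean;
  showStop: boolean;
  showRetry: boolean;
} {
  const busy = status === "loading" || status === "streaming";
  return {
    inputEditable: true,           // never read-only, even while streaming
    showStop: busy,                // send button becomes a stop button
    showRetry: status === "error", // rendered in the chat container, not in the input
  };
}
```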

Edge cases vary from product to product, but the input rules stay mostly the same: predictable and keyboard-first. Typing must stay instant. No lag, no jumps, no layout shifts when the field grows or shrinks.

The input should stay usable even if the model is slow, the network is slow, or the product is still loading something. Never make it read-only while the model is streaming. Don’t steal focus when a response starts or ends. On mobile, handle the keyboard covering the field.

Use optimistic updates for most actions, keep everything fast, and don't overdo animations. Clear the input right after sending and switch to the stop button without delay.

A good AI input looks simple, but hides a lot of details, and every layer matters. As the product grows, resist the urge to overload it. If something can live elsewhere, move it. If you can group functionality behind a trigger, do it. There's always a balance between showing too much and not enough. Try to guide your user without overwhelming them.

It's also an element of the product where you can play with the experience and the details. Add that beautiful shader in the background for onboarding, or that refined animation when the user sends their first message. Don't forget that people remember great experiences. In a crowded AI market, that can make the difference.