Julien Thibeaut (Ibelick)

AI Glossary

An interactive way to understand the core concepts behind artificial intelligence.

Token

The smallest unit of text a model processes. It can be a word, part of a word, or even a symbol.

The (#12345) · dragon (#67890) · rests (#24680) · in (#179) · agony (#2464) · . (#12)
Text is split into small units the model can read. Each token is linked to a number that acts as its ID.

Tokenization

The process of breaking text into small units (tokens) a model can understand. Each token can be a word, subword, or character.

Havethecouragetofollowyourheartandintuition
OpenAI Tokenizer, interactive tool to visualize text tokenization
When you send text to a model, tokenization is the very first step, before anything else happens.
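A minimal sketch of the idea in Python, using a hypothetical hand-made vocabulary (real tokenizers such as BPE learn subword units from data instead of splitting on spaces):

```python
# Toy tokenizer: map each piece of text to its ID in a made-up vocabulary.
# The IDs mirror the example above; real vocabularies hold tens of thousands of entries.
vocab = {"the": 12345, "dragon": 67890, "rests": 24680, "in": 179, "agony": 2464, ".": 12}

def tokenize(text):
    """Split text into tokens, then look up each token's numeric ID."""
    words = text.replace(".", " .").lower().split()
    return [vocab[w] for w in words]

ids = tokenize("The dragon rests in agony.")
```

The model never sees the raw characters afterwards, only this list of IDs.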

Embedding

The model turns each token into numbers that represent its meaning. Tokens with similar meanings have vectors (points positioned in space) that are close together.

the [neutral] · button [elements] · slides [actions] · when [neutral] · hovered [states] · and [neutral] · the [neutral] · layout [elements] · adapts [actions] · on [neutral] · mobile [states] · devices [elements]
Wikipedia's 'Vector space', explanation of vector spaces
Turns tokens into points in space, grouped by meaning. It happens right after tokenization.
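A toy illustration in Python: hand-picked 2-D vectors stand in for real embeddings (which have hundreds of dimensions), and cosine similarity shows that related words sit closer together:

```python
import math

# Made-up 2-D "embeddings" for three words; the numbers are illustrative only.
embeddings = {
    "king":  [0.9, 0.8],
    "queen": [0.85, 0.82],
    "apple": [0.1, 0.95],
}

def cosine_similarity(a, b):
    """1.0 means the vectors point the same way; lower means less related."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

king_queen = cosine_similarity(embeddings["king"], embeddings["queen"])
king_apple = cosine_similarity(embeddings["king"], embeddings["apple"])
```

Here `king_queen` comes out higher than `king_apple`, which is exactly the property the model relies on.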

Context Window

The limit of how much text a model can consider at once. It reads and reasons only within this window, measured in tokens.

The model can process a limited number of tokens at once. It varies a lot depending on the model.
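In code, the effect is simple: anything outside the window is invisible to the model. A sketch with a made-up window size (real models range from thousands to millions of tokens):

```python
# Toy context window: the model only "sees" the most recent N tokens.
CONTEXT_WINDOW = 8  # illustrative size, far smaller than any real model

def fit_to_window(tokens, window=CONTEXT_WINDOW):
    """Keep only the most recent tokens that fit in the window."""
    return tokens[-window:]

history = list(range(20))  # pretend these are token IDs of a long conversation
visible = fit_to_window(history)  # the oldest 12 tokens are dropped
```

This is why long conversations can make a model "forget" their beginning.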

Latent Space

An internal map where the model organizes what it has learned. Each point represents a concept, and similar ideas group close together.

Example of a latent space: concepts cluster by theme, such as light (shadow, contrast, glow, brightness), shape (structure, balance, proportion, symmetry), texture (grain, surface, detail, material), color (tone, hue, gradient, saturation), motion (rhythm, flow, timing), depth (scale, distance, perspective), layout (alignment, hierarchy, spacing), and material (paper, glass, fabric, metal, stone).
Each dot is an embedding, placed near others with similar meaning. It’s how the model organizes concepts to relate them efficiently.

Neural Network

A network of connected layers that learn from examples. Each layer refines the data, and together they learn patterns used to recognize images, understand language, or process sounds.

"What's the capital of France?" → "Paris"
TensorFlow Playground, visual demo of how neural networks learn through layers
Each layer transforms the information a bit, finding patterns and meaning. So by the end, the network can turn a question into the right answer.
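The "layer" idea can be sketched in a few lines of Python. The weights below are made up and untrained; the point is only the shape of the computation, where each layer mixes its inputs and passes the result on:

```python
def relu(xs):
    """Activation: keep positive signals, zero out the rest."""
    return [max(0.0, v) for v in xs]

def dense(x, weights, biases):
    """One fully connected layer: every output mixes all inputs."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

# A tiny 2-layer network with fixed, illustrative weights.
x = [1.0, 0.5]                                            # input features
h = relu(dense(x, [[0.2, -0.4], [0.7, 0.1]], [0.0, 0.1]))  # hidden layer
y = dense(h, [[0.5, -0.3]], [0.2])                         # output layer
```

Training is the process of adjusting those weight numbers until the final output is useful.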

Parameters

Values the model learns during training that determine how strongly different parts of the network connect and respond. Together, they define how the model understands and generates information.

Understanding Model Parameters: 8B vs 70B Explained, explanation of model parameter counts
Each dot represents a learned value that shapes how the model understands data. More parameters typically mean a more capable, flexible model, though not always.
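Where do counts like "8B" or "70B" come from? Every weight and bias is one parameter, so you can count them per layer. A sketch with made-up layer sizes:

```python
# For a dense layer with n inputs and m outputs: n*m weights + m biases.
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

# A toy 3-layer network; the sizes are illustrative, not from any real model.
layers = [(512, 2048), (2048, 2048), (2048, 512)]
total = sum(dense_params(n_in, n_out) for n_in, n_out in layers)  # ~6.3 million
```

Billion-parameter models are the same arithmetic with far larger and far more layers.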

Model

A system that has learned from data and can now use that knowledge to predict, generate, or understand new information.

models.dev, directory of AI models and tools
A neural network trained on tons of examples so it can predict or generate new things.

Transformer

A type of neural network that looks at every word in a sequence at once. Unlike earlier models that read step by step, it learns how words relate across the whole text, allowing it to understand context much more effectively.

Why does the sky turn orange when the sun sets?
Attention Is All You Need (Vaswani et al., 2017)
Understands relationships between words across a whole sentence. Like a super-fast reader.

Attention

A mechanism inside Transformers that decides which words to focus on when processing a sentence. Each word looks at others and assigns more weight to the ones that matter most for understanding.

Why does the sky turn orange when the sun sets?
Helps the model decide which words to focus on for meaning. So it doesn’t treat all words equally but picks out what really matters.
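The core of attention is scaled dot-product attention, which can be written in plain Python. This is a sketch for a single query vector with tiny made-up inputs, not an optimized implementation:

```python
import math

def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    """Score the query against every key, then blend the values by those weights."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)  # how much each position matters to this query
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return out, weights

# The query resembles the first key, so the first value dominates the output.
out, weights = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

In a real Transformer, every token computes a query like this against every other token, in parallel.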

Pre-training

The first learning stage where a model trains on vast text data to learn patterns, context, and general knowledge.

A stream of training vocabulary drifts past: dataset, pattern, vector, embedding, gradient, context, sequence, probability, attention, feature, signal, batch, layer, activation, noise, objective, corpus, sample, loss, metric, iteration, alignment, concept, meaning, structure, syntax, semantics, knowledge, generalize, token.
Helps the model adapt more quickly and effectively to specific tasks later without starting from scratch.
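Stripped to its core, pre-training means learning to predict the next token from raw text. In this sketch a toy bigram model plays the role of the neural network, counting which token tends to follow which:

```python
from collections import Counter, defaultdict

# A tiny "corpus"; real pre-training uses trillions of tokens.
corpus = "the model reads the text and the model learns".split()

# "Training": record how often each token follows each other token.
next_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_counts[current][nxt] += 1

def predict_next(token):
    """Most likely next token, according to patterns seen during training."""
    return next_counts[token].most_common(1)[0][0]
```

After "the", the model has seen "model" twice and "text" once, so it predicts "model". A real network does the same thing with gradients instead of counts.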

Fine-tuning

Training a pre-trained model on new, specific data so it adapts to a particular task or tone. It keeps what it already knows but learns to apply it in a focused way.

the color of this image is good
Teaches the model a new skill without forgetting what it already knows. Here, it’s adapting to design vocabulary.
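Continuing the counting analogy, fine-tuning is just more training on a smaller, focused dataset. The generic statistics stay, but the new domain data shifts the model's predictions:

```python
from collections import Counter, defaultdict

counts = defaultdict(Counter)

def train(text):
    """Update next-token statistics from a piece of text."""
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1

# "Pre-training" on generic text: after "is", "blue" is the most common word.
train("the sky is blue and the water is blue and the grass is green")
before = counts["is"].most_common(1)[0][0]

# "Fine-tuning" on design vocabulary shifts the prediction without erasing the old counts.
train("the contrast is good and the palette is good")
train("the spacing is good")
after = counts["is"].most_common(1)[0][0]
```

`before` is "blue", `after` is "good": the model adapted to the new domain while keeping its earlier knowledge.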

Reinforcement Learning from Human Feedback (RLHF)

A training method where the model improves through feedback. It tries actions, receives rewards or penalties from humans or another model, and learns to make better decisions over time.

Improves the model through trial, error, and feedback until it gets better results.
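The feedback loop can be shown in miniature with a toy two-action example (a simple bandit, not RLHF at production scale): try actions, collect rewards, and nudge each action's estimated value toward the feedback it received:

```python
import random

random.seed(0)  # make the run reproducible

rewards = {"helpful": 1.0, "rude": -1.0}   # pretend feedback from a human rater
scores = {"helpful": 0.0, "rude": 0.0}     # the model's current value estimates

for _ in range(100):
    action = random.choice(list(scores))    # try an action
    feedback = rewards[action]              # receive a reward or penalty
    # Move the estimate a small step toward the feedback received.
    scores[action] += 0.1 * (feedback - scores[action])

best = max(scores, key=scores.get)  # "helpful" wins after enough feedback
```

Real RLHF replaces the two actions with full model responses and the fixed rewards with a learned reward model, but the loop is the same.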

Chain of Thought

Step-by-step reasoning the model writes to reach an answer. It helps the model break complex problems into smaller, more manageable steps.

How wide should a poster be if it follows the golden ratio and its height is 60 cm?
The golden ratio ≈ 1.618.
Width = height × 1.618 = 60 × 1.618 ≈ 97.1 cm.
Shows how the model thinks through a problem before answering.
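The poster example works out like this, one step at a time, mirroring how the model would reason:

```python
# Step 1: recall the constant.
golden_ratio = 1.618

# Step 2: apply the rule width = height × golden ratio.
height_cm = 60
width_cm = height_cm * golden_ratio

# Step 3: round to a practical size.
answer = round(width_cm, 1)  # 97.1 cm
```

Writing the intermediate steps down is what keeps multi-step problems from going wrong.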

Inference

The stage where a trained model uses what it has learned to generate a response. It predicts the next token step by step until the answer is complete.

Why do leaves change color in autumn?
Basically what’s happening behind the scenes when you use an AI product.
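Token-by-token generation is a loop: predict one token, append it, repeat until an end marker. Here a hypothetical `fake_model` with a scripted answer stands in for a real trained network:

```python
def fake_model(tokens):
    """Toy stand-in for a trained model: returns the next token of a scripted answer."""
    script = ["Leaves", "lose", "chlorophyll", "in", "autumn.", "<end>"]
    return script[len(tokens)]

tokens = []
while True:
    nxt = fake_model(tokens)   # predict the next token from what exists so far
    if nxt == "<end>":          # stop when the model signals it is done
        break
    tokens.append(nxt)          # feed the prediction back in and continue

answer = " ".join(tokens)
```

This is why responses stream in word by word: each token is produced by one pass through this loop.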

Retrieval-Augmented Generation (RAG)

A method that lets a model look up information before answering. It retrieves relevant data from external sources, then uses that context to write a more complete answer.

Query → Documents → Model → Answer
When you use an AI that can search the web, it lets the model pull fresh info before it writes an answer.
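A minimal sketch of the retrieve-then-generate step: score documents against the query and prepend the best match to the prompt. Real systems rank with embeddings; simple word overlap is used here only to keep the example self-contained:

```python
# A tiny external "knowledge base".
documents = [
    "The Eiffel Tower was completed in 1889.",
    "Autumn leaves change color as chlorophyll breaks down.",
    "Paris is the capital of France.",
]

def retrieve(query, docs):
    """Return the document sharing the most words with the query (toy ranking)."""
    q_words = set(query.lower().replace("?", "").split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().strip(".").split())))

query = "Why do autumn leaves change color?"
context = retrieve(query, documents)

# The retrieved passage becomes extra context the model reads before answering.
prompt = f"Context: {context}\nQuestion: {query}"
```

The model then answers from the retrieved passage instead of relying only on what it memorized during training.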

Agent

Agents are autonomous systems that use tools and feedback loops to accomplish tasks.

Agent: Goal → Plan → Actions → Environment, supported by Tools and Memory
Building effective agents, Engineering at Anthropic
They choose their own actions to get things done.
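The loop behind an agent can be sketched in a few lines: act with a tool, observe the result, remember it, and stop when the goal is met. The `search_tool` here is a hypothetical stand-in; real agents call APIs, search engines, or code interpreters:

```python
def search_tool(query):
    """Hypothetical tool: a lookup table standing in for a real search API."""
    return {"2 + 2": "4"}.get(query, "no result")

def agent(goal, max_steps=5):
    memory = []
    for _ in range(max_steps):
        observation = search_tool(goal)   # act: use a tool
        memory.append(observation)         # remember what happened
        if observation != "no result":     # check: did we reach the goal?
            return observation, memory
    return None, memory                    # give up after too many steps

answer, memory = agent("2 + 2")
```

Unlike a fixed workflow, the agent decides at each step whether to keep going, which tool to use, and when it is done.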

Workflow

A predefined sequence of steps where each stage uses the previous result to move the task forward toward a final outcome.

Collect → Process → Generate → Deliver
Connects steps into a clear, predictable path.
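The four stages above can be sketched as plain functions chained in a fixed order, each consuming the previous stage's output (the stage names and data are illustrative):

```python
def collect():
    """Stage 1: gather raw input."""
    return ["  Hello ", "world  "]

def process(items):
    """Stage 2: clean what was collected."""
    return [s.strip() for s in items]

def generate(items):
    """Stage 3: produce the output from the processed data."""
    return " ".join(items) + "!"

def deliver(message):
    """Stage 4: hand off the final result."""
    return {"status": "sent", "message": message}

# Collect → Process → Generate → Deliver, always in the same order.
result = deliver(generate(process(collect())))
```

The path never changes at run time, which is what separates a workflow from an agent.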

Large Language Model (LLM)

A very large neural network trained on vast text data to understand, predict, and generate human language.



I built this guide to clarify the concepts behind the tools we use every day.

Thanks to Fflur Page for the help. If you build AI products and need help with the interface, feel free to reach out. You can also explore prompt-kit, the core building blocks for AI apps.