Unit 2: How LLMs Actually Work
Lesson at a glance
| Item | Detail |
| --------------------- | ------------------------------------------------------------------------------------- |
| Suggested length | 3 × 60 minutes |
| Recommended placement | Week 2 of AI Fluency |
| Prerequisite | Unit 1 complete; signed AI Use Agreement on file |
| Materials | Whiteboard, vocabulary cards, one student device per pair, OpenAI/Anthropic tokenizer |
Safety: This unit is whiteboard-and-paper plus a tokenizer demo. No graded AI use yet. The next time students touch AI on graded work, they'll already understand what they're talking to.
Standards & credential alignment
- AI4K12 Big Ideas: Representation & Reasoning, Learning.
- CSTA K-12: 3A-AP-13, 3A-AP-15, 3A-IC-25.
- NIST AI RMF: Map function - system understanding.
Learning objectives
By the end of this unit, students can:
- Tokenize a sentence by hand, then verify their split with an online tokenizer.
- Explain what an embedding is using the "directions on a map" analogy.
- Describe attention - at the intuition level - as the model deciding which earlier tokens matter most for the next one.
- Distinguish training from inference, and base model from instruction-tuned from RLHF-aligned.
- Explain why context windows matter and why "the model forgot" is usually the wrong frame.
- List three structural reasons LLMs hallucinate (and one that comes from training data).
Vocabulary
- Token - Roughly ¾ of an English word; the chunk the model reads/writes.
- Embedding - A list of numbers (a vector) that represents a token's "meaning" as a position in high-dimensional space.
- Vector space / latent space - The "map" embeddings live on. Similar meanings cluster together.
- Attention - The mechanism that lets a model weight which earlier tokens matter most when predicting the next one.
- Transformer - The neural network architecture (2017, "Attention Is All You Need") that powers nearly all modern LLMs.
- Context window - How many tokens the model can "see" at once. Everything outside this window does not exist to the model.
- Parameters - The numbers the model learned. "7B" means 7 billion of them.
- Training - The expensive one-time process of teaching the model. Months, megawatts, millions of dollars.
- Inference - The cheap repeat process of running the model. Milliseconds, watts, fractions of a cent.
- Base model - The raw next-token predictor. Will autocomplete anything, polite or not.
- Instruction tuning - Additional training on (instruction → response) pairs to make the model follow directions.
- RLHF (Reinforcement Learning from Human Feedback) - Additional training where humans rank responses and the model learns the ranking.
- Temperature - A knob from 0 (deterministic) to ~2 (chaotic) that controls how much the model deviates from the most likely next token.
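For teachers who want to see the temperature knob in action, here is a minimal sketch. It assumes the usual softmax-with-temperature formulation; the function name and the three example scores are invented for illustration and do not come from any particular model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Low temperatures pile probability onto the top candidate; high temperatures spread it out, which is why output gets less predictable as the knob goes up.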
Teacher background
The single mental model that makes the rest of the year click:
An LLM is a function. It takes (a sequence of tokens) and returns (a probability over every possible next token). Everything else - chat, agents, RAG, tools - is a wrapper around that function.
Students will resist this. They'll want the model to be smarter than that. Push back gently and consistently - the surprising thing about modern AI is that the next-token function, scaled to a trillion parameters and trained on a healthy chunk of human text, looks like reasoning. That is the wonder. But it remains a next-token function.
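The "LLM is a function" idea can be made concrete with a toy stand-in. The sketch below is not a real LLM: it is a bigram counter that only looks at the last token, and the corpus and function names are made up for illustration. The shape of the function, tokens in and a probability per next token out, is the part that carries over.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count which token follows which; a tiny stand-in for 'training'."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, context):
    """Tokens in, probability for each candidate next token out."""
    last = context[-1]  # this toy only looks at the final token
    followers = counts[last]
    total = sum(followers.values())
    if total == 0:
        return {}  # the toy has never seen this token
    return {tok: n / total for tok, n in followers.items()}

# Hypothetical training text, already split into tokens.
corpus = "the cat sat on the mat . the cat ate .".split()
model = train_bigram(corpus)
print(next_token_probs(model, ["the"]))
```

A real model assigns a probability to every token in its vocabulary and uses the whole context, but a chat interface is still a loop that calls a function of this shape over and over.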
Three places students get confused:
- Tokens vs. words. The model does not see words. It sees tokens. "tokenization" might be 1 token; "antidisestablishmentarianism" might be 7. Show this on a tokenizer.
- Memory. The model has no memory between calls except what you put in the prompt. ChatGPT's "memory" feature is a wrapper that injects past info into your prompt. The model itself is amnesiac.
- The hallucination origin. Hallucinations are not bugs. They are an expected consequence of an objective ("predict the next token") that does not include "and only if you're sure." The model has no native concept of certainty. RLHF approximates one, imperfectly.
Pacing - Day 1 (60 min): Tokens and embeddings
| Time | Segment | Notes |
| ----------- | --------------------------- | -------------------------------------------------------------------------------- |
| 0:00 – 0:05 | Recap | Three vocab from Unit 1, cold-call. |
| 0:05 – 0:25 | Mini-lesson - tokens | Whiteboard the loop. Then live demo: tokenize a paragraph in a public tokenizer. |
| 0:25 – 0:50 | Activity - tokenize by hand | Pairs split text by hand, then verify. |
| 0:50 – 1:00 | Mini-lesson - embeddings | The "map" analogy. King − Man + Woman ≈ Queen. |
Day 1 - Mini-lesson: tokens (20 min)
Draw on the board:
"The cat sat on the mat." → [The][ cat][ sat][ on][ the][ mat][.]
^ each bracket is one token
Live-demo tokenization at the class's chosen tokenizer (OpenAI's tiktoken, Anthropic's tokenizer, or the open-source tokenizers viewer). Try:
- "AI" (1 token)
- "antidisestablishmentarianism" (4–7 depending on tokenizer)
- A line of code (much fewer tokens than students expect)
- A paragraph in another language (many more tokens than students expect - emphasize this; it's why non-English languages cost more in API calls)
Drop the line: "The model never sees letters. It only sees the IDs of tokens. To it, 'cat' is a number like 3797, and that's the entire sensory experience."
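If a tokenizer site is unavailable, the whiteboard split can be reproduced with a few lines of code. This is a sketch under loud assumptions: the vocabulary and its IDs are invented, and the greedy longest-match rule is a simplification, not how production tokenizers (which typically use learned merge rules such as BPE) actually work.

```python
# Hypothetical vocabulary: the IDs are invented for illustration only.
VOCAB = {"The": 1, " cat": 2, " sat": 3, " on": 4, " the": 5, " mat": 6, ".": 7}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; a simplification of real tokenizers."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so " cat" wins over shorter pieces.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

pieces = tokenize("The cat sat on the mat.")
ids = [VOCAB[p] for p in pieces]
print(pieces)  # the text pieces, leading spaces included
print(ids)     # what the model actually receives
```

Point out that the leading space travels with the word, matching the brackets on the board, and that the second list is all the model ever sees.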
Day 1 - Activity: tokenize by hand (25 min)
Worksheet Part 1. Students predict the token split of five sentences, then check with a tokenizer and score themselves. Easy and fun - the surprises (e.g., "ChatGPT" splits into more than one token, and a space is sometimes its own token) drive the lesson home.
Day 1 - Mini-lesson: embeddings (10 min)
Draw a 2D coordinate grid on the board. Plot:
- "king" at (8, 9)
- "queen" at (9, 9)
- "man" at (8, 1)
- "woman" at (9, 1)
Note the geometry: king − man = queen − woman. The directions in the embedding space encode meaning. Real embeddings live in 768, 1536, 4096+ dimensions, but the principle is identical.
Land it: "Words that mean similar things sit near each other on the map. The model does math on positions, not on words."
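The whiteboard geometry can be checked in code. The sketch below uses exactly the four coordinates plotted above; the helper names are made up for illustration, and real embeddings are learned and have hundreds or thousands of dimensions rather than two.

```python
import math

# Coordinates copied from the whiteboard grid.
EMBEDDINGS = {
    "king": (8, 9),
    "queen": (9, 9),
    "man": (8, 1),
    "woman": (9, 1),
}

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def nearest(point, embeddings):
    """Return the word whose position is closest to the given point."""
    return min(embeddings, key=lambda word: math.dist(point, embeddings[word]))

# king - man + woman: take the "royalty" direction and start from "woman".
target = add(sub(EMBEDDINGS["king"], EMBEDDINGS["man"]), EMBEDDINGS["woman"])
print(target, nearest(target, EMBEDDINGS))
```

On this toy grid the arithmetic lands exactly on "queen"; with real embeddings it only lands nearby, which is why the board shows ≈ rather than =.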
Pacing - Day 2 (60 min): Attention, transformers, context
| Time | Segment | Notes |
| ----------- | ----------------------------- | ------------------------------------------ |
| 0:00 – 0:20 | Mini-lesson - attention | Highlighter analogy. |
| 0:20 – 0:35 | Activity - manual attention | Students "be" the model on a paragraph. |
| 0:35 – 0:50 | Mini-lesson - context windows | Why "the model forgot" is the wrong frame. |
| 0:50 – 1:00 | Discussion | What does this imply for prompting? |
Day 2 - Mini-lesson: attention (20 min)
Show a sentence on the board:
"The trophy didn't fit in the brown suitcase because it was too big."
Ask: "What does 'it' refer to?" (The trophy.) Now change one word:
"The trophy didn't fit in the brown suitcase because it was too small."
Now 'it' is the suitcase. Same sentence structure, opposite reference. Attention is the mechanism that lets the model figure this out. At each step, the model looks back at every previous token and decides which ones matter most. Once it has read "too big," "trophy" gets a heavy weight for "it." Once it has read "too small," "suitcase" does.
The highlighter analogy: imagine reading a sentence with five different colored highlighters. Each "head" of attention highlights different relationships - subject/verb, pronoun/referent, modifier/noun. The model learned to do this from examples. Nobody told it the rules of grammar.
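For teachers curious about what "weighting earlier tokens" looks like numerically, here is a sketch of a single attention head for a single query, using the standard scaled dot-product form. Every vector below is invented for illustration; in a real model the queries and keys come from learned projections, and there are many heads and layers.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product scores between one query and each earlier key."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)

# Invented 2-D key vectors for three earlier tokens.
tokens = ["trophy", "suitcase", "brown"]
keys = [(1.0, 0.0), (0.0, 1.0), (0.1, 0.1)]

# Hypothetical queries for "it" in each version of the sentence.
query_big = (3.0, 0.0)    # "... because it was too big"
query_small = (0.0, 3.0)  # "... because it was too small"

for name, query in (("big", query_big), ("small", query_small)):
    weights = attention_weights(query, keys)
    print(name, dict(zip(tokens, (round(w, 2) for w in weights))))
```

The weights always sum to 1, so attention is a budget: giving "trophy" more highlighter ink means giving every other token less.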
Day 2 - Activity: manual attention (15 min)
Pairs get a paragraph and a fill-in-the-blank version. They mark, with highlighters, which earlier words they used to fill each blank. They've just done attention by hand. Compare across pairs - different students attend differently, and so do different attention heads.
Day 2 - Mini-lesson: context windows (15 min)
Define context window. Common sizes circa 2025–2026:
- GPT-4-class chat: 128K tokens (~96K words, ~300 pages)
- Claude-class: 200K tokens (~150K words)
- Gemini Pro: up to 2M tokens
- A typical local 7B model: 8K–32K tokens
Anything outside the window does not exist to the model. When ChatGPT "forgets" what you said an hour ago, it didn't forget. The earlier tokens fell off the front of the window. Memory features and RAG (Unit 7) are workarounds.
Land it: "The model doesn't have memory. It has a clipboard. Whatever's on the clipboard is its entire universe."
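The clipboard idea, and the ¾-word rule of thumb from the vocabulary list, can both be sketched in a few lines. This assumes the simplest possible policy, keeping only the most recent tokens; real chat products may trim or summarize differently. The function names and the example conversation are hypothetical.

```python
def approx_words(token_count):
    """Rule of thumb from the vocabulary list: one token is about 3/4 of a word."""
    return int(token_count * 0.75)

def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens; everything earlier is invisible."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]

print(approx_words(128_000))  # word estimate for a 128K window
print(approx_words(200_000))  # word estimate for a 200K window

# A toy conversation with a 6-token window.
conversation = ["my", "name", "is", "Ada", ".", "what", "is", "my", "name", "?"]
visible = fit_to_window(conversation, 6)
print(visible)
print("Ada" in visible)  # the name has fallen off the front of the window
```

The model is asked for a name it can no longer see, so from its side nothing was forgotten: the name was never on the clipboard.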
Pacing - Day 3 (60 min): Training, alignment, hallucination
| Time | Segment | Notes |
| ----------- | -------------------------------- | ------------------------------------------- |
| 0:00 – 0:20 | Mini-lesson - training pipeline | Pretraining → instruction tuning → RLHF. |
| 0:20 – 0:40 | Mini-lesson - why hallucinations | Three structural causes + one data cause. |
| 0:40 – 0:55 | Activity - hallucination hunt | Pairs catch a hallucination in a live demo. |
| 0:55 – 1:00 | Quiz cooldown / exit ticket | Student reflection. |
Day 3 - Mini-lesson: the training pipeline (20 min)
Three stages. Pretraining is by far the most expensive; the later stages cost far less but matter most for usability:
- Pretraining (~$10M–$1B per run). Train the base model on the next-token-prediction task across hundreds of billions to trillions of tokens of text. The result is a base model: knowledgeable, but it will autocomplete a question rather than answer it.
- Instruction tuning (~$10K–$1M). Fine-tune on (instruction, ideal response) pairs. Now the model answers when asked.
- RLHF or similar alignment (~$100K–$10M). Humans rank model outputs; the model learns the ranking. This is what makes the model polite, helpful, and safe-ish.
Land it: "The model you talk to is not the raw model. It's the raw model wearing two jackets: instruction tuning and RLHF. Sometimes those jackets slip - that's what a 'jailbreak' is."
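"Humans rank model outputs; the model learns the ranking" can be sketched as a pairwise preference loss. This is one common formulation (a Bradley-Terry style loss, often used when training a reward model), offered as an assumption: this unit does not specify which method any given vendor uses, and the scores below are invented.

```python
import math

def preference_loss(score_preferred, score_rejected):
    """Small when the human-preferred answer scores higher, large otherwise."""
    margin = score_preferred - score_rejected
    probability_correct = 1.0 / (1.0 + math.exp(-margin))  # sigmoid of the margin
    return -math.log(probability_correct)

# Hypothetical scores a model assigns to two candidate answers.
print(round(preference_loss(2.0, 0.0), 3))  # agrees with the human ranking
print(round(preference_loss(0.0, 2.0), 3))  # disagrees with the human ranking
```

Training nudges the scores to shrink this loss across many ranked pairs, which is the sense in which the ranking is "learned."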
Day 3 - Mini-lesson: why hallucinations (20 min)
Three structural reasons:
- Objective mismatch. Training rewards "plausible next token," not "true next token." The model has no internal "I don't know" signal that's separate from "I have low probability for any specific answer."
- Pattern completion. If you ask for "three peer-reviewed studies on X," the form of the answer (Author et al., Year, Title, Journal) is so well-learned that the model fills it in even when it has no actual studies in its training data on X.
- Compression. The model has compressed billions of pages of text into hundreds of billions of parameters. Some details get smeared together. The model can't tell which.
Plus one data reason:
- Training data has lies in it too. The internet is full of confident wrong answers. The model learned to produce confident wrong answers.
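The objective-mismatch point can be shown with two made-up next-token distributions. The sketch below assumes greedy decoding (always pick the most likely token) and uses entropy as a rough measure of how spread out the model's guess is; all probabilities are invented for illustration.

```python
import math

def entropy_bits(probs):
    """Higher entropy means the probability is spread more evenly."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def pick_next(probs):
    """Greedy decoding: return the single most likely token."""
    return max(probs, key=probs.get)

# Hypothetical distributions for a well-known fact and an obscure one.
confident = {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01}
guessing = {"1987": 0.26, "1991": 0.25, "1979": 0.25, "1994": 0.24}

for name, probs in (("confident", confident), ("guessing", guessing)):
    print(name, pick_next(probs), round(entropy_bits(probs), 2))
```

Both cases return one clean answer. The near coin-flip behind "1987" is visible only in the numbers, not in the text the student reads.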
The takeaway for prompt design (Unit 3): never ask for facts the model would have no reason to know precisely. Ask for explanations, structures, drafts, reformulations - things where "plausible" is "useful."
Day 3 - Activity: hallucination hunt (15 min)
Demo: ask the model for citations or specific dates on an obscure topic. Read the answer. Have students Google-verify each citation in real time. Score how many were real, fabricated, or partial. Most classes will land on 0–30% real for a sufficiently obscure topic. This is the unit's punchline.
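For classes that want to tally the hunt on a projector instead of on paper, a small scoring helper is enough. The labels match the three categories above; the function name and the sample verdicts are hypothetical and are not real classroom results.

```python
from collections import Counter

LABELS = ("real", "partial", "fabricated")

def score_hunt(verdicts):
    """Return the percentage of citations in each category."""
    total = len(verdicts)
    if total == 0:
        return {label: 0 for label in LABELS}
    counts = Counter(verdicts)
    return {label: round(100 * counts[label] / total) for label in LABELS}

# Hypothetical verdicts from one pair of students, one per citation checked.
verdicts = ["fabricated", "real", "fabricated", "partial", "fabricated"]
print(score_hunt(verdicts))
```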
Differentiation, IEP, and 504 supports
- Read-aloud students: the worksheet is screen-reader-friendly. The whiteboard tokenizer demo can be replicated on a laptop with audio.
- EL students: the tokenizer demo is unusually powerful for EL students - show how their home language tokenizes differently from English. This builds intuition without requiring English fluency.
- Math-anxious students: the embedding lesson is described in geometric terms (positions on a map), not algebraic. Resist the urge to write equations on the board.
Assessment & evidence
- Formative: tokenization activity, manual attention exercise, hallucination hunt scoring sheet.
- Summative: end-of-unit quiz (12 questions, included with paid editions).
What's next
Unit 3 is the unit students remember from this course for the rest of their lives - Prompt Engineering Fundamentals. Now that they know what an LLM is and how it works, they get to learn how to drive it.
