LLMs in 10 Minutes — Chapter I

A Non-Technical Primer for Legal Professionals. What an LLM is, how it processes tokens, what training data means, and three essential questions before using AI output in legal work.


What is a Large Language Model?

A Large Language Model (LLM) is a type of AI trained on enormous quantities of text — books, websites, legal databases, academic papers, and more. During training, it learns statistical patterns: which words tend to follow which other words, in what contexts, in what combinations.

It does not understand language the way a human does. It predicts it.

Key concept: An LLM is a very sophisticated autocomplete engine. When you type a prompt, it generates a statistically likely continuation of that text. On its own, it does not retrieve information from a database, look anything up, or apply reasoning the way a trained professional does.
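For readers who want to see the autocomplete idea in miniature, the following Python sketch counts which word follows which in a few made-up sentences and then "predicts" the next word from those counts. It is a toy under stated assumptions, not how a real LLM is built: the training text and the function names (build_bigram_counts, predict_next) are invented for illustration, and a real model works on tokens, uses far more context than one preceding word, and does not always pick the single most frequent option.

```python
from collections import Counter, defaultdict

# Tiny made-up "training text"; a real model learns from vastly more text.
TRAINING_TEXT = (
    "the contract is void . "
    "the contract is enforceable . "
    "the contract is enforceable . "
    "the clause is unenforceable ."
)

def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = build_bigram_counts(TRAINING_TEXT)
print(predict_next(counts, "is"))        # enforceable (seen twice after "is")
print(predict_next(counts, "contract"))  # is
print(predict_next(counts, "statute"))   # None: the word never appeared in training
```

Note that the sketch answers "enforceable" only because that word followed "is" most often in its training text, not because it assessed any contract.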


The Token, Not the Word

LLMs do not process words; they process tokens. A token is roughly a word-fragment: for example, “unenforceable” might be split into “un”, “enforce”, and “able”. The model predicts each next token in sequence.

This matters to you because:

  • Long prompts may be truncated if they exceed the model’s context window (its short-term memory).
  • Unusual legal terms may be poorly handled if they were rare in the training data.
  • Every instruction, including formatting instructions, consumes tokens, so keep your prompts focused.
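The points above can be made concrete with a small Python sketch. It splits text into word-fragments using a hand-written vocabulary and then cuts the result off at a fixed token limit. Everything here is an assumption made for illustration: the vocabulary, the function names, and the choice to drop whatever does not fit. Real tokenizers learn their vocabulary from data, and real tools differ in how they handle text that exceeds the context window.

```python
# Hypothetical vocabulary of word-fragments; real tokenizers learn theirs from data.
VOCAB = ["un", "enforce", "able", "the", "clause", "is"]

def tokenize_word(word, vocab):
    """Greedy longest-match split of one word into known fragments.
    Characters that match no fragment become single-character tokens."""
    tokens = []
    i = 0
    while i < len(word):
        match = None
        # Try the longest possible fragment first, then shorter ones.
        for end in range(len(word), i, -1):
            if word[i:end] in vocab:
                match = word[i:end]
                break
        if match is None:
            match = word[i]  # unknown character: fall back to one character
        tokens.append(match)
        i += len(match)
    return tokens

def tokenize(text, vocab):
    """Lower-case the text, split on spaces, and tokenize each word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(tokenize_word(word, vocab))
    return tokens

def fit_to_context_window(tokens, max_tokens):
    """Keep only the first max_tokens tokens; anything after that is dropped."""
    return tokens[:max_tokens]

tokens = tokenize("The clause is unenforceable", VOCAB)
print(tokens)                            # ['the', 'clause', 'is', 'un', 'enforce', 'able']
print(fit_to_context_window(tokens, 4))  # ['the', 'clause', 'is', 'un']
print(tokenize_word("estoppel", VOCAB))  # a term outside the vocabulary breaks into 8 single letters
```

Four words became six tokens, the four-token limit cut the sentence off mid-word, and the term the vocabulary had never seen cost eight tokens on its own.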

What Training Data Means

The model was trained on text that existed up to a certain date — its knowledge cutoff. It has no access to:

  • Legislation amended after that date
  • Cases decided after that date
  • Your client’s documents, unless you paste them in
  • Internal firm knowledge or your jurisdiction’s unreported decisions
  • Any live database, court portal, or registry

Think of the model as a very well-read junior who finished their reading six to eighteen months ago, has not practised, and cannot check anything. Fluent. Impressive. Needs supervision.


Three Questions Before You Use AI Output

Apply these before using any AI-generated content in legal work:

  1. Is this output fact-sensitive? If yes: verify every specific claim independently.
  2. Does this rely on recent law? If yes: check the current version of the statute or rule.
  3. Is this output going to a client or a court? If yes: full professional review required — not a skim.
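The three questions can also be written as a simple checklist routine. The Python sketch below is only an illustration of the decision logic stated above; the function name required_checks and its parameter names are invented here, and the routine does nothing more than list the actions the three questions call for.

```python
def required_checks(fact_sensitive, relies_on_recent_law, going_to_client_or_court):
    """Map the three yes/no answers to the actions listed in the text."""
    checks = []
    if fact_sensitive:
        checks.append("Verify every specific claim independently")
    if relies_on_recent_law:
        checks.append("Check the current version of the statute or rule")
    if going_to_client_or_court:
        checks.append("Full professional review, not a skim")
    # An empty list only means none of the three triggers applied.
    return checks

# Example: a fact-sensitive draft that will be sent to a client.
for check in required_checks(True, False, True):
    print("-", check)
```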

Legitimate Uses vs. High-Risk Uses

Good uses of LLMs:

  • Drafting first versions of standard letters
  • Explaining legal concepts in plain English
  • Summarising long documents you have read
  • Generating a list of issues to consider
  • Improving the clarity of your own drafts

High-risk uses of LLMs:

  • Citing cases or statutes without verification
  • Advising on specific facts without review
  • Processing confidential client data on free tools
  • Treating AI output as a finished work product
  • Using AI for advice in fast-moving legal areas

Quick Reference Glossary

Term: Plain English meaning

  • LLM: Large Language Model, the underlying AI technology
  • Token: A chunk of text the model processes (roughly a word-fragment)
  • Context window: How much text the model can hold in working memory at once
  • Training cutoff (knowledge cutoff): The date beyond which the model has no knowledge
  • Prompt: The instruction or question you give the model
  • Hallucination: A confident, fluent, but factually incorrect output

Part of the AI Foundations for Lawyers series — Chapter I.