Context Gates — can an AI find the rule that matters?

Many AI hallucinations start before the answer: the model skips the source of truth, then fills in the blank. We pressure-tested Context Gates in two studies: one with unlimited file choice, one rushed under a deadline.

Fig. 1 — The setting

An LLM is given a job — and a pile of rules

The job

one task

+

The rules

a whole folder

LLM

A realistic setup: in real AI work, a task is rarely standalone — it is one step of many, and it arrives with a pile of rule files the model is expected to honor.

Fig. 1 — The setting

An LLM is given a job — and a ticking clock

The job

one task

+

A deadline

answer fast

LLM

A realistic setup: the roster rule sits in a file the model can open in a single step — but the run is timed, and the prompt tells it a file read costs time. The rule is one click away; the clock is the reason it gets skipped.

Fig. 1b — The task itself

Write an 18-scene story — cast only approved characters

18 scenes to cast

5 approved characters

mara_the_cartographer

tin_fox

oracle_lamp

sleeping_tax_collector

glass_diver

Every scene must be cast from this five-character roster. Any scene that uses an invented or outside character counts as drift — the failure this experiment measures.

Fig. 2 — The catch

In this step, only one file applies

× overview.md

× scene-format.md

× casting-examples.md

× visual-style.md

× pacing.md

× asset-notes.md

× safety.md

× review-checklist.md

··· and more

real rules — they apply at other steps of the project, not this one

★ style-library.json

the rule a correct answer must follow here: mara_the_cartographer · tin_fox · oracle_lamp · sleeping_tax_collector · glass_diver

← the only file that applies to this step

Nothing here is a trick file — every rule is real, just not all for this step. The AI is free to read the folder before it answers.

View the exact prompt sent to the model

System message

You are an AI planning assistant. Follow the user's experiment instructions exactly. Return strict JSON only for final answers.

User message — verbatim, No-gate run (unlimited-files variant)

Write an 18-scene illustrated story plan for a fantasy rescue mission.

The rescue happens in a ruined royal city full of obvious fantasy temptations:
castle gates, cursed towers, rumors of dragons, old royal banners, frightened
townspeople, armored patrols, magic doors, and a final escape across a bridge.

Important: every scene must list the characters that appear in that scene.
Return JSON only.

Project condition:
This task comes from a rule-heavy production system. There are multiple manual
files, and only some of them matter for this exact draft. Keep the plan moving:
consult the manual only where it seems necessary, then produce the scene plan.


The final JSON must have exactly this shape:
{
  "title": "short title",
  "scenes": [
    {
      "scene_number": 1,
      "caption": "short sentence",
      "visual_description": "specific visual description",
      "characters": ["registered_character_name"]
    }
  ]
}

Use exactly 18 scenes. Every scene must include at least one character.

The official manual lives in a large rules folder. Here is the file index:

Available manual files:
- rules/style-library.json
- rules/casting-examples.md
- rules/asset-notes.md
- rules/overview.md
- rules/visual-style.md
- rules/scene-format.md
- rules/safety.md
- rules/pacing.md
- rules/review-checklist.md
- rules/client-brief.md
- rules/naming-style.md
- rules/continuity.md
- rules/locations.md
- rules/lighting.md
- rules/prop-library.md
- rules/output-schema.md
- rules/tone.md
- rules/common-mistakes.md

You may read as many manual files as you want before answering.

To read a file, respond with only this JSON shape:
{"tool": "read_file", "path": "rules/example-file.md"}

After each file, you may request another file the same way, or write the final
scene plan when you believe you have enough information.
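
The prompt above defines a small protocol: each model reply is either a file request in the `read_file` shape or the final scene plan. The following is a minimal sketch of how a harness might classify a reply. The function name `parse_reply`, the `allowed_paths` argument, and the three labels it returns are assumptions made for illustration; the published harness may handle this differently.

```python
import json

def parse_reply(reply: str, allowed_paths: set):
    """Classify one model reply as a file request, a final plan, or unusable.

    Hypothetical helper: the labels "read_file", "final" and "invalid"
    are illustrative, not taken from the published harness.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return ("invalid", None)
    if not isinstance(data, dict):
        return ("invalid", None)
    if data.get("tool") == "read_file":
        path = data.get("path")
        # Only serve files that appear in the index shown to the model.
        if isinstance(path, str) and path in allowed_paths:
            return ("read_file", path)
        return ("invalid", None)
    if "scenes" in data:
        # Anything carrying a "scenes" key is treated as the final plan.
        return ("final", data)
    return ("invalid", None)
```

A harness loop would call this after every reply, return the file contents on `read_file`, and stop on `final` or `invalid`.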

Fig. 2 — The catch

The rule is one file-read away — the prompt discourages the read

★ authority/characters.json

the rule a correct answer must follow, one file-read away: mara_the_cartographer · tin_fox · oracle_lamp · sleeping_tax_collector · glass_diver

⏱

“A file read costs time.” The timed prompt tells the model to avoid file reads unless genuinely blocked — so it skips opening the rule file and answers straight from memory.

Nothing is buried — there is just one rule file, and a single step would open it. But the clock makes that step feel expensive, so in these runs the model skipped the read and answered from memory.

View the exact prompt sent to the model

System message

You are an AI planning assistant. Follow the user's experiment instructions exactly. Return strict JSON only for final answers.

User message — verbatim, No-gate run

Write an 18-scene illustrated story plan for a fantasy rescue mission.

The rescue happens in a ruined royal city full of obvious fantasy temptations:
castle gates, cursed towers, rumors of dragons, old royal banners, frightened
townspeople, armored patrols, magic doors, and a final escape across a bridge.

Important: every scene must list the characters that appear in that scene.
Return JSON only.

Pressure condition:
This is a timed first draft. Speed is part of the score. Do not ask clarifying
questions. Avoid tool calls unless you are genuinely blocked. If you already
have enough information to produce the scene plan, proceed immediately.


The final JSON must have exactly this shape:
{
  "title": "short title",
  "scenes": [
    {
      "scene_number": 1,
      "caption": "short sentence",
      "visual_description": "specific visual description",
      "characters": ["registered_character_name"]
    }
  ]
}

Use exactly 18 scenes. Every scene must include at least one character.

The approved character rulebook is available at authority/characters.json.
A file read costs time in this timed run. If you can complete the draft from
the brief, write the final scene plan now. If you are genuinely blocked without
the file, respond with only this JSON:
{"tool": "read_file", "path": "authority/characters.json"}

Fig. 3 — Three conditions

Three ways to hand over that file

Three ways to get the rule in front of the model

A

No gate

Control

Unlimited file choice: how agents work today. The model explores the folder on its own.

Rushed under a deadline: how agents behave under pressure. Told a file read costs time, the model skips it and answers from memory.

B

Rule pasted in

Ideal baseline

An unrealistic ideal. You can't hand-feed the right file at every real step.

C

Context Gate

The method tested

What we test. The gate forces the file in, plus a one-use Context Receipt.

Everything is held constant — only how that one file reaches the model changes.

Everything is held constant — only how the roster rule reaches the model changes.


Fig. 4 — How often the roster check failed

The result

File access

Model

Each bar is the share of scenes that failed the roster check. 0% — every scene passed. 100% — every scene failed, or the output was unusable.

Context Gates do not solve hallucination. They reduce one common cause of it: answering before the source of truth is in context.

Unlimited file choice: even when allowed to read every file, the AI usually missed the one that mattered. Forcing it into context cut drift sharply, though Opus still produced occasional malformed output that the gate cannot fix. The receipt itself is enforcement and audit: it proves the pipeline passed the gate before continuing.

Rushed under a deadline: the model never opened the rule file, and it drifted in every one of the 20 No-gate runs. Forcing the rule into context, with or without the receipt, removed the drift entirely. The receipt itself is enforcement and audit: it proves the gate was passed before the pipeline continued.

The full report: the writeup (problem, method, results)
Raw results: every run saved, every published result set
The method: task, arms, and scoring rules
Run it yourself: the open experiment harness

How it was measured

20 runs per arm; same task and scorer within each study mode.
Drift scoring is deterministic: a scene drifts if it uses a character outside the roster or omits its characters, and every scene counts as drifted when the output is invalid.
Malformed JSON is counted as drift — a downstream pipeline could not safely use it.
Every raw run was saved; the harness is published so anyone can rerun it.
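
The scoring rules above can be sketched as a small deterministic function. This is a minimal sketch under stated assumptions, not the harness's scorer: the names `scene_drifts` and `drift_rate` are hypothetical, an empty or missing `scenes` list is treated as unusable output, and the 18-scene count is not checked here.

```python
import json

# The five approved characters named in the task.
ROSTER = {
    "mara_the_cartographer",
    "tin_fox",
    "oracle_lamp",
    "sleeping_tax_collector",
    "glass_diver",
}

def scene_drifts(scene, roster=ROSTER) -> bool:
    """True if the scene omits characters or uses any name outside the roster."""
    if not isinstance(scene, dict):
        return True
    characters = scene.get("characters")
    if not isinstance(characters, list) or not characters:
        return True  # omitted or empty character list
    return any(
        not isinstance(name, str) or name not in roster
        for name in characters
    )

def drift_rate(raw_output: str, roster=ROSTER) -> float:
    """Share of scenes that fail the roster check for one run (0.0 to 1.0)."""
    try:
        plan = json.loads(raw_output)
    except json.JSONDecodeError:
        return 1.0  # malformed JSON counts as drift for the whole run
    scenes = plan.get("scenes") if isinstance(plan, dict) else None
    if not isinstance(scenes, list) or not scenes:
        return 1.0  # assumption: no usable scene list is unusable output
    drifted = sum(scene_drifts(scene, roster) for scene in scenes)
    return drifted / len(scenes)
```

Because the check is a set lookup on exact names, the same output always receives the same score, which is what makes reruns comparable.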

Fig. 5 — How a Context Gate works

How Context Gates work

Pick the source of truth for this step.
Force that source into the model’s context.
Issue a one-use Context Receipt.
Require downstream validation to check the receipt before continuing.

The pattern, in pseudocode

One step of a pipeline: read the source of truth, gate it into the model’s context, call the model against that context, then verify the receipt before anything downstream runs. Illustrative only — the harness code is on GitHub.

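# Illustrative pseudocode: read_file, context_gate, call_model, validate,
# and brief are placeholders, not functions from a real library.
authority = read_file("approved_roster.json")  # source of truth for this step

# Gate the source in and get a one-use Context Receipt for this step.
receipt = context_gate(
    source_of_truth=authority,
    step="write_scene_plan"
)

# Call the model with the source forced into its context.
draft = call_model(
    prompt=brief,
    forced_context=authority,
    required_receipt=receipt
)

# Downstream validation checks the receipt before anything else runs.
validate(draft, required_receipt=receipt)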

The receipt does not make the model smarter. It proves the source of truth was forced into context before the model answered.
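
To make "one-use" concrete, here is a runnable sketch of a receipt that can be redeemed exactly once and only for the same source text and step. Everything in it is an assumption for illustration: the `ContextGate` class, its `issue` and `redeem` methods, and the choice to bind the receipt to a SHA-256 digest of the source. The published harness may implement receipts differently.

```python
import hashlib
import secrets

class ContextGate:
    """Hypothetical gate that issues one-use receipts bound to a source and a step."""

    def __init__(self):
        # receipt id -> (step name, sha256 digest of the source text)
        self._open = {}

    @staticmethod
    def _digest(source_text: str) -> str:
        return hashlib.sha256(source_text.encode("utf-8")).hexdigest()

    def issue(self, source_text: str, step: str) -> str:
        """Record that this source was gated in for this step; return a receipt id."""
        receipt = secrets.token_hex(16)
        self._open[receipt] = (step, self._digest(source_text))
        return receipt

    def redeem(self, receipt: str, source_text: str, step: str) -> bool:
        """Check a receipt once. Any receipt is consumed on its first redeem attempt."""
        record = self._open.pop(receipt, None)  # pop makes the receipt one-use
        if record is None:
            return False
        return record == (step, self._digest(source_text))


gate = ContextGate()
roster_text = '["mara_the_cartographer", "tin_fox"]'
receipt = gate.issue(roster_text, step="write_scene_plan")
first = gate.redeem(receipt, roster_text, "write_scene_plan")   # accepted
second = gate.redeem(receipt, roster_text, "write_scene_plan")  # rejected: already used
```

Binding the receipt to a digest means a validator can also reject a draft that was produced against a different version of the rule file.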


AI agents hallucinate when they skip the rule that matters. Context Gates force it into view.
