Claude Code remembers things between sessions. That's been true for a while — you can save notes to MEMORY.md, and the next conversation picks up where you left off. The problem is what happens after 50, 100, or 900 sessions. Memory files bloat. Entries contradict each other. Relative dates like "yesterday we switched to Redis" become meaningless a month later. Your AI assistant's long-term memory slowly decays into noise.
Anthropic's answer is AutoDream — a background sub-agent that consolidates Claude Code's memory files while you're not using it. The naming is deliberate: it's modeled on how human brains consolidate memories during REM sleep. And the research it builds on suggests this approach can cut inference costs by 5x.
How AutoDream Works
AutoDream triggers automatically when two conditions are both met: at least 24 hours have passed since the last consolidation, and more than 5 sessions have occurred. Neither condition alone is enough — a single long session over two days won't trigger it, and ten quick sessions in two hours won't either.
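The dual-threshold logic can be sketched as a simple predicate. This is an illustrative reconstruction of the rule described above, not Anthropic's actual implementation (which runs server-side):

```python
from datetime import datetime, timedelta

MIN_HOURS = 24      # minimum time since the last consolidation
MIN_SESSIONS = 5    # more than this many sessions must have occurred

def should_dream(last_consolidation: datetime,
                 sessions_since: int,
                 now: datetime) -> bool:
    """Both thresholds must be met; neither alone is enough."""
    elapsed = now - last_consolidation
    return elapsed >= timedelta(hours=MIN_HOURS) and sessions_since > MIN_SESSIONS
```

Ten quick sessions two hours after the last run fail the time check; a single marathon session spanning two days fails the session count. Only both together trigger a cycle.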
When it activates, it runs a four-phase cycle:
Phase 1 — Orient. The sub-agent scans the memory directory and reads MEMORY.md to map what currently exists. It skims existing topics to avoid creating duplicates later.
Phase 2 — Gather Signal. This is where it gets interesting. AutoDream searches through recent session transcripts (stored as JSONL files), but it doesn't read them exhaustively. It uses targeted grep-style searches looking for specific high-value patterns: user corrections, explicit save requests, recurring themes, and important decisions. The system prompt is explicit about this: "Don't exhaustively read transcripts. Look only for things you already suspect matter." This keeps token costs manageable even with hundreds of sessions.
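A sketch of what those targeted searches could look like over JSONL transcripts. The pattern names, regexes, and transcript schema here are assumptions chosen to mirror the four categories above (corrections, save requests, themes, decisions), not the real system's patterns:

```python
import json
import re
from pathlib import Path

# Hypothetical high-value patterns, one per signal category described above
SIGNALS = {
    "correction": re.compile(r"\b(actually|that's wrong|I meant)\b", re.I),
    "save_request": re.compile(r"\b(remember|save this|note that)\b", re.I),
    "decision": re.compile(r"\b(we decided|let's go with|switching to)\b", re.I),
}

def gather_signal(transcript_dir: Path) -> list[tuple[str, str]]:
    """Scan session transcripts line by line, keeping only matching messages.

    No transcript is read 'exhaustively': lines that match no pattern are
    discarded immediately, which is what keeps token costs bounded.
    """
    hits = []
    for path in sorted(transcript_dir.glob("*.jsonl")):
        for line in path.read_text().splitlines():
            text = json.loads(line).get("text", "")
            for label, pattern in SIGNALS.items():
                if pattern.search(text):
                    hits.append((label, text))
    return hits
```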
Phase 3 — Consolidate. The sub-agent merges new information into existing memory files. It performs three key operations: converting relative dates to absolute ones ("yesterday" becomes "2026-03-28"), deleting facts that have been contradicted by newer information, and merging duplicate entries that say the same thing in different ways.
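The relative-to-absolute date rewrite can be illustrated with a pass like this. It is a minimal sketch handling only "yesterday" and "N days ago"; the real consolidation is LLM-driven rather than regex-driven:

```python
import re
from datetime import date, timedelta

def absolutize_dates(text: str, today: date) -> str:
    """Replace relative date phrases with ISO dates anchored to a known 'today'."""
    text = re.sub(r"\byesterday\b", (today - timedelta(days=1)).isoformat(), text)
    text = re.sub(
        r"\b(\d+) days ago\b",
        lambda m: (today - timedelta(days=int(m.group(1)))).isoformat(),
        text,
    )
    return text
```

Run on 2026-03-29, the entry "yesterday we switched to Redis" becomes "2026-03-28 we switched to Redis", and it stays meaningful no matter when it is read next.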
Phase 4 — Prune and Index. Finally, it updates MEMORY.md to stay under 200 lines. Each index entry gets compressed to one line under 150 characters: `- [Title](file.md) — one-line hook`. Stale pointers are removed, new files get added, and contradictions between the index and actual files are resolved.
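The index constraints translate into invariants you could check mechanically. A small sketch, using only the two limits stated above (the checker itself is hypothetical):

```python
def check_index(memory_md: str) -> list[str]:
    """Validate a MEMORY.md index against the stated limits: 200 lines total,
    150 characters per index entry."""
    problems = []
    lines = memory_md.splitlines()
    if len(lines) > 200:
        problems.append(f"index is {len(lines)} lines (limit 200)")
    for i, line in enumerate(lines, 1):
        if line.startswith("- [") and len(line) > 150:
            problems.append(f"line {i} entry exceeds 150 chars")
    return problems
```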
The entire cycle typically completes in 8–10 minutes, even for projects with 900+ sessions of accumulated memory.
The Sleep-Time Compute Connection
The metaphor isn't just branding. AutoDream's design directly builds on a research paper from UC Berkeley and the Letta team (the MemGPT creators): "Sleep-time Compute: Beyond Inference Scaling at Test-time" (arXiv:2504.13171, April 2025).
The core finding: models that pre-compute context during idle periods can reduce test-time inference costs by a factor of 5 at equal accuracy, with up to 18% accuracy gains on some tasks. The key insight is that not all compute needs to happen when the user is waiting for a response. Some work — like organizing context, resolving contradictions, and building efficient representations — can happen in the background.
AutoDream applies this principle literally. Instead of Claude Code spending tokens every session re-reading bloated memory files and resolving contradictions on the fly, the consolidation happens once during downtime. The next active session starts with clean, organized context that's cheaper and faster to process.
What It Actually Modifies
AutoDream is sandboxed strictly to memory storage. It writes only to memory files in ~/.claude/projects/<project>/memory/. It cannot touch source code, configuration files, tests, or anything else in your project. A lock file prevents concurrent runs on the same project, and the process runs in the background without blocking active sessions.
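The single-runner guarantee is the classic lock-file pattern. Here is one way it might work; the lock filename `.dream.lock` is an assumption, not the documented layout:

```python
import os
from pathlib import Path

def try_acquire_lock(memory_dir: Path) -> bool:
    """Atomically create a lock file; return False if another run holds it."""
    lock = memory_dir / ".dream.lock"  # hypothetical filename
    try:
        # O_EXCL makes creation atomic: exactly one process can win the race
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return True
    except FileExistsError:
        return False

def release_lock(memory_dir: Path) -> None:
    (memory_dir / ".dream.lock").unlink(missing_ok=True)
```

Because the lock lives inside the per-project memory directory, two projects can consolidate concurrently while two runs on the same project cannot.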
The before-and-after difference is concrete. One documented case showed a MEMORY.md with 280+ lines — packed with stale relative dates, contradictory entries about which frameworks were in use, and duplicate notes scattered across files — consolidated down to 142 clean lines with accurate absolute dates and no contradictions.
How to Use It
If you have access (the feature is in gradual rollout as of late March 2026), there are two ways to trigger consolidation:
Automatic: Just keep using Claude Code normally. After 5+ sessions and 24+ hours, AutoDream runs in the background on its own.
Manual: In a Claude Code session, type `dream`, `auto dream`, or `consolidate my memory files`. This bypasses the automatic thresholds and runs immediately.
The server-side configuration looks like this:
```json
{
  "minHours": 24,
  "minSessions": 5,
  "enabled": false
}
```
The `enabled` parameter is managed server-side — you can't flip it locally. If you don't see it in your `/memory` menu yet, you're not in the rollout cohort.
What's Missing
AutoDream solves memory bloat, but it introduces its own questions. The consolidation is opaque — you don't get a diff showing what was changed, deleted, or merged. If AutoDream prunes something you actually needed, you won't know until you notice it's gone. There's no undo and no version history of memory states.
The 24-hour minimum also means that for intense multi-day coding sprints where memory bloat accumulates fastest, you're limited to one consolidation per day. And the feature is still gated behind a server-side flag with no timeline for general availability.
There's also a philosophical tension: AutoDream decides what's "stale" and what matters. It makes judgment calls about contradictions — if you switched from Express to Fastify and back to Express, does it keep Express or Fastify? The system prompt gives heuristics, but edge cases are inevitable.
The Bottom Line
AutoDream is the most concrete implementation yet of an idea that's been floating around AI research for a year: that AI agents need something equivalent to sleep. Not rest, but background consolidation — a dedicated phase where accumulated experience gets compressed into durable, efficient memory. The 5x cost reduction from the Sleep-time Compute paper suggests this isn't just a nice-to-have. As AI agents accumulate longer histories across more sessions, background memory management shifts from convenience feature to architectural necessity.