Claude-Mem

Persistent memory compression for Claude Code: how it works

1 — What Is Claude-Mem?

Claude-Mem is a persistent memory system for Claude Code. A dedicated observer AI watches every session in real time and automatically captures decisions, bugfixes, discoveries, and patterns as structured observations.

The core problem: Each Claude Code session starts with zero memory of what happened before. Claude-Mem solves this by maintaining a searchable database of everything important that happened across all your sessions — and injecting relevant context at the start of each new one. It uses progressive disclosure to stay token-efficient: lightweight summaries first, full details only on demand.

5 Lifecycle Hooks · 6 Observation Types · 3 Search Layers · 10x Token Savings

2 — Three-Part Architecture
Hook Layer (Edge)
Five Claude Code hooks intercept session lifecycle events: SessionStart, UserPromptSubmit, PostToolUse, Stop, and SessionEnd. Each hook calls the Worker API over HTTP. Privacy tags are stripped here, before data reaches storage.
Worker Service (Core)
An HTTP server on port 37777, running on Bun. It handles search, session management, context injection, and observation storage. It manages the SQLite database and the optional Chroma vector store, and serves the web viewer UI at http://localhost:37777.
SDK Observer Agent (AI)
A separate Claude subprocess (via the Agent SDK) that watches your primary session. It receives tool usage events and generates structured XML observations with a type, concepts, a narrative, and facts. It runs in an isolated session directory.
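The privacy filtering done in the hook layer can be pictured as a small pre-processing step. The sketch below is only an illustration, not claude-mem's actual code: the `<private>` tag name and the `strip_private` helper are assumptions, since the text above says only that privacy tags are stripped before data reaches storage.

```python
import re

# Hypothetical tag name: the source does not say which tags count as private.
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span before the text leaves the hook."""
    return PRIVATE_BLOCK.sub("", text)


payload = "API base is /v1 <private>token=abc123</private> and uses JSON"
print(strip_private(payload))
```

Stripping at the edge means the worker, the observer, and the database never see the tagged content at all.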
3 — Example: Building a Signup Page

Let's walk through a concrete example. You ask Claude to "build a signup page with email validation". Here's exactly what claude-mem does behind the scenes, and what happens in your next session.

SESSION 1 Building the signup page
        sequenceDiagram
          participant You as You
          participant CC as Claude Code
          participant H as Hooks
          participant W as Worker
          participant O as Observer AI

          Note over You,O: SESSION START
          H->>W: SessionStart hook
          W-->>CC: "No previous sessions found"

          You->>CC: "Build a signup page with email validation"
          H->>W: UserPromptSubmit
          W->>O: Spawn observer

          CC->>CC: Read src/app/layout.tsx
          H->>W: PostToolUse (Read)
          W->>O: Feed event

          CC->>CC: Write src/app/signup/page.tsx
          H->>W: PostToolUse (Write)
          W->>O: Feed event
          Note over O: Observation: FEATURE<br/>"Created signup page component<br/>with form and email input"

          CC->>CC: Write src/lib/validators.ts
          H->>W: PostToolUse (Write)
          W->>O: Feed event
          Note over O: Observation: DECISION<br/>"Used Zod for email validation<br/>over regex for type safety"

          CC->>CC: Run npm test (fails)
          H->>W: PostToolUse (Bash)
          W->>O: Feed event

          CC->>CC: Edit src/lib/validators.ts
          H->>W: PostToolUse (Edit)
          W->>O: Feed event
          Note over O: Observation: BUGFIX<br/>"Fixed Zod schema - email()<br/>must come before min()"

          CC->>CC: Run npm test (passes)
          H->>W: PostToolUse (Bash)

          Note over You,O: SESSION END
          H->>W: Stop hook
          W->>O: Request summary
          O-->>W: Session summary XML
          W->>W: Store 3 observations + 1 summary

Session 1: The observer captures 3 observations as you build

STORED What the observer captured
Feature: Created signup page with email form (obs-a1b2c)
Built a Next.js signup page at src/app/signup/page.tsx with email input, password fields, and form submission handler. Uses React Hook Form for state management and server actions for the API call.
concepts: what-changed, pattern
files: +page.tsx layout.tsx

Decision: Chose Zod over regex for email validation (obs-d4e5f)
Used Zod schema validation instead of raw regex for email validation. Zod provides type inference, composable schemas, and better error messages. The schema is shared between client-side form validation and the server action, ensuring consistency.
concepts: trade-off, why-it-exists
files: +validators.ts

Bugfix: Fixed Zod schema method ordering (obs-g7h8i)
z.string().min(1).email() fails because .email() must come before .min() in the chain. Zod validates in chain order; the email check was being skipped for empty strings.
concepts: gotcha, problem-solution
files: ~validators.ts

next day, new session
SESSION 2 Adding OAuth to the signup page

You start a new Claude Code session and ask "add Google OAuth to the signup page". Before Claude even sees your prompt, the SessionStart hook fires and injects this context:

[myapp] recent context, 2026-03-09 9:15am
1 session • 3 observations • saved ~2,400 tokens (reading: 180 tok vs discovering: 2,580 tok)
YESTERDAY 3:42 PM
Feature Created signup page with email form ~50 tok
Decision Chose Zod over regex for email validation ~50 tok
Bugfix Fixed Zod schema method ordering ~50 tok
LAST SESSION SUMMARY
Built signup page at src/app/signup/page.tsx using React Hook Form + Zod validation. Email validation uses Zod's .email().min(1) chain (order matters — gotcha fixed). Server actions handle form submission.
What just happened: Claude now knows about the signup page, the Zod validation pattern, and the method-ordering gotcha — without reading a single file. It cost only ~180 tokens of context injection versus the ~2,580 tokens of work Claude spent discovering all of this yesterday. When Claude adds Google OAuth, it will extend the existing Zod schema correctly (email before min) and follow the React Hook Form + server actions pattern already established.
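The token figures in the injected header follow from simple arithmetic. As a minimal sketch (the `token_savings` helper is hypothetical, not part of claude-mem), the numbers from this example work out as follows:

```python
def token_savings(reading_tokens: int, discovery_tokens: int) -> dict:
    """Compare the cost of injected context with the cost of rediscovering it."""
    saved = discovery_tokens - reading_tokens
    return {
        "saved": saved,
        "percent_saved": round(100 * saved / discovery_tokens),
        "ratio": round(discovery_tokens / reading_tokens, 1),
    }


# Figures from the Session 2 header: reading 180 tok vs discovering 2,580 tok.
stats = token_savings(reading_tokens=180, discovery_tokens=2580)
print(stats)
```

For this session the injected context costs about one fourteenth of the original discovery work.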
4 — Hook Pipeline

Claude-Mem installs 5 hooks into Claude Code. Each hook fires at a specific point in the session lifecycle, calling the Worker HTTP API to trigger actions. Exit code 0 = silent success, exit code 2 = show error to Claude.

HOOK 1: SessionStart. Start worker if needed. Inject past context via ContextBuilder.
HOOK 2: UserPromptSubmit. Create session record. Spawn SDK observer subprocess.
HOOK 3: PostToolUse. Capture every tool call. Feed to observer agent. Store observation.
HOOK 4: Stop. Trigger session summary generation from observer.
HOOK 5: SessionEnd. Mark session complete. Clean up state. Sync to Chroma.
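The exit-code convention described above can be sketched as a thin wrapper around whatever a hook does. This is an illustration under assumptions, not the plugin's real hook code: `run_hook`, `ok`, and `worker_down` are hypothetical names, and writing the error message to stderr is assumed to be how the error reaches Claude.

```python
import sys
from typing import Callable


def run_hook(action: Callable[[], None]) -> int:
    """Run one hook action and translate the outcome into an exit code."""
    try:
        action()
    except Exception as exc:
        # Exit code 2: the error should be shown to Claude.
        print(f"claude-mem hook failed: {exc}", file=sys.stderr)
        return 2
    # Exit code 0: silent success.
    return 0


def ok() -> None:
    pass


def worker_down() -> None:
    raise ConnectionError("worker not reachable on port 37777")


print(run_hook(ok), run_hook(worker_down))
```

A real hook script would pass the returned value to `sys.exit()` so that Claude Code sees it as the process exit status.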
5 — The Observer Agent
        sequenceDiagram
          participant U as Your Claude Session
          participant H as Hook Layer
          participant W as Worker :37777
          participant O as Observer Agent
          participant DB as SQLite + Chroma

          U->>H: User sends prompt
          H->>W: POST /sessions/init
          W->>O: Spawn SDK subprocess
          Note over O: Separate Claude instance<br/>watching your session

          U->>U: Uses Read tool
          H->>W: POST /sessions/observations
          W->>O: Feed tool event
          O->>O: Analyze + generate XML

          U->>U: Uses Edit tool
          H->>W: POST /sessions/observations
          W->>O: Feed tool event
          O-->>W: XML observation response
          W->>DB: Store observation

          U->>U: Session ends
          H->>W: POST /sessions/summarize
          W->>O: Request summary
          O-->>W: XML summary
          W->>DB: Store summary

The observer runs as a separate Claude subprocess, watching your primary session in real time

Two Session IDs: Each session has a contentSessionId (your Claude Code conversation) and a memorySessionId (the SDK observer's internal session). The observer uses resume to maintain context across multiple prompts within the same session.
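One way to picture the two IDs is as a small record whose memory side is filled in once the observer has started. The sketch below is hypothetical: the `SessionRecord` class and the shape of the options dictionary are assumptions, and only the two field names and the use of resume come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    content_session_id: str                  # your Claude Code conversation
    memory_session_id: Optional[str] = None  # set once the observer has started

    def observer_options(self) -> dict:
        """Options for the next observer call: resume if a memory session exists."""
        if self.memory_session_id is None:
            return {}
        return {"resume": self.memory_session_id}


session = SessionRecord(content_session_id="cc-123")
print(session.observer_options())   # first prompt: nothing to resume yet
session.memory_session_id = "mem-456"
print(session.observer_options())   # later prompts resume the observer session
```

Keeping the IDs separate lets the observer carry its own context across prompts without sharing a conversation with your primary session.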
6 — Observation Types & Concepts

The observer categorizes each observation with exactly one type and 2–5 concepts. Types define what happened; concepts define what kind of knowledge the observation represents.

Bugfix: Something broken, now fixed
Feature: New capability added
Refactor: Code restructured, same behavior
Decision: Architectural choice + rationale
Discovery: Learning about existing code
Change: Generic modification
Knowledge Concepts (2–5 per observation)
Concept | Meaning
how-it-works | Understanding mechanisms and internal behavior
why-it-exists | Purpose, rationale, or motivation behind code
what-changed | Specific modifications made
problem-solution | Issues encountered and their fixes
gotcha | Traps, edge cases, or surprising behavior
pattern | Reusable approaches or conventions
trade-off | Pros and cons of a decision
Observation XML Format
<observation>
  <type>bugfix</type>
  <title>Fix race condition in auth</title>
  <subtitle>Token refresh was firing twice due to missing lock</subtitle>
  <facts>
    <fact>Mutex added to prevent concurrent refresh calls</fact>
  </facts>
  <narrative>Full context: what, how, and why...</narrative>
  <concepts>
    <concept>problem-solution</concept>
    <concept>gotcha</concept>
  </concepts>
  <files_modified><file>src/auth/refresh.ts</file></files_modified>
</observation>
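To show how an observation in this format might be consumed, here is a minimal parsing sketch using Python's standard library. It is not the worker's actual parser: `parse_observation` is a hypothetical helper, and the validation simply applies the "exactly one type, 2–5 concepts" rule stated earlier.

```python
import xml.etree.ElementTree as ET

TYPES = {"bugfix", "feature", "refactor", "decision", "discovery", "change"}

SAMPLE = """
<observation>
  <type>bugfix</type>
  <title>Fix race condition in auth</title>
  <subtitle>Token refresh was firing twice due to missing lock</subtitle>
  <facts>
    <fact>Mutex added to prevent concurrent refresh calls</fact>
  </facts>
  <narrative>Full context: what, how, and why...</narrative>
  <concepts>
    <concept>problem-solution</concept>
    <concept>gotcha</concept>
  </concepts>
  <files_modified><file>src/auth/refresh.ts</file></files_modified>
</observation>
"""


def texts(root: ET.Element, path: str) -> list:
    """Collect the stripped text of every element matching path."""
    return [el.text.strip() for el in root.findall(path) if el.text]


def parse_observation(xml_text: str) -> dict:
    root = ET.fromstring(xml_text.strip())
    obs = {
        "type": root.findtext("type", default="").strip(),
        "title": root.findtext("title", default="").strip(),
        "subtitle": root.findtext("subtitle", default="").strip(),
        "narrative": root.findtext("narrative", default="").strip(),
        "facts": texts(root, "facts/fact"),
        "concepts": texts(root, "concepts/concept"),
        "files_modified": texts(root, "files_modified/file"),
    }
    if obs["type"] not in TYPES:
        raise ValueError(f"unknown observation type: {obs['type']!r}")
    if not 2 <= len(obs["concepts"]) <= 5:
        raise ValueError("expected 2-5 concepts per observation")
    return obs


print(parse_observation(SAMPLE))
```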
7 — Progressive Disclosure

Progressive disclosure is the key to token efficiency. Instead of loading everything into context, claude-mem uses a 3-layer search in which each layer costs more than the one before it: filter first, then fetch details only for what matters. The result is roughly 10x token savings.

1. Index: search()
Compact list of observation IDs, titles, timestamps, and type badges. Scan and decide what's relevant.
~50 tokens/result
2. Timeline: timeline()
Chronological context around interesting observations. What happened before and after. Interleaves observations, summaries, and prompts.
~200 tokens/result
3. Full Details: get_observations()
Complete narrative, facts, concepts, files read/modified. Only fetched for the specific IDs you selected.
~500-1000 tokens/result
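A rough budget calculation shows why the layering pays off. The sketch below uses the approximate per-result costs quoted above, taking 750 as a midpoint for the 500-1000 range. The scenario (20 index rows, one timeline entry, one expanded observation) is purely illustrative, and the real ratio depends on how many results you expand.

```python
# Approximate per-result token costs from the three layers above.
COST = {"search": 50, "timeline": 200, "get_observations": 750}


def progressive_cost(indexed: int, timelines: int, expanded: int) -> int:
    """Tokens spent when only selected results are expanded."""
    return (indexed * COST["search"]
            + timelines * COST["timeline"]
            + expanded * COST["get_observations"])


def eager_cost(total: int) -> int:
    """Tokens spent if every result were loaded in full."""
    return total * COST["get_observations"]


layered = progressive_cost(indexed=20, timelines=1, expanded=1)
eager = eager_cost(20)
print(layered, eager, round(eager / layered, 1))
```

Expanding more observations narrows the gap, which is why the index layer is meant for deciding what is worth fetching.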
8 — Complete Data Flow
        graph TD
          A["New Session Starts"] --> B["SessionStart Hook"]
          B --> C["ContextBuilder"]
          C --> D["Load Recent Observations"]
          D --> E["Calculate Token Economics"]
          E --> F["Inject Context into Session"]

          F --> G["User Sends Prompt"]
          G --> H["UserPromptSubmit Hook"]
          H --> I["Create Session Record"]
          I --> J["Spawn SDK Observer"]

          F --> K["Tool Executes"]
          K --> L["PostToolUse Hook"]
          L --> M["Strip Privacy Tags"]
          M --> N["Feed to Observer"]
          N --> O["Observer Generates XML"]
          O --> P["Parse Observations"]
          P --> Q{"Deduplicate"}
          Q -->|"New"| R["Save to SQLite"]
          R --> S["Sync to Chroma"]
          Q -->|"Duplicate"| T["Skip"]

          F --> U["Session Ends"]
          U --> V["Stop Hook"]
          V --> W["Generate Summary"]
          W --> X["Store Summary"]
          X --> Y["Mark Session Complete"]
          Y --> Z["Ready for Next Session"]

          classDef start fill:#a78bfa22,stroke:#a78bfa,stroke-width:2px
          classDef hook fill:#22d3ee22,stroke:#22d3ee,stroke-width:2px
          classDef ai fill:#34d39922,stroke:#34d399,stroke-width:2px
          classDef storage fill:#818cf822,stroke:#818cf8,stroke-width:2px
          classDef finish fill:#fbbf2422,stroke:#fbbf24,stroke-width:2px
          classDef skip fill:#fb718522,stroke:#fb7185,stroke-width:1.5px

          class A,G,F start
          class B,H,L,V hook
          class C,D,E,J,N,O,P,W ai
          class I,M,Q,R,S,X storage
          class U,Y,Z finish
          class T skip
      

Complete data flow from session start through observation capture to next session context injection

9 — Storage Layer
SQLite (Primary)
WAL mode, memory-mapped I/O (256 MB), 10k page cache. Tables: observations, sdk_sessions, session_summaries, user_prompts, pending_messages. Indexed by project, type, concept, date, and content hash. Deduplication uses SHA-256 content hashes within 30-second windows.
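The SQLite settings and the deduplication rule can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the project's schema or code: the reduced table, the choice of title plus narrative as the hashed content, and the `open_db` and `store_observation` helpers are all hypothetical.

```python
import hashlib
import os
import sqlite3
import tempfile

WINDOW_SECONDS = 30


def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")     # write-ahead logging
    conn.execute("PRAGMA mmap_size=268435456")  # 256 MB memory-mapped I/O
    conn.execute("PRAGMA cache_size=10000")     # 10k pages
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "id INTEGER PRIMARY KEY, title TEXT, narrative TEXT, "
        "content_hash TEXT, created_at REAL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_hash ON observations(content_hash)")
    return conn


def store_observation(conn: sqlite3.Connection, title: str, narrative: str, now: float) -> bool:
    """Insert unless an identical observation was stored in the last 30 seconds."""
    # Assumption: the hash covers title and narrative; the real hashed fields are not stated.
    digest = hashlib.sha256(f"{title}\n{narrative}".encode("utf-8")).hexdigest()
    duplicate = conn.execute(
        "SELECT 1 FROM observations WHERE content_hash = ? AND created_at > ?",
        (digest, now - WINDOW_SECONDS),
    ).fetchone()
    if duplicate:
        return False
    conn.execute(
        "INSERT INTO observations (title, narrative, content_hash, created_at) "
        "VALUES (?, ?, ?, ?)",
        (title, narrative, digest, now),
    )
    conn.commit()
    return True


db_path = os.path.join(tempfile.mkdtemp(), "claude-mem-sketch.db")
conn = open_db(db_path)
print(store_observation(conn, "Fix auth race", "Added mutex", now=1000.0))  # stored
print(store_observation(conn, "Fix auth race", "Added mutex", now=1010.0))  # inside the window
print(store_observation(conn, "Fix auth race", "Added mutex", now=1045.0))  # outside the window
```

Passing `now` explicitly keeps the window logic easy to test; a real implementation would presumably use the current time.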
Chroma (Optional)
Vector database for semantic search. Embeds observations and summaries for meaning-based retrieval. One collection per project: cm__project_name. Sync is batched (100 docs/call). Search falls back to SQLite if Chroma is unavailable.
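The per-project collection naming and batched sync can be sketched without touching Chroma itself. The helpers below are hypothetical, and how claude-mem normalizes project names before adding the cm__ prefix is not stated, so the name is used as-is here.

```python
from typing import Iterator, List

BATCH_SIZE = 100  # docs per sync call, per the description above


def collection_name(project: str) -> str:
    """Build the per-project collection name using the cm__ prefix."""
    return f"cm__{project}"


def batches(docs: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


docs = [{"id": f"obs-{i}", "text": f"observation {i}"} for i in range(250)]
print(collection_name("myapp"), [len(b) for b in batches(docs)])
```

Each yielded batch would then be handed to a single vector-store write call.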
Database Schema Details
observations — id, memory_session_id, project, type, title, subtitle, narrative, facts (JSON), concepts (JSON), files_read (JSON), files_modified (JSON), prompt_number, discovery_tokens, content_hash, created_at

sdk_sessions — id, content_session_id (unique), memory_session_id (unique, nullable), project, user_prompt, started_at, completed_at, status (active|completed|failed)

session_summaries — id, memory_session_id, project, request, investigated, learned, completed, next_steps, notes, prompt_number, discovery_tokens

Search strategies: SQLite-only (fast metadata), Chroma semantic (meaning-based), Hybrid (metadata filter + semantic ranking)
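The hybrid strategy, a metadata filter followed by semantic ranking, can be sketched in a few lines. This is a toy illustration rather than the actual search code: the two-dimensional embeddings, the in-memory rows, and the `hybrid_search` helper are assumptions standing in for SQLite filtering and Chroma similarity search.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(rows: List[Dict], query_vec: List[float], obs_type: str, limit: int = 2) -> List[str]:
    """Filter on metadata first, then rank the survivors by vector similarity."""
    candidates = [r for r in rows if r["type"] == obs_type]
    ranked = sorted(candidates, key=lambda r: cosine(r["embedding"], query_vec), reverse=True)
    return [r["title"] for r in ranked[:limit]]


rows = [
    {"title": "Chose Zod over regex", "type": "decision", "embedding": [0.9, 0.1]},
    {"title": "Fixed Zod method ordering", "type": "bugfix", "embedding": [0.8, 0.2]},
    {"title": "Chose server actions", "type": "decision", "embedding": [0.1, 0.9]},
]
print(hybrid_search(rows, query_vec=[1.0, 0.0], obs_type="decision"))
```

Filtering before ranking keeps the similarity computation limited to rows that already match the requested type.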
10 — MCP Tools & API
Tool | Purpose | Cost
search | Full-text search with filters (type, date, project, obs_type). Returns compact index with IDs, titles, timestamps. | ~50 tok/item
timeline | Chronological context around an observation or query. Interleaves observations, summaries, and user prompts. | ~200 tok/item
get_observations | Fetch full details by observation IDs (array). Complete narrative, facts, concepts, files. | ~500-1k tok/item
smart_search | AST-based codebase search using tree-sitter. Structural code search across your project. | varies
smart_outline | Folded file structure with symbol signatures. Bodies collapsed, shows file skeleton. | varies
smart_unfold | Expand a specific symbol from an outline. Get full source code of a function/class. | varies
HTTP API Endpoints
Search: /api/search, /api/timeline, /api/decisions, /api/changes, /api/how-it-works

Sessions: /api/sessions/init, /api/sessions/observations, /api/sessions/summarize

Data: /api/observations, /api/observations/batch, /api/summaries, /api/stats, /api/projects

Settings: /api/settings, /api/settings/defaults

Viewer: http://localhost:37777 — real-time web UI with SSE streaming at /api/viewer/stream

Health: /api/health
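As a rough sketch of how a script might talk to these endpoints, the helpers below build request URLs and fetch a response with Python's standard library. Several details are assumptions and not documented above: the `query` parameter name, passing filters as query-string parameters, and JSON as the response format.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:37777"


def build_url(path: str, params: Optional[dict] = None) -> str:
    """Join the worker base URL, an endpoint path, and optional query parameters."""
    query = f"?{urllib.parse.urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


def get_json(path: str, params: Optional[dict] = None, timeout: float = 2.0):
    """GET an endpoint and decode JSON; returns None if the worker is unreachable."""
    try:
        with urllib.request.urlopen(build_url(path, params), timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


print(build_url("/api/health"))
# Hypothetical parameter names, shown only to illustrate URL construction.
print(build_url("/api/search", {"query": "authentication", "type": "decision"}))
```

With the worker running, `get_json("/api/health")` would be the natural first call to confirm that the service is up.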
Configuration Settings
Settings stored at ~/.claude-mem/settings.json

AI Provider: CLAUDE_MEM_PROVIDER — claude (default), gemini (free tier), openrouter
Model: CLAUDE_MEM_MODEL — default: claude-sonnet-4-5
Auth: CLAUDE_MEM_CLAUDE_AUTH_METHOD — cli (subscription billing) or api (API key)

Context injection: CLAUDE_MEM_CONTEXT_OBSERVATIONS (max 50), CLAUDE_MEM_CONTEXT_FULL_COUNT (top 3 get full details), CLAUDE_MEM_CONTEXT_SESSION_COUNT (1 summary shown)

Worker: Port 37777 (configurable), data at ~/.claude-mem/
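A settings loader along these lines would overlay the file on built-in defaults. This is a sketch under assumptions: the flat key/value JSON layout is a guess, the three context values are treated as defaults although the text above only lists them, and `load_settings` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

# Values taken from the settings listed above; the flat JSON layout is assumed.
DEFAULTS = {
    "CLAUDE_MEM_PROVIDER": "claude",
    "CLAUDE_MEM_MODEL": "claude-sonnet-4-5",
    "CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
    "CLAUDE_MEM_CONTEXT_FULL_COUNT": 3,
    "CLAUDE_MEM_CONTEXT_SESSION_COUNT": 1,
}

DEFAULT_PATH = Path.home() / ".claude-mem" / "settings.json"


def load_settings(path: Path = DEFAULT_PATH) -> dict:
    """Overlay values from settings.json on the defaults; a missing file means defaults."""
    settings = dict(DEFAULTS)
    if path.is_file():
        settings.update(json.loads(path.read_text(encoding="utf-8")))
    return settings


# Demo against a temporary file so the real settings are left untouched.
demo = Path(tempfile.mkdtemp()) / "settings.json"
demo.write_text(json.dumps({"CLAUDE_MEM_PROVIDER": "gemini"}), encoding="utf-8")
loaded = load_settings(demo)
print(loaded["CLAUDE_MEM_PROVIDER"], loaded["CLAUDE_MEM_MODEL"])
```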
11 — Quick Start
# Install via Claude Code plugin system
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

# Restart Claude Code - that's it!
# Context from previous sessions will automatically
# appear at the start of new sessions.

# View real-time observations in your browser
open http://localhost:37777

# Search memory from within Claude Code
/mem-search decisions about authentication
How it feels: After a few sessions, you'll notice Claude starts each conversation with context like "[myproject] recent context — 3 sessions, 12 observations" followed by a timeline of what happened before. It remembers your architectural decisions, the bugs you fixed, and the patterns you established. Each observation shows its token cost so you can see exactly what you're paying for context.