Claude-Mem

Persistent memory compression for Claude Code: how it works

1 — What Is Claude-Mem?

Claude-Mem is a persistent memory system for Claude Code. A dedicated observer AI watches every session in real time, automatically capturing decisions, bug fixes, discoveries, and patterns as structured observations.

The core problem: Each Claude Code session starts with zero memory of what happened before. Claude-Mem solves this by maintaining a searchable database of everything important that happened across all your sessions — and injecting relevant context at the start of each new one. It uses progressive disclosure to stay token-efficient: lightweight summaries first, full details only on demand.

5 Lifecycle Hooks · 6 Observation Types · 3 Search Layers · 10x Token Savings

2 — Three-Part Architecture

Hook Layer (Edge)

Five Claude Code hooks intercept session lifecycle events: SessionStart, UserPromptSubmit, PostToolUse, Stop, and SessionEnd. Each hook calls the Worker API over HTTP. Privacy tags are stripped here, before data reaches storage.

Worker Service (Core)

An HTTP server on port 37777, running on Bun. It handles search, session management, context injection, and observation storage, and it manages the SQLite database and the optional Chroma vector store. A web viewer UI is available at http://localhost:37777.

SDK Observer Agent (AI)

A separate Claude subprocess (via the Agent SDK) that watches your primary session. It receives tool-usage events and generates structured XML observations with a type, concepts, a narrative, and facts. It runs in an isolated session directory.
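To make the privacy step in the hook layer concrete, here is a minimal sketch of stripping tagged content before an event leaves the hook. The source does not name the tag, so the `<private>...</private>` form is an assumption, and `build_event` and its field names are hypothetical.

```python
import re

# Assumption: private content is wrapped in <private>...</private>.
# The source only says "privacy tags", so the tag name is illustrative.
PRIVATE_TAG = re.compile(r"<private>.*?</private>", re.DOTALL | re.IGNORECASE)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span, including the tags themselves."""
    return PRIVATE_TAG.sub("", text)


def build_event(tool_name: str, tool_output: str) -> dict:
    """Build a payload a hook might send to the worker (hypothetical field names)."""
    return {"tool": tool_name, "output": strip_private(tool_output)}


event = build_event(
    "Read",
    "API_URL=https://example.test\n<private>API_KEY=abc123</private>\nDEBUG=1",
)
print(event["output"])  # the API_KEY line is gone before anything is stored
```

Stripping at the edge means the worker, the observer, and the database never see the tagged text at all, which is simpler to reason about than filtering at read time.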

3 — Example: Building a Signup Page

Let's walk through a concrete example. You ask Claude to "build a signup page with email validation". Here's what Claude-Mem does behind the scenes, and what happens in your next session.

SESSION 1 Building the signup page

        sequenceDiagram
          participant You as You
          participant CC as Claude Code
          participant H as Hooks
          participant W as Worker
          participant O as Observer AI

          Note over You,O: SESSION START
          H->>W: SessionStart hook
          W-->>CC: "No previous sessions found"

          You->>CC: "Build a signup page with email validation"
          H->>W: UserPromptSubmit
          W->>O: Spawn observer

          CC->>CC: Read src/app/layout.tsx
          H->>W: PostToolUse (Read)
          W->>O: Feed event

          CC->>CC: Write src/app/signup/page.tsx
          H->>W: PostToolUse (Write)
          W->>O: Feed event
          Note over O: Observation: FEATURE<br/>"Created signup page component<br/>with form and email input"

          CC->>CC: Write src/lib/validators.ts
          H->>W: PostToolUse (Write)
          W->>O: Feed event
          Note over O: Observation: DECISION<br/>"Used Zod for email validation<br/>over regex for type safety"

          CC->>CC: Run npm test (fails)
          H->>W: PostToolUse (Bash)
          W->>O: Feed event

          CC->>CC: Edit src/lib/validators.ts
          H->>W: PostToolUse (Edit)
          W->>O: Feed event
          Note over O: Observation: BUGFIX<br/>"Fixed Zod schema - email()<br/>must come before min()"

          CC->>CC: Run npm test (passes)
          H->>W: PostToolUse (Bash)

          Note over You,O: SESSION END
          H->>W: Stop hook
          W->>O: Request summary
          O-->>W: Session summary XML
          W->>W: Store 3 observations + 1 summary

Session 1: The observer captures 3 observations as you build

STORED What the observer captured

Feature: Created signup page with email form (obs-a1b2c)

Built a Next.js signup page at src/app/signup/page.tsx with email input, password fields, and form submission handler. Uses React Hook Form for state management and server actions for the API call.

concepts: what-changed, pattern

files: +page.tsx layout.tsx

Decision: Chose Zod over regex for email validation (obs-d4e5f)

Used Zod schema validation instead of raw regex for email validation. Zod provides type inference, composable schemas, and better error messages. The schema is shared between client-side form validation and the server action, ensuring consistency.

concepts: trade-off, why-it-exists

files: +validators.ts

Bugfix: Fixed Zod schema method ordering (obs-g7h8i)

z.string().min(1).email() fails because .email() must come before .min() in the chain. Zod validates in chain order; the email check was being skipped for empty strings.

concepts: gotcha, problem-solution

files: ~validators.ts

next day, new session

SESSION 2 Adding OAuth to the signup page

You start a new Claude Code session and ask "add Google OAuth to the signup page". Before Claude even sees your prompt, the SessionStart hook fires and injects this context:

[myapp] recent context, 2026-03-09 9:15am

1 session • 3 observations • saved ~2,400 tokens (reading: 180 tok vs discovering: 2,580 tok)

YESTERDAY 3:42 PM

Feature Created signup page with email form ~50 tok

Decision Chose Zod over regex for email validation ~50 tok

Bugfix Fixed Zod schema method ordering ~50 tok

LAST SESSION SUMMARY

Built signup page at src/app/signup/page.tsx using React Hook Form + Zod validation. Email validation uses Zod's .email().min(1) chain (order matters — gotcha fixed). Server actions handle form submission.

What just happened: Claude now knows about the signup page, the Zod validation pattern, and the method-ordering gotcha — without reading a single file. It cost only ~180 tokens of context injection versus the ~2,580 tokens of work Claude spent discovering all of this yesterday. When Claude adds Google OAuth, it will extend the existing Zod schema correctly (email before min) and follow the React Hook Form + server actions pattern already established.
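The savings figure in the injected header is plain arithmetic on the two token counts. A small sketch, assuming savings are computed as discovery cost minus reading cost (the function name is hypothetical; the numbers are the ones from the example above):

```python
def injection_savings(discovery_tokens: int, reading_tokens: int) -> tuple[int, float]:
    """Return (tokens saved, how many times cheaper reading is than rediscovering)."""
    saved = discovery_tokens - reading_tokens
    ratio = discovery_tokens / reading_tokens
    return saved, ratio


saved, ratio = injection_savings(discovery_tokens=2580, reading_tokens=180)
print(f"saved ~{saved:,} tokens ({ratio:.1f}x cheaper to read than to rediscover)")
# prints: saved ~2,400 tokens (14.3x cheaper to read than to rediscover)
```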

4 — Hook Pipeline

Claude-Mem installs 5 hooks into Claude Code. Each hook fires at a specific point in the session lifecycle, calling the Worker HTTP API to trigger actions. Exit code 0 = silent success, exit code 2 = show error to Claude.

HOOK 1: SessionStart. Start worker if needed. Inject past context via ContextBuilder.

HOOK 2: UserPromptSubmit. Create session record. Spawn SDK observer subprocess.

HOOK 3: PostToolUse. Capture every tool call. Feed to observer agent. Store observation.

HOOK 4: Stop. Trigger session summary generation from observer.

HOOK 5: SessionEnd. Mark session complete. Clean up state. Sync to Chroma.
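The exit-code contract described for the hooks can be sketched as a small wrapper. This is an illustration under assumptions, not Claude-Mem's actual hook code: it assumes the hook receives a JSON payload on stdin, and `send` stands in for whatever function posts to the worker.

```python
import json

EXIT_OK = 0          # silent success
EXIT_SHOW_ERROR = 2  # surface the error to Claude


def run_hook(raw_stdin: str, send) -> int:
    """Parse the hook payload, forward it, and map any failure to exit code 2."""
    try:
        payload = json.loads(raw_stdin)
        send(payload)  # hypothetical: would POST the payload to the worker
    except Exception:
        return EXIT_SHOW_ERROR
    return EXIT_OK


received = []
print(run_hook('{"hook_event_name": "PostToolUse"}', received.append))  # 0
print(run_hook("not json", received.append))                            # 2
```

In a real hook script the return value would be passed to `sys.exit`, with `sys.stdin.read()` as the first argument.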

5 — The Observer Agent

        sequenceDiagram
          participant U as Your Claude Session
          participant H as Hook Layer
          participant W as Worker :37777
          participant O as Observer Agent
          participant DB as SQLite + Chroma

          U->>H: User sends prompt
          H->>W: POST /sessions/init
          W->>O: Spawn SDK subprocess
          Note over O: Separate Claude instance<br/>watching your session

          U->>U: Uses Read tool
          H->>W: POST /sessions/observations
          W->>O: Feed tool event
          O->>O: Analyze + generate XML

          U->>U: Uses Edit tool
          H->>W: POST /sessions/observations
          W->>O: Feed tool event
          O-->>W: XML observation response
          W->>DB: Store observation

          U->>U: Session ends
          H->>W: POST /sessions/summarize
          W->>O: Request summary
          O-->>W: XML summary
          W->>DB: Store summary

The observer runs as a separate Claude subprocess, watching your primary session in real-time

Two Session IDs: Each session has a contentSessionId (your Claude Code conversation) and a memorySessionId (the SDK observer's internal session). The observer uses resume to maintain context across multiple prompts within the same session.
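One way to picture the two IDs is a record keyed by the conversation ID, with the observer's ID attached once the subprocess reports it. The class and method names below are hypothetical; only the two ID fields come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SessionRecord:
    content_session_id: str                   # your Claude Code conversation
    memory_session_id: Optional[str] = None   # observer's session, unknown until it starts
    prompt_number: int = 0


class SessionRegistry:
    """In-memory stand-in for the session table (illustrative only)."""

    def __init__(self) -> None:
        self._by_content_id: Dict[str, SessionRecord] = {}

    def on_prompt(self, content_session_id: str) -> SessionRecord:
        """Create the record on the first prompt; reuse it on later prompts."""
        record = self._by_content_id.setdefault(
            content_session_id, SessionRecord(content_session_id)
        )
        record.prompt_number += 1
        return record

    def attach_observer(self, content_session_id: str, memory_session_id: str) -> None:
        """Remember the observer's session ID so later prompts can resume it."""
        self._by_content_id[content_session_id].memory_session_id = memory_session_id


registry = SessionRegistry()
first = registry.on_prompt("content-123")
registry.attach_observer("content-123", "memory-abc")
second = registry.on_prompt("content-123")
print(second.memory_session_id, second.prompt_number)  # memory-abc 2
```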

6 — Observation Types & Concepts

The observer categorizes each observation with exactly one type and 2–5 concepts. Types define what happened; concepts define what kind of knowledge the observation represents.

● Bugfix: Something broken, now fixed
● Feature: New capability added
● Refactor: Code restructured, same behavior
● Decision: Architectural choice + rationale
● Discovery: Learning about existing code
● Change: Generic modification

Knowledge Concepts (2–5 per observation)

Concept	Meaning
`how-it-works`	Understanding mechanisms and internal behavior
`why-it-exists`	Purpose, rationale, or motivation behind code
`what-changed`	Specific modifications made
`problem-solution`	Issues encountered and their fixes
`gotcha`	Traps, edge cases, or surprising behavior
`pattern`	Reusable approaches or conventions
`trade-off`	Pros and cons of a decision

Observation XML Format

<observation>
  <type>bugfix</type>
  <title>Fix race condition in auth</title>
  <subtitle>Token refresh was firing twice due to missing lock</subtitle>
  <facts>
    <fact>Mutex added to prevent concurrent refresh calls</fact>
  </facts>
  <narrative>Full context: what, how, and why...</narrative>
  <concepts>
    <concept>problem-solution</concept>
    <concept>gotcha</concept>
  </concepts>
  <files_modified><file>src/auth/refresh.ts</file></files_modified>
</observation>
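An observation in this format can be turned into a plain dictionary with the standard library. This is a minimal sketch of such a parser, not the worker's own; the function name and the output keys are assumptions, and the sample mirrors the XML above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<observation>
  <type>bugfix</type>
  <title>Fix race condition in auth</title>
  <subtitle>Token refresh was firing twice due to missing lock</subtitle>
  <facts>
    <fact>Mutex added to prevent concurrent refresh calls</fact>
  </facts>
  <narrative>Full context: what, how, and why...</narrative>
  <concepts>
    <concept>problem-solution</concept>
    <concept>gotcha</concept>
  </concepts>
  <files_modified><file>src/auth/refresh.ts</file></files_modified>
</observation>
"""


def parse_observation(xml_text: str) -> dict:
    """Parse one <observation> element into a dict of strings and lists."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "observation":
        raise ValueError(f"expected <observation>, got <{root.tag}>")

    def text(tag: str) -> str:
        return (root.findtext(tag) or "").strip()

    def texts(path: str) -> list:
        return [(el.text or "").strip() for el in root.findall(path)]

    return {
        "type": text("type"),
        "title": text("title"),
        "subtitle": text("subtitle"),
        "narrative": text("narrative"),
        "facts": texts("facts/fact"),
        "concepts": texts("concepts/concept"),
        "files_modified": texts("files_modified/file"),
    }


obs = parse_observation(SAMPLE)
print(obs["type"], obs["concepts"])  # bugfix ['problem-solution', 'gotcha']
```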

7 — Progressive Disclosure

Progressive disclosure is the key to token efficiency. Instead of loading everything into context, Claude-Mem uses a 3-layer search in which each layer is more expensive than the last. Filter first, then fetch details only for what matters. Result: ~10x token savings.

Index

search()

Compact list of observation IDs, titles, timestamps, and type badges. Scan and decide what's relevant.

~50 tokens/result

Timeline

timeline()

Chronological context around interesting observations. What happened before and after. Interleaves observations, summaries, and prompts.

~200 tokens/result

Full Details

get_observations()

Complete narrative, facts, concepts, files read/modified. Only fetched for the specific IDs you selected.

~500-1000 tokens/result
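The per-result costs of the three layers make the trade-off easy to estimate. The sketch below compares a layered lookup with loading full details for every hit; the 750-token figure is simply the midpoint of the 500-1000 range, and the result counts are made up for illustration.

```python
INDEX_TOKENS = 50      # per search() result
TIMELINE_TOKENS = 200  # per timeline() result
FULL_TOKENS = 750      # per get_observations() result (midpoint of 500-1000)


def layered_cost(n_index: int, n_timeline: int, n_full: int) -> int:
    """Tokens spent when scanning the index and expanding only a few hits."""
    return n_index * INDEX_TOKENS + n_timeline * TIMELINE_TOKENS + n_full * FULL_TOKENS


def eager_cost(n_results: int) -> int:
    """Tokens spent when every hit is loaded in full."""
    return n_results * FULL_TOKENS


layered = layered_cost(n_index=20, n_timeline=1, n_full=1)
eager = eager_cost(20)
print(layered, eager)  # 1950 15000
```

The ratio between the two depends on how selective the follow-up calls are: the fewer results that need a timeline or full details, the larger the saving.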

8 — Complete Data Flow

        graph TD
          A["New Session Starts"] --> B["SessionStart Hook"]
          B --> C["ContextBuilder"]
          C --> D["Load Recent Observations"]
          D --> E["Calculate Token Economics"]
          E --> F["Inject Context into Session"]

          F --> G["User Sends Prompt"]
          G --> H["UserPromptSubmit Hook"]
          H --> I["Create Session Record"]
          I --> J["Spawn SDK Observer"]

          F --> K["Tool Executes"]
          K --> L["PostToolUse Hook"]
          L --> M["Strip Privacy Tags"]
          M --> N["Feed to Observer"]
          N --> O["Observer Generates XML"]
          O --> P["Parse Observations"]
          P --> Q{"Deduplicate"}
          Q -->|"New"| R["Save to SQLite"]
          R --> S["Sync to Chroma"]
          Q -->|"Duplicate"| T["Skip"]

          F --> U["Session Ends"]
          U --> V["Stop Hook"]
          V --> W["Generate Summary"]
          W --> X["Store Summary"]
          X --> Y["Mark Session Complete"]
          Y --> Z["Ready for Next Session"]

          classDef start fill:#a78bfa22,stroke:#a78bfa,stroke-width:2px
          classDef hook fill:#22d3ee22,stroke:#22d3ee,stroke-width:2px
          classDef ai fill:#34d39922,stroke:#34d399,stroke-width:2px
          classDef storage fill:#818cf822,stroke:#818cf8,stroke-width:2px
          classDef finish fill:#fbbf2422,stroke:#fbbf24,stroke-width:2px
          classDef skip fill:#fb718522,stroke:#fb7185,stroke-width:1.5px

          class A,G,F start
          class B,H,L,V hook
          class C,D,E,J,N,O,P,W ai
          class I,M,Q,R,S,X storage
          class U,Y,Z finish
          class T skip

Complete data flow from session start through observation capture to next session context injection

9 — Storage Layer

SQLite (Primary)

WAL mode, memory-mapped I/O (256MB), 10k page cache. Tables: observations, sdk_sessions, session_summaries, user_prompts, pending_messages. Indexed by project, type, concept, date, content hash. Deduplication via SHA256 within 30-second windows.
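The SQLite settings and the deduplication rule can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions: the table is cut down to a few columns, and hashing the title plus narrative is a guess at what the content hash covers.

```python
import hashlib
import sqlite3
import time

DEDUP_WINDOW_SECONDS = 30


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")      # an in-memory DB stays in "memory" mode
    conn.execute("PRAGMA mmap_size=268435456")   # 256 MB memory-mapped I/O
    conn.execute("PRAGMA cache_size=10000")      # 10k pages
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "id INTEGER PRIMARY KEY, title TEXT, narrative TEXT, "
        "content_hash TEXT, created_at REAL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_obs_hash "
        "ON observations (content_hash, created_at)"
    )
    return conn


def content_hash(title: str, narrative: str) -> str:
    """SHA-256 over the fields assumed to define 'the same observation'."""
    return hashlib.sha256(f"{title}\n{narrative}".encode("utf-8")).hexdigest()


def save_if_new(conn, title: str, narrative: str, now: float = None) -> bool:
    """Insert unless an identical hash was stored within the last 30 seconds."""
    now = time.time() if now is None else now
    digest = content_hash(title, narrative)
    duplicate = conn.execute(
        "SELECT 1 FROM observations WHERE content_hash = ? AND created_at >= ? LIMIT 1",
        (digest, now - DEDUP_WINDOW_SECONDS),
    ).fetchone()
    if duplicate:
        return False
    conn.execute(
        "INSERT INTO observations (title, narrative, content_hash, created_at) "
        "VALUES (?, ?, ?, ?)",
        (title, narrative, digest, now),
    )
    conn.commit()
    return True


conn = open_db()
print(save_if_new(conn, "Fix auth race", "Added a mutex", now=100.0))  # True
print(save_if_new(conn, "Fix auth race", "Added a mutex", now=110.0))  # False
```

Bounding the check to a time window catches the same observation being emitted twice in quick succession, while still allowing a genuinely repeated event to be recorded later.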

Chroma (Optional)

Vector database for semantic search. Embeds observations and summaries for meaning-based retrieval. One collection per project: cm__project_name. Batch sync (100 docs/call). Falls back to SQLite if unavailable.
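The collection naming and the 100-document batching are simple to sketch without touching Chroma itself. The helper names are hypothetical, and any sanitizing of project names is left out because the source does not describe it.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 100  # docs per sync call, as stated above


def collection_name(project: str) -> str:
    """One collection per project, using the cm__ prefix."""
    return f"cm__{project}"


def batches(docs: Sequence[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield list(docs[start:start + size])


docs = [{"id": f"obs-{i}"} for i in range(250)]
print(collection_name("myapp"), [len(b) for b in batches(docs)])
# prints: cm__myapp [100, 100, 50]
```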

Database Schema Details

observations — id, memory_session_id, project, type, title, subtitle, narrative, facts (JSON), concepts (JSON), files_read (JSON), files_modified (JSON), prompt_number, discovery_tokens, content_hash, created_at

sdk_sessions — id, content_session_id (unique), memory_session_id (unique, nullable), project, user_prompt, started_at, completed_at, status (active|completed|failed)

session_summaries — id, memory_session_id, project, request, investigated, learned, completed, next_steps, notes, prompt_number, discovery_tokens

Search strategies: SQLite-only (fast metadata), Chroma semantic (meaning-based), Hybrid (metadata filter + semantic ranking)
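The hybrid strategy, a metadata filter followed by semantic ranking, can be illustrated with toy data. In this sketch the two-dimensional vectors stand in for real embeddings and the row layout is hypothetical; in the real system the filter would presumably run in SQLite and the ranking in Chroma.

```python
import math
from typing import List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(rows: List[dict], query_vec: List[float],
                  obs_type: Optional[str] = None, limit: int = 5) -> List[dict]:
    """Filter by metadata first, then rank the survivors by similarity."""
    candidates = [r for r in rows if obs_type is None or r["type"] == obs_type]
    ranked = sorted(candidates, key=lambda r: cosine(r["vec"], query_vec), reverse=True)
    return ranked[:limit]


rows = [
    {"id": "a", "type": "decision", "vec": [1.0, 0.0]},
    {"id": "b", "type": "bugfix", "vec": [1.0, 0.0]},
    {"id": "c", "type": "decision", "vec": [0.0, 1.0]},
]
hits = hybrid_search(rows, query_vec=[1.0, 0.0], obs_type="decision")
print([h["id"] for h in hits])  # ['a', 'c']
```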

10 — MCP Tools & API

Tool	Purpose	Cost
`search`	Full-text search with filters (type, date, project, obs_type). Returns compact index with IDs, titles, timestamps	~50 tok/item
`timeline`	Chronological context around an observation or query. Interleaves observations, summaries, and user prompts	~200 tok/item
`get_observations`	Fetch full details by observation IDs (array). Complete narrative, facts, concepts, files	~500-1k tok/item
`smart_search`	AST-based codebase search using tree-sitter. Structural code search across your project	varies
`smart_outline`	Folded file structure with symbol signatures. Bodies collapsed, shows file skeleton	varies
`smart_unfold`	Expand a specific symbol from an outline. Get full source code of a function/class	varies

HTTP API Endpoints

Search: /api/search, /api/timeline, /api/decisions, /api/changes, /api/how-it-works

Sessions: /api/sessions/init, /api/sessions/observations, /api/sessions/summarize

Data: /api/observations, /api/observations/batch, /api/summaries, /api/stats, /api/projects

Settings: /api/settings, /api/settings/defaults

Viewer: http://localhost:37777 — real-time web UI with SSE streaming at /api/viewer/stream

Health: /api/health
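If you want to poke at the worker from a script, a tiny client is enough. This sketch assumes the endpoints answer GET requests with JSON, and the `query` and `type` parameter names are guesses rather than documented API; only the paths and the port come from the list above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:37777"


def api_url(path: str, **params) -> str:
    """Build a worker URL, dropping parameters whose value is None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path: str, **params):
    """Fetch and decode a JSON response (requires the worker to be running)."""
    with urlopen(api_url(path, **params), timeout=5) as resp:
        return json.load(resp)


print(api_url("/api/health"))
print(api_url("/api/search", query="zod validation", type="decision"))
# prints: http://localhost:37777/api/health
# prints: http://localhost:37777/api/search?query=zod+validation&type=decision
```

Calling `get_json("/api/health")` is a reasonable first check that the worker is up before trying the search endpoints.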

Configuration Settings

Settings stored at ~/.claude-mem/settings.json

AI Provider: CLAUDE_MEM_PROVIDER — claude (default), gemini (free tier), openrouter
Model: CLAUDE_MEM_MODEL — default: claude-sonnet-4-5
Auth: CLAUDE_MEM_CLAUDE_AUTH_METHOD — cli (subscription billing) or api (API key)

Context injection: CLAUDE_MEM_CONTEXT_OBSERVATIONS (max 50), CLAUDE_MEM_CONTEXT_FULL_COUNT (top 3 get full details), CLAUDE_MEM_CONTEXT_SESSION_COUNT (1 summary shown)

Worker: Port 37777 (configurable), data at ~/.claude-mem/
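Reading the settings file with fallbacks might look like the sketch below. It assumes settings.json is a flat JSON object keyed by the setting names listed above, which the source does not confirm; the two defaults are the ones stated, and `load_settings` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional

# Defaults taken from the list above; other settings are left to the file.
DEFAULTS = {
    "CLAUDE_MEM_PROVIDER": "claude",
    "CLAUDE_MEM_MODEL": "claude-sonnet-4-5",
}


def load_settings(path: Optional[Path] = None) -> dict:
    """Merge values from settings.json over the defaults; a missing file is fine."""
    if path is None:
        path = Path.home() / ".claude-mem" / "settings.json"
    settings = dict(DEFAULTS)
    if path.exists():
        settings.update(json.loads(path.read_text(encoding="utf-8")))
    return settings


settings = load_settings(Path("/nonexistent-dir/settings.json"))
print(settings["CLAUDE_MEM_PROVIDER"], settings["CLAUDE_MEM_MODEL"])
# prints: claude claude-sonnet-4-5
```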

11 — Quick Start

# Install via Claude Code plugin system
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

# Restart Claude Code - that's it!
# Context from previous sessions will automatically
# appear at the start of new sessions.

# View real-time observations in your browser (macOS; on Linux, use xdg-open)
open http://localhost:37777

# Search memory from within Claude Code
/mem-search decisions about authentication

How it feels: After a few sessions, you'll notice Claude starts each conversation with context like "[myproject] recent context — 3 sessions, 12 observations" followed by a timeline of what happened before. It remembers your architectural decisions, the bugs you fixed, and the patterns you established. Each observation shows its token cost so you can see exactly what you're paying for context.