Claude-Mem
Persistent memory compression for Claude Code: how it works
Claude-Mem is a persistent memory system for Claude Code. A dedicated observer AI watches every session in real time and automatically captures decisions, bugfixes, discoveries, and patterns as structured observations.
The core problem: Each Claude Code session starts with zero memory of what happened before. Claude-Mem solves this by maintaining a searchable database of everything important that happened across all your sessions — and injecting relevant context at the start of each new one. It uses progressive disclosure to stay token-efficient: lightweight summaries first, full details only on demand.
Hooks: SessionStart, UserPromptSubmit, PostToolUse, Stop, SessionEnd. Each hook calls the Worker API via HTTP. Privacy tags are stripped here, before data reaches storage.
Worker: runs on port 37777 via Bun. Handles search, session management, context injection, and observation storage. Manages the SQLite database and the optional Chroma vector store.
Web viewer UI at http://localhost:37777.
Let's walk through a concrete example. You ask Claude to "build a signup page with email validation". Here's exactly what Claude-Mem does behind the scenes, and what happens in your next session.
sequenceDiagram
participant You as You
participant CC as Claude Code
participant H as Hooks
participant W as Worker
participant O as Observer AI
Note over You,O: SESSION START
H->>W: SessionStart hook
W-->>CC: "No previous sessions found"
You->>CC: "Build a signup page with email validation"
H->>W: UserPromptSubmit
W->>O: Spawn observer
CC->>CC: Read src/app/layout.tsx
H->>W: PostToolUse (Read)
W->>O: Feed event
CC->>CC: Write src/app/signup/page.tsx
H->>W: PostToolUse (Write)
W->>O: Feed event
Note over O: Observation: FEATURE<br/>"Created signup page component<br/>with form and email input"
CC->>CC: Write src/lib/validators.ts
H->>W: PostToolUse (Write)
W->>O: Feed event
Note over O: Observation: DECISION<br/>"Used Zod for email validation<br/>over regex for type safety"
CC->>CC: Run npm test (fails)
H->>W: PostToolUse (Bash)
W->>O: Feed event
CC->>CC: Edit src/lib/validators.ts
H->>W: PostToolUse (Edit)
W->>O: Feed event
Note over O: Observation: BUGFIX<br/>"Fixed Zod schema - email()<br/>must come before min()"
CC->>CC: Run npm test (passes)
H->>W: PostToolUse (Bash)
Note over You,O: SESSION END
H->>W: Stop hook
W->>O: Request summary
O-->>W: Session summary XML
W->>W: Store 3 observations + 1 summary
Session 1: The observer captures 3 observations as you build.
FEATURE: Created src/app/signup/page.tsx with email input, password fields, and form submission handler. Uses React Hook Form for state management and server actions for the API call.
DECISION: Used Zod for email validation over regex for type safety.
BUGFIX: z.string().min(1).email() fails because .email() must come before .min() in the chain. Zod validates in chain order; the email check was being skipped for empty strings.
You start a new Claude Code session and ask "add Google OAuth to the signup page". Before Claude even sees your prompt, the SessionStart hook fires and injects context from the earlier session.
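How such injected context might be assembled can be sketched as follows. This is a minimal illustration, not the actual ContextBuilder: the `StoredObservation` shape, the `buildContext` function, and the output layout are all assumptions. The two limits loosely mirror the CLAUDE_MEM_CONTEXT_OBSERVATIONS and CLAUDE_MEM_CONTEXT_FULL_COUNT settings listed under Configuration Settings.

```typescript
// Hypothetical shapes: the real ContextBuilder and its data model are not shown in this document.
interface StoredObservation {
  id: number;
  type: string;
  title: string;
  narrative: string;
  createdAt: number; // epoch milliseconds
}

interface ContextLimits {
  maxObservations: number; // cf. CLAUDE_MEM_CONTEXT_OBSERVATIONS
  fullCount: number;       // cf. CLAUDE_MEM_CONTEXT_FULL_COUNT
}

function buildContext(observations: StoredObservation[], limits: ContextLimits): string {
  // Newest first, capped at the configured maximum.
  const recent = [...observations]
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, limits.maxObservations);

  // The newest few get their full narrative; the rest stay as one-line index entries.
  const lines = recent.map((obs, index) =>
    index < limits.fullCount
      ? `#${obs.id} [${obs.type}] ${obs.title}\n  ${obs.narrative}`
      : `#${obs.id} [${obs.type}] ${obs.title}`
  );
  return lines.join("\n");
}
```

Keeping older entries as index lines is what lets the session start cheaply: details can still be fetched later by ID if they turn out to matter.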
Claude-Mem installs 5 hooks into Claude Code. Each hook fires at a specific point in the session lifecycle and calls the Worker HTTP API to trigger actions. Exit code 0 = silent success, exit code 2 = show error to Claude.
| Hook | Action |
|---|---|
| SessionStart | Inject past context via ContextBuilder. |
| UserPromptSubmit | Spawn SDK observer subprocess. |
| PostToolUse | Feed to observer agent. Store observation. |
| Stop | Summary generation from observer. |
| SessionEnd | Clean up state. Sync to Chroma. |
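The hook-to-Worker call and the exit-code convention can be illustrated with a short sketch. It covers only the three hooks whose endpoints are named in this document. The function names, the payload shape, and the rule that any failed Worker call maps to exit code 2 are assumptions, not the project's actual hook code.

```typescript
// Hypothetical hook-side sketch; names and payload shapes are assumptions.
type DocumentedEvent = "UserPromptSubmit" | "PostToolUse" | "Stop";

const WORKER_URL = "http://localhost:37777";

// Event-to-endpoint mapping, using the session endpoints named in this document.
const ENDPOINTS: Record<DocumentedEvent, string> = {
  UserPromptSubmit: "/api/sessions/init",
  PostToolUse: "/api/sessions/observations",
  Stop: "/api/sessions/summarize",
};

function endpointFor(event: DocumentedEvent): string {
  return WORKER_URL + ENDPOINTS[event];
}

// Assumption: a failed Worker call maps to exit code 2; success stays silent with 0.
function exitCodeFor(workerOk: boolean): 0 | 2 {
  return workerOk ? 0 : 2;
}

async function runHook(event: DocumentedEvent, payload: unknown): Promise<0 | 2> {
  try {
    const response = await fetch(endpointFor(event), {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });
    return exitCodeFor(response.ok);
  } catch {
    // Worker unreachable: report the failure instead of crashing the hook.
    return 2;
  }
}
```

A hook script built this way would end by passing the result of `runHook(...)` to `process.exit`, so that Claude Code sees the convention described above.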
sequenceDiagram
participant U as Your Claude Session
participant H as Hook Layer
participant W as Worker :37777
participant O as Observer Agent
participant DB as SQLite + Chroma
U->>H: User sends prompt
H->>W: POST /sessions/init
W->>O: Spawn SDK subprocess
Note over O: Separate Claude instance<br/>watching your session
U->>U: Uses Read tool
H->>W: POST /sessions/observations
W->>O: Feed tool event
O->>O: Analyze + generate XML
U->>U: Uses Edit tool
H->>W: POST /sessions/observations
W->>O: Feed tool event
O-->>W: XML observation response
W->>DB: Store observation
U->>U: Session ends
H->>W: POST /sessions/summarize
W->>O: Request summary
O-->>W: XML summary
W->>DB: Store summary
The observer runs as a separate Claude subprocess, watching your primary session in real time.
Each session has a contentSessionId (your Claude Code conversation) and a memorySessionId (the SDK observer's internal session). The observer uses resume to maintain context across multiple prompts within the same session.
The observer categorizes each observation with exactly one type and 2–5 concepts. Types define what happened, concepts define what kind of knowledge it represents.
| Concept | Meaning |
|---|---|
| how-it-works | Understanding mechanisms and internal behavior |
| why-it-exists | Purpose, rationale, or motivation behind code |
| what-changed | Specific modifications made |
| problem-solution | Issues encountered and their fixes |
| gotcha | Traps, edge cases, or surprising behavior |
| pattern | Reusable approaches or conventions |
| trade-off | Pros and cons of a decision |
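The "exactly one type, 2 to 5 concepts" rule lends itself to a small type-level sketch. The concept names come from the table above; the three type names are only the ones that appear in the walkthrough (the full list is not given here), and `validateConcepts` is a hypothetical helper rather than part of Claude-Mem.

```typescript
// Types seen in the walkthrough; the complete set is not listed in this document.
type ObservationType = "FEATURE" | "DECISION" | "BUGFIX";

// Concept names taken from the concept table.
const CONCEPTS = [
  "how-it-works",
  "why-it-exists",
  "what-changed",
  "problem-solution",
  "gotcha",
  "pattern",
  "trade-off",
] as const;
type Concept = (typeof CONCEPTS)[number];

interface Observation {
  type: ObservationType; // exactly one type per observation
  title: string;
  concepts: Concept[];   // 2 to 5 entries
}

// Hypothetical validator: returns a list of problems, empty when the concepts are acceptable.
function validateConcepts(concepts: string[]): string[] {
  const errors: string[] = [];
  if (concepts.length < 2 || concepts.length > 5) {
    errors.push(`expected 2-5 concepts, got ${concepts.length}`);
  }
  for (const concept of concepts) {
    if (!(CONCEPTS as readonly string[]).includes(concept)) {
      errors.push(`unknown concept: ${concept}`);
    }
  }
  if (new Set(concepts).size !== concepts.length) {
    errors.push("duplicate concepts");
  }
  return errors;
}

// The bugfix from the walkthrough, tagged with two plausible concepts.
const example: Observation = {
  type: "BUGFIX",
  title: "Fixed Zod schema - email() must come before min()",
  concepts: ["problem-solution", "gotcha"],
};
```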
Observation XML Format
The key to token efficiency. Instead of loading everything into context, claude-mem uses a 3-layer search that gets progressively more expensive. Filter first, fetch details only for what matters. Result: ~10x token savings.
search() → timeline() → get_observations()
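A back-of-the-envelope calculation shows where the savings come from. The per-item costs below are the rough figures quoted for each tool in this document (taking the upper 1k estimate for full observations); the scenario itself, 30 candidates of which only one is fetched in full, is an assumption chosen for illustration, and the real ratio depends on how selective the filtering is.

```typescript
// Rough per-item token costs quoted for each tool; real costs vary.
const COST = { search: 50, timeline: 200, full: 1000 } as const;

// Loading every candidate in full, with no filtering.
function eagerCost(candidates: number): number {
  return candidates * COST.full;
}

// Index everything cheaply, look at a few timeline items, fetch only what matters.
function progressiveCost(candidates: number, timelineItems: number, fetched: number): number {
  return candidates * COST.search + timelineItems * COST.timeline + fetched * COST.full;
}

const eager = eagerCost(30);                    // 30 * 1000 = 30000 tokens
const progressive = progressiveCost(30, 2, 1);  // 1500 + 400 + 1000 = 2900 tokens
const ratio = eager / progressive;              // a little over 10x in this scenario
```

If most candidates end up being fetched in full, the index layer adds cost instead of saving it, which is why the filter step comes first.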
graph TD
A["New Session Starts"] --> B["SessionStart Hook"]
B --> C["ContextBuilder"]
C --> D["Load Recent Observations"]
D --> E["Calculate Token Economics"]
E --> F["Inject Context into Session"]
F --> G["User Sends Prompt"]
G --> H["UserPromptSubmit Hook"]
H --> I["Create Session Record"]
I --> J["Spawn SDK Observer"]
F --> K["Tool Executes"]
K --> L["PostToolUse Hook"]
L --> M["Strip Privacy Tags"]
M --> N["Feed to Observer"]
N --> O["Observer Generates XML"]
O --> P["Parse Observations"]
P --> Q{"Deduplicate"}
Q -->|"New"| R["Save to SQLite"]
R --> S["Sync to Chroma"]
Q -->|"Duplicate"| T["Skip"]
F --> U["Session Ends"]
U --> V["Stop Hook"]
V --> W["Generate Summary"]
W --> X["Store Summary"]
X --> Y["Mark Session Complete"]
Y --> Z["Ready for Next Session"]
classDef start fill:#a78bfa22,stroke:#a78bfa,stroke-width:2px
classDef hook fill:#22d3ee22,stroke:#22d3ee,stroke-width:2px
classDef ai fill:#34d39922,stroke:#34d399,stroke-width:2px
classDef storage fill:#818cf822,stroke:#818cf8,stroke-width:2px
classDef end fill:#fbbf2422,stroke:#fbbf24,stroke-width:2px
classDef skip fill:#fb718522,stroke:#fb7185,stroke-width:1.5px
class A,G,F start
class B,H,L,V hook
class C,D,E,J,N,O,P,W ai
class I,M,Q,R,S,X storage
class U,Y,Z end
class T skip
Complete data flow from session start through observation capture to next session context injection
SQLite tables: observations, sdk_sessions, session_summaries, user_prompts, pending_messages. Indexed by project, type, concept, date, and content hash. Deduplication via SHA256 within 30-second windows.
Chroma (optional): cm__project_name. Batch sync (100 docs/call). Falls back to SQLite if unavailable.
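The deduplication rule (SHA256 content hash, 30-second window) can be sketched in a few lines. Which fields feed the hash is not specified in this document, so hashing project, type, and text together is an assumption, as are the `contentHash` and `Deduplicator` names. The in-memory map stands in for whatever lookup the real SQLite index provides.

```typescript
import { createHash } from "node:crypto";

const WINDOW_MS = 30_000; // 30-second deduplication window

// Assumption: the hash covers project, type, and text; the real hashed fields are not specified here.
function contentHash(project: string, type: string, text: string): string {
  return createHash("sha256")
    .update([project, type, text].join("\u0000"))
    .digest("hex");
}

class Deduplicator {
  // Maps a content hash to the time it was last stored.
  private lastStored = new Map<string, number>();

  // Returns true when the observation is new and should be stored.
  accept(hash: string, nowMs: number): boolean {
    const previous = this.lastStored.get(hash);
    if (previous !== undefined && nowMs - previous < WINDOW_MS) {
      return false; // same content seen inside the window: skip
    }
    this.lastStored.set(hash, nowMs);
    return true;
  }
}
```

Identical content arriving after the window has passed is stored again, which matches the "within 30-second windows" wording rather than a global uniqueness constraint.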
Database Schema Details
sdk_sessions — id, content_session_id (unique), memory_session_id (unique, nullable), project, user_prompt, started_at, completed_at, status (active|completed|failed)
session_summaries — id, memory_session_id, project, request, investigated, learned, completed, next_steps, notes, prompt_number, discovery_tokens
Search strategies: SQLite-only (fast metadata), Chroma semantic (meaning-based), Hybrid (metadata filter + semantic ranking)
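The hybrid strategy (metadata filter first, semantic ranking second) can be illustrated with a toy in-memory version. In Claude-Mem the filter would run against SQLite and the ranking against Chroma; here plain arrays and a hand-written cosine similarity stand in for both, and every name in the sketch is hypothetical.

```typescript
// Hypothetical in-memory stand-in for the SQLite metadata and Chroma embeddings.
interface IndexedObservation {
  id: number;
  project: string;
  type: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function hybridSearch(
  items: IndexedObservation[],
  filter: { project?: string; type?: string },
  queryEmbedding: number[],
  limit: number
): number[] {
  return items
    // Step 1: cheap metadata filter.
    .filter(
      (o) =>
        (filter.project === undefined || o.project === filter.project) &&
        (filter.type === undefined || o.type === filter.type)
    )
    // Step 2: semantic ranking of the survivors only.
    .map((o) => ({ id: o.id, score: cosine(o.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((r) => r.id);
}
```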
| Tool | Purpose | Cost |
|---|---|---|
| search | Full-text search with filters (type, date, project, obs_type). Returns compact index with IDs, titles, timestamps. | ~50 tok/item |
| timeline | Chronological context around an observation or query. Interleaves observations, summaries, and user prompts. | ~200 tok/item |
| get_observations | Fetch full details by observation IDs (array). Complete narrative, facts, concepts, files. | ~500-1k tok/item |
| smart_search | AST-based codebase search using tree-sitter. Structural code search across your project. | varies |
| smart_outline | Folded file structure with symbol signatures. Bodies collapsed, shows file skeleton. | varies |
| smart_unfold | Expand a specific symbol from an outline. Get full source code of a function/class. | varies |
HTTP API Endpoints
Search: /api/search, /api/timeline, /api/decisions, /api/changes, /api/how-it-works
Sessions: /api/sessions/init, /api/sessions/observations, /api/sessions/summarize
Data: /api/observations, /api/observations/batch, /api/summaries, /api/stats, /api/projects
Settings: /api/settings, /api/settings/defaults
Viewer: http://localhost:37777, a real-time web UI with SSE streaming at /api/viewer/stream
Health: /api/health
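A client for these endpoints might look like the sketch below. Only the paths /api/search and /api/health come from this document. The query parameter names (`query`, `type`, `project`), the use of GET, and the helper names are assumptions, so check the Worker's actual API before relying on them.

```typescript
const BASE_URL = "http://localhost:37777";

// Assumption: filters are passed as query parameters with these names.
interface SearchParams {
  query: string;
  type?: string;
  project?: string;
}

function buildSearchUrl(params: SearchParams): string {
  const url = new URL("/api/search", BASE_URL);
  url.searchParams.set("query", params.query);
  if (params.type) url.searchParams.set("type", params.type);
  if (params.project) url.searchParams.set("project", params.project);
  return url.toString();
}

// Returns false when the Worker is down or answers with a non-2xx status.
async function isWorkerHealthy(): Promise<boolean> {
  try {
    const response = await fetch(`${BASE_URL}/api/health`);
    return response.ok;
  } catch {
    return false;
  }
}
```

Checking health before searching gives a clear failure mode when the Worker is not running, instead of a connection error surfacing mid-query.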
Configuration Settings
Settings file: ~/.claude-mem/settings.json
AI Provider: CLAUDE_MEM_PROVIDER, one of claude (default), gemini (free tier), openrouter
Model: CLAUDE_MEM_MODEL, default claude-sonnet-4-5
Auth: CLAUDE_MEM_CLAUDE_AUTH_METHOD, either cli (subscription billing) or api (API key)
Context injection: CLAUDE_MEM_CONTEXT_OBSERVATIONS (max 50), CLAUDE_MEM_CONTEXT_FULL_COUNT (top 3 get full details), CLAUDE_MEM_CONTEXT_SESSION_COUNT (1 summary shown)
Worker: Port 37777 (configurable), data at ~/.claude-mem/
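Resolving these settings against defaults could be sketched as below. The key names and the provider, model, and context values come from this document. Treating the file as a flat JSON object with numeric values, defaulting the auth method to cli, defaulting the observation count to its maximum of 50, and the `resolveSettings` helper itself are all assumptions.

```typescript
interface ClaudeMemSettings {
  CLAUDE_MEM_PROVIDER: "claude" | "gemini" | "openrouter";
  CLAUDE_MEM_MODEL: string;
  CLAUDE_MEM_CLAUDE_AUTH_METHOD: "cli" | "api";
  CLAUDE_MEM_CONTEXT_OBSERVATIONS: number;
  CLAUDE_MEM_CONTEXT_FULL_COUNT: number;
  CLAUDE_MEM_CONTEXT_SESSION_COUNT: number;
}

const MAX_CONTEXT_OBSERVATIONS = 50;

// Assumed defaults: auth method and observation count are guesses, the rest follow the list above.
const DEFAULTS: ClaudeMemSettings = {
  CLAUDE_MEM_PROVIDER: "claude",
  CLAUDE_MEM_MODEL: "claude-sonnet-4-5",
  CLAUDE_MEM_CLAUDE_AUTH_METHOD: "cli",
  CLAUDE_MEM_CONTEXT_OBSERVATIONS: 50,
  CLAUDE_MEM_CONTEXT_FULL_COUNT: 3,
  CLAUDE_MEM_CONTEXT_SESSION_COUNT: 1,
};

// Hypothetical helper: overlays user values on the defaults and enforces the documented maximum.
function resolveSettings(overrides: Partial<ClaudeMemSettings>): ClaudeMemSettings {
  const merged = { ...DEFAULTS, ...overrides };
  merged.CLAUDE_MEM_CONTEXT_OBSERVATIONS = Math.min(
    Math.max(merged.CLAUDE_MEM_CONTEXT_OBSERVATIONS, 0),
    MAX_CONTEXT_OBSERVATIONS
  );
  return merged;
}
```

In practice the `overrides` argument would be the parsed contents of ~/.claude-mem/settings.json, for example `resolveSettings({ CLAUDE_MEM_PROVIDER: "gemini" })`.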