ChatRecall: Giving Claude Code a memory between sessions


Claude Code forgets everything between sessions. ChatRecall fixes it with a Stop hook, SQLite FTS5, and Haiku summarization. Here is how it works and how you can build it too.

May 18, 2026 · By Sagheer Ahmed

If you follow discussions about Claude or AI coding tools, one complaint comes up constantly: Claude forgets everything the moment a session ends. It is not a bug — it is how the tool is designed. But for anyone using Claude Code seriously across multiple projects, the friction is real. You re-explain context you already covered last week. You write the same background into handoff files just so Claude can pick it up next time. You watch the tool give a slightly different answer to the same architectural question because it has no memory of what you agreed on two sessions ago.

This is not a small annoyance. It is one of the main reasons people hit a ceiling with Claude Code after a few weeks of heavy use.

ChatRecall is a lightweight session memory system that fixes this. It captures every Claude Code session automatically, summarizes it with a single Haiku call, and makes everything searchable. The system is around 300 lines of Python and works with any Claude Code project on Windows. This post explains how it is built, what obstacles came up, and how you can set it up yourself. The source code is on GitHub.

How it works

There are three parts, each with a different cost and latency profile:

Capture: After every Claude Code response, a Stop hook silently appends the exchange to a SQLite database. No API calls, no cost, runs in under 100ms. Every session is captured even if you forget to do anything else.

Summarize: When you run /save-recall, a script reads all the exchanges from the current session and calls Claude Haiku to produce a 2-3 sentence summary plus searchable keywords. One API call, about $0.005.

Search: Run /search-chats <query> and get the top 5 matching sessions, ranked by relevance, opened as a styled HTML page in your browser.

The separation between capture and summarization is a deliberate design choice. The hook has to be fast — it runs after every response and blocks the UI until it finishes. It does nothing except write raw exchanges to SQLite. Summarization, which involves an API call, happens separately on a schedule or on demand. This means every session is safely captured even if you close the window without thinking about it.

The Stop hook

Claude Code supports hooks — shell commands that run at specific lifecycle events. The Stop hook fires after every assistant response. Here is the relevant section of settings.json:

JSON

{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python \"C:\\Users\\<username>\\.claude\\chatrecall\\append_exchange.py\""
          }
        ]
      }
    ]
  }
}

When the hook fires, Claude Code sends a JSON payload on stdin containing the session_id, the path to the session's JSONL transcript file (transcript_path), and the current working directory (cwd). The script reads that payload, opens the JSONL file, seeks to a stored watermark position, reads any new lines since the last run, and inserts them into SQLite.

The watermark is the important detail. Without it, every hook invocation would re-read the entire transcript from the beginning. With it, the hook reads only the lines added since the last run, so the work per invocation depends on the size of the latest exchange, not on how long the session has been running.
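A minimal sketch of the watermark read, assuming the watermark is stored as a byte offset into the transcript file (the description above does not say whether it is a byte offset or a line count). The function name read_new_lines is hypothetical and not taken from the ChatRecall source.

```python
import json


def read_new_lines(transcript_path, watermark):
    """Return (records, new_watermark) for lines appended after `watermark`.

    `watermark` is assumed to be a byte offset. Only complete lines
    (ending in a newline) are consumed, so a half-written line is left
    for the next hook run.
    """
    records = []
    with open(transcript_path, "rb") as f:
        f.seek(watermark)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partially written line
            watermark = f.tell()
            if line.strip():
                records.append(json.loads(line))
    return records, watermark
```

In the hook, transcript_path would come from the JSON payload on stdin, and the returned offset would be written back to the session's watermark column in the same transaction as the inserted rows.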

The database schema

Three tables make up the schema:

sessions — one row per Claude Code session. Holds the session ID, project path, Haiku-generated summary, comma-separated topics, key decisions, and a watermark column tracking how far through the JSONL the hook has read.

exchanges — one row per user or assistant turn. The dedup key is (session_id, seq) where seq is the line number in the source JSONL. Duplicate inserts from the hook are silently ignored.

sessions_fts — an FTS5 virtual table that indexes summarized sessions:

SQL

CREATE VIRTUAL TABLE sessions_fts USING fts5(
    session_id UNINDEXED,
    summary,
    topics,
    decisions,
    project_name
);

Why FTS5 on the summaries rather than the raw exchanges? Raw exchange content is noisy — tool call output, JSON payloads, code blocks, error messages. Claude Haiku distills all of that into clean sentences with specific proper nouns. Searching the distilled output gives noticeably better results than keyword-matching raw text.
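The dedup rule on the exchanges table described above can be sketched with a composite primary key and INSERT OR IGNORE. Only session_id and seq come from the description; the role and content columns and the insert_exchange helper are assumptions for illustration, and the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE exchanges (
        session_id TEXT NOT NULL,
        seq        INTEGER NOT NULL,  -- line number in the source JSONL
        role       TEXT NOT NULL,     -- assumed column: 'user' or 'assistant'
        content    TEXT NOT NULL,     -- assumed column
        PRIMARY KEY (session_id, seq)
    )
    """
)


def insert_exchange(conn, session_id, seq, role, content):
    """Insert one turn; a repeated (session_id, seq) is silently skipped."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO exchanges VALUES (?, ?, ?, ?)",
        (session_id, seq, role, content),
    )
    return cur.rowcount == 1  # True only if a new row was written


insert_exchange(conn, "abc", 1, "user", "How do I fix the failover?")
insert_exchange(conn, "abc", 1, "user", "How do I fix the failover?")  # ignored
count = conn.execute("SELECT COUNT(*) FROM exchanges").fetchone()[0]
print(count)
```

With the key enforced by the database, the hook does not need to check for existing rows before inserting, which keeps the hot path short.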

Summarization with Haiku

Haiku is the right model for this task. The job is straightforward (read a conversation, extract key points, return structured JSON), and Haiku handles it reliably at a fraction of the cost of Sonnet or Opus. The prompt asks for four fields and instructs the model to return strict JSON with no markdown wrapper:

JSON

{
  "summary": "2-3 sentences describing what was discussed and accomplished",
  "topics": "comma-separated keywords -- project names, technologies, problem descriptions",
  "project_name": "main project name or 'general'",
  "decisions": "key decisions; semicolon-separated"
}

The topics field is where search quality comes from. If the session covered an AOAG failover issue, topics should include "AOAG", "failover", "Always On", "availability group" — the specific terms you would search for later. Generic words like "discussed" go in the summary, not the topics.
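Since the reply is written straight into SQLite and the FTS index, it may be worth validating it first. A minimal sketch, assuming the model's reply is already available as a string; parse_summary and REQUIRED_FIELDS are hypothetical names, not taken from the ChatRecall source, and the API call itself is omitted.

```python
import json

REQUIRED_FIELDS = ("summary", "topics", "project_name", "decisions")


def parse_summary(raw):
    """Parse the model's reply into a dict with the four expected fields.

    Raises ValueError if the reply is not a JSON object or a field is
    missing, so a malformed reply is never written to the database.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = [k for k in REQUIRED_FIELDS if k not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Normalize every field to a stripped string for storage.
    return {k: str(data[k]).strip() for k in REQUIRED_FIELDS}
```

A reply wrapped in a markdown fence fails the JSON parse and raises ValueError, which is one way to notice when the model ignores the "no markdown wrapper" instruction.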

Cost: about $0.005 per session. At one session per day, that is around $1.80 per year.

Search

The search query runs FTS5 MATCH against the indexed sessions:

SQL

SELECT s.session_id, s.project_name, s.start_time, s.summary, s.topics
FROM sessions_fts f
JOIN sessions s ON f.session_id = s.session_id
WHERE sessions_fts MATCH ?
ORDER BY rank
LIMIT 5;

Results are rendered as a styled HTML page and opened in the browser automatically. Each result shows the date, project name, summary, and topic keywords. No JSON to parse, no terminal output to scan.
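The search step can be sketched in a few functions. This is an illustration, not the ChatRecall source: it queries sessions_fts directly rather than joining to sessions, and the helper names to_fts_query, search, and render_html are hypothetical. Quoting each term is an added precaution, since characters such as - or : in a raw query would otherwise be parsed as FTS5 query syntax.

```python
import html
import sqlite3


def to_fts_query(text):
    """Quote each term so punctuation is not parsed as FTS5 syntax."""
    terms = text.split()
    return " ".join('"' + t.replace('"', '""') + '"' for t in terms)


def search(conn, text, limit=5):
    """Return up to `limit` (session_id, project_name, summary, topics) rows."""
    if not text.split():
        return []  # an empty MATCH expression is not a valid query
    return conn.execute(
        "SELECT session_id, project_name, summary, topics "
        "FROM sessions_fts WHERE sessions_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (to_fts_query(text), limit),
    ).fetchall()


def render_html(rows):
    """Build a bare results page; every value is HTML-escaped."""
    items = "".join(
        "<li><b>{}</b>: {}<br><i>{}</i></li>".format(
            html.escape(project), html.escape(summary), html.escape(topics)
        )
        for _sid, project, summary, topics in rows
    )
    return "<html><body><ul>{}</ul></body></html>".format(items)
```

The resulting string could be written to a temporary file and opened with webbrowser.open from the standard library. This sketch requires an SQLite build with FTS5 enabled.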

Two real obstacles

SSL certificate failure on Windows

On the first run, the summarization script failed with a generic connection error. The actual underlying error was:

CODE

[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate

The certifi CA bundle, which many Python HTTP libraries use by default, does not include root CAs installed by security software, VPN clients, or corporate network tools. Those programs install their own CA into the Windows certificate store so they can inspect HTTPS traffic. A library that verifies against certifi's bundle never looks at the Windows store, so it does not trust them.

After appending the Windows root certificates to certifi's bundle, the error changed to:

CODE

certificate verify failed: Basic Constraints of CA cert not marked critical

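Python 3.13 turns on strict X.509 verification by default (ssl.create_default_context() now sets the VERIFY_X509_STRICT flag), so OpenSSL was rejecting a CA certificate whose basicConstraints extension is not marked critical. RFC 5280 requires the critical flag on CA certificates, but some older or software-generated CAs omit it.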

The fix was the truststore package. It makes Python verify certificates against the native Windows certificate store, using the operating system's own verification APIs instead of a bundled CA file. This is the same trust source your browser uses:

PYTHON

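try:
    # Optional dependency: verify certificates against the OS trust store.
    import truststore
    truststore.inject_into_ssl()
except ImportError:
    # truststore is not installed; keep Python's default SSL behavior.
    pass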

The try/except ImportError means the script still works on machines without truststore — it falls back to default SSL behavior. This fix applies to any Python script calling HTTPS APIs on a Windows machine with security software installed. It is not specific to the Anthropic SDK, and you will likely hit this problem the first time you call any external API from Python on a managed Windows machine.

Task Scheduler duration limit

When registering the hourly automation via PowerShell, passing [TimeSpan]::MaxValue for the repetition duration causes Task Scheduler to reject it:

CODE

The task XML contains a value which is incorrectly formatted or out of range.
(8,42):Duration:P99999999DT23H59M59S

The fix: use (New-TimeSpan -Days 9999) — about 27 years, which is effectively permanent and within the scheduler's valid range.

Automating summarization

The first version required running /save-recall manually at the end of each session. That works, but it is easy to forget, especially when a session ends by closing the window.

The second version uses a Windows Task Scheduler job that calls summarize_session.py --latest every hour and at every user logon. The job runs under pythonw.exe, the windowless Python interpreter, so no console window pops up. The --latest flag finds the most recently active session that has not been summarized and processes it. If everything is already up to date, the script exits cleanly with no output.
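The --latest selection can be sketched as a single query. This is an assumption about how it might work, not the ChatRecall source: it assumes an unsummarized session has a NULL summary, and it uses a hypothetical last_activity column holding the time of the most recent exchange.

```python
import sqlite3


def find_latest_unsummarized(conn):
    """Return the session_id of the newest session without a summary, or None."""
    row = conn.execute(
        "SELECT session_id FROM sessions "
        "WHERE summary IS NULL "                 # assumed marker for "not summarized"
        "ORDER BY last_activity DESC LIMIT 1"    # hypothetical column
    ).fetchone()
    return row[0] if row else None
```

Returning None when nothing is pending gives the scheduled job a cheap way to exit without making an API call.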

The result: any session from the past hour gets summarized without any user action.

Setting this up yourself

The system has five Python scripts and two Claude Code skill files:

CODE

C:\Users\<username>\.claude\chatrecall\
    init_db.py              -- run once to create the schema
    append_exchange.py      -- Stop hook: reads JSONL, writes to SQLite
    summarize_session.py    -- calls Haiku, writes session summary + FTS index
    search_sessions.py      -- FTS5 query, generates HTML results page
    common.py               -- shared DB connection and helpers

.claude\commands\
    save-recall.md          -- /save-recall skill
    search-chats.md         -- /search-chats skill

The install script handles steps 3 and 4 of the manual setup listed below (creating the database and adding the Stop hook). From the repo root:

powershell -ExecutionPolicy Bypass -File install-chatrecall.ps1

Or set it up manually. The steps are:

1. Install dependencies: pip install anthropic truststore
2. Set ANTHROPIC_API_KEY as an environment variable
3. Run python init_db.py once to create the database
4. Add the Stop hook to settings.json (restart Claude Code after)
5. Register the hourly scheduled task with Task Scheduler

After that, every session is captured automatically. Run /save-recall to summarize immediately, or wait for the scheduled task. Run /search-chats <query> when you want to find a past session.

The whole system costs nothing to run except the Haiku API calls — and at $0.005 per session, you are unlikely to notice it on your bill.

Technical reference

The full source for ChatRecall and other projects is on my GitHub profile.