A friend dropped this in a dev community: "Git isn't just useful for humans; it's equally useful for Claude Code. Good git habits for humans are good git habits for AI agents."
My first thought: "I've got Memory.md. Isn't that enough?"
Two weeks of experiments later, turns out these two things solve completely different problems.
Memory isn't just two layers
I used to think AI agent memory had two options: context window runs out, open Memory.md. Memory.md gets bloated, trim it somehow.
Turns out Claude Code already has three layers of memory built in. Git is a fourth:
| Layer | What it is | How it works | What goes here |
|---|---|---|---|
| CLAUDE.md | Project rules file | Fully loaded every session | Stable rules, conventions, build commands |
| Auto Memory | ~/.claude/.../memory/ | First 200 lines of MEMORY.md auto-loaded; topic files loaded on demand | Things Claude learned, past mistakes |
| Session Memory | Auto-generated summaries | Injected as background knowledge, no manual effort | What happened last session |
| Git History | Commit log + diff + blame | Queried on demand | Full change history, decision context |
The first three are built by Anthropic. The fourth has always been there; most people just haven't thought of Git deliberately as an AI memory tool.
The key to this layering isn't the type of information. It's query frequency:
- Need it every time → CLAUDE.md
- Brought in automatically (you don't manage it) → Session Memory
- Occasionally need to go deep → Auto Memory topic files
- Look it up when needed → Git
Anthropic's best practices are direct about CLAUDE.md: keep it under 200 lines, and for each line ask yourself "would removing this cause Claude to make mistakes?" If not, cut it. Information that changes frequently, like project status, doesn't belong here.
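The 200-line budget is easy to check mechanically. Here is a minimal sketch, assuming the rules file sits in the current directory; `check_rules_budget` is a hypothetical helper name, not part of Claude Code or Git.

```shell
# check_rules_budget [file] [max_lines]
# Hypothetical helper: reports whether a rules file fits a line budget.
# Defaults (CLAUDE.md, 200 lines) follow the guideline quoted above.
check_rules_budget() {
  file="${1:-CLAUDE.md}"
  max="${2:-200}"
  if [ ! -f "$file" ]; then
    echo "missing: $file"
    return 2
  fi
  # Reading from stdin makes wc print only the count; tr strips the
  # padding some platforms add around it.
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -gt "$max" ]; then
    echo "$file: $lines lines (over budget of $max)"
    return 1
  fi
  echo "$file: $lines lines (within budget of $max)"
}
```

Calling `check_rules_budget` with no arguments checks `CLAUDE.md` against 200 lines; a non-zero return makes it usable as a gate in a script.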
So what does Git add that the other layers donât?
Session Memory already summarizes what happened last time. Auto Memory remembers past mistakes. What's Git for?
The difference is granularity and queryability.
Session Memory is a summary: it tells you "we were working on the login feature yesterday." But if you need to know "why does the token refresh logic use fixed wait instead of exponential backoff?", Session Memory doesn't have that detail. Git blame does.
Auto Memory captures general lessons learned. Git captures "this specific file, this specific line, at this specific time, was changed for this specific reason."
In plain terms: Session Memory and Auto Memory are notes. Git is the full security footage. You don't watch security footage every day, but when something goes wrong it's the only thing that can tell you exactly what happened.
How I actually use this: four patterns
Commit messages as compressed summaries
I added a rule to my CLAUDE.md: commit messages must explain what was done and why.
The result: next session, run git log --oneline -15. Fifteen lines, full recent context. The AI's own log entries are more accurate than my verbal descriptions: they carry timestamps, file scope, and reflect its understanding at the moment of completion, not a hazy recollection the next day.
To be clear: "commit after every small change" is not some industry consensus. Anthropic's best practices place commit as the final step in an explore → plan → implement cycle. There's also academic disagreement: the Lore paper (arXiv 2603.15566) argues commits should be fewer but richer, each carrying full decision context (why A over B, what constraints applied).
My own practice falls somewhere in between: I don't commit every line change, but I commit after each logically complete piece of work. This is personal preference, not a best practice.
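To make the what-and-why rule concrete, here is a small sketch of how such a commit could be written and read back. The function names and the "Why:" prefix are my own assumptions for illustration; only the git flags themselves are standard.

```shell
# commit_with_reason <what> <why>
# Hypothetical helper: the first -m becomes the subject (what changed),
# the second -m becomes the body (why it changed).
commit_with_reason() {
  git commit -q -m "$1" -m "Why: $2"
}

# recent_context [n]
# One line per commit: short hash, date, subject. Defaults to 15 commits,
# matching the git log --oneline -15 habit described above.
recent_context() {
  git log -n "${1:-15}" --date=short --format='%h %ad %s'
}
```

For example, `commit_with_reason "Use fixed 5s wait after 429" "API rate limit rule forbids exponential backoff"` records the decision in the body, and `git log -1 --format=%b` retrieves it later.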
git diff before commit: self-review
Claude Code will sometimes confidently report "done" when changes didn't actually land correctly. Hallucination isn't just an answering-questions problem; it happens during file edits too.
My Pre-Commit Checklist includes: before committing, run git diff and review your own changes.
This has caught: editing file A but forgetting to sync file B, thinking a line was deleted when it wasnât, adding a new dependency without updating the build config.
Let it write first, then check its own work. Verification is easier than generation: true for humans, true for AI.
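A rough sketch of what that self-review step could look like as shell helpers. Both function names are hypothetical, and the checks assume the repository already has at least one commit so that HEAD exists.

```shell
# review_changes
# Hypothetical pre-commit review: show what actually changed on disk,
# so it can be compared with what the agent says it changed.
review_changes() {
  echo "== files touched (staged, unstaged, untracked) =="
  git status --short
  echo "== summary against last commit =="
  git diff --stat HEAD
}

# expect_changed <path>
# Fails when a tracked file that should have been edited is identical
# to HEAD. Note: brand-new untracked files never show up in git diff,
# so they report as unchanged here; git status --short lists them.
expect_changed() {
  if git diff --quiet HEAD -- "$1"; then
    echo "unchanged: $1"
    return 1
  fi
  echo "changed: $1"
}
```

The "edited file A but forgot file B" case then becomes a one-liner: `expect_changed A && expect_changed B`.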
git blame for decision archaeology
I once spotted some retry logic with weird backoff parameters. My instinct: "just change it to standard exponential backoff."
Then I ran git blame. A commit message from three months back read: "API rate limiting rule is wait 5s after 429, not exponential backoff."
Without Git, this code would've been "fixed" and then actually broken. These kinds of design decisions don't get captured by Session Memory, Auto Memory, or Memory.md. But Git has all of them, as long as you wrote decent commit messages at the time.
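# Find the commit that last touched each line of the file
git blame src/auth/token-manager.ts
# Read that commit's message without the diff (abc1234 stands in for the hash blame printed)
git show -s abc1234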
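The two steps above can be chained so that a single file-and-line pair goes straight to the commit message. This is a sketch under my own naming (`why_line` is hypothetical); it relies on `git blame --porcelain` printing the full commit hash as the first field of its first output line.

```shell
# why_line <file> <line_number>
# Hypothetical helper: blame one line, then print the commit that
# last changed it, including the full message.
why_line() {
  # -L N,N restricts blame to a single line.
  commit=$(git blame -L "$2,$2" --porcelain -- "$1" | head -n 1 | cut -d ' ' -f 1)
  [ -n "$commit" ] || return 1
  # -s suppresses the diff; %B is the raw subject plus body.
  # Lines with uncommitted edits blame to an all-zero hash, which
  # git show cannot resolve, so commit or stash first.
  git show -s --date=short --format='%h %ad %an%n%n%B' "$commit"
}
```

Usage would look like `why_line src/auth/token-manager.ts 42`, where the path comes from the example above and the line number is made up.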
git worktree for parallel agents
Run two Claude Code sessions simultaneously, each working on a different approach, no interference:
git worktree add ../myproject-experiment -b experiment/approach-A
cd ../myproject-experiment && claude
Main branch stays untouched. Experiment blows up? Remove the worktree (git worktree remove ../myproject-experiment) and delete the branch. Works out? Merge it back.
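The full lifecycle can be wrapped in two small helpers. This is a sketch with assumed conventions: worktrees live next to the repo as `../<repo>-<name>` on a branch named `experiment/<name>`, and both functions are run from the repository root. The function names are hypothetical.

```shell
# experiment_path <name>
# Assumed layout: a sibling directory named after the repo and experiment.
experiment_path() {
  echo "../$(basename "$PWD")-$1"
}

# start_experiment <name>
# Creates the branch and checks it out in its own directory.
start_experiment() {
  git worktree add -b "experiment/$1" "$(experiment_path "$1")"
}

# drop_experiment <name>
# Removes the worktree and its branch. --force also discards any
# uncommitted changes inside the worktree, and -D deletes the branch
# even if it was never merged.
drop_experiment() {
  git worktree remove --force "$(experiment_path "$1")" &&
    git branch -D "experiment/$1"
}
```

Using `git worktree remove` rather than a plain `rm -rf` keeps Git's own bookkeeping in sync; after a manual delete, `git worktree prune` is needed to clear the stale entry.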
This isn't something I invented: VILA-Lab's analysis of Claude Code's architecture (arXiv 2604.14228) found that Claude Code internally runs sub-agents in isolated git worktrees for complex tasks. Letta's Context Repository system uses the same pattern.
A side benefit: pre-commit hooks
Not really about memory, but a bonus that comes with using Git deliberately.
With Git's undo safety net, the AI does get bolder about making changes. Pre-commit hooks run lint and type-check; if they fail, the commit doesn't go through:
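#!/bin/sh
# .git/hooks/pre-commit or via husky
# Stop at the first failing check; without this only the last command's exit code counts
set -e
npm run lint
npm run type-check
npm run test -- --bail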
One gotcha: in my experience, if your hooks take more than ~30 seconds, Claude Code may try to skip them (--no-verify). Keep hooks under 10 seconds.
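To find out whether a hook fits that budget, it helps to time it outside of a commit. A minimal sketch, with `time_check` as a hypothetical name; it measures whole seconds with `date +%s`, which is available on GNU and BSD systems but not guaranteed by POSIX.

```shell
# time_check <budget_seconds> <command> [args...]
# Hypothetical helper: runs a command, reports its exit code and how
# long it took, and says whether it stayed within the budget.
# The command's own exit code is passed through.
time_check() {
  budget="$1"
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?
  elapsed=$(( $(date +%s) - $start ))
  if [ "$elapsed" -gt "$budget" ]; then
    echo "exit $status, ${elapsed}s (over ${budget}s budget)"
  else
    echo "exit $status, ${elapsed}s (within ${budget}s budget)"
  fi
  return "$status"
}
```

Typical use would be `time_check 10 .git/hooks/pre-commit`, or timing each step separately (`time_check 10 npm run lint`) to see which one is eating the budget.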
How I split responsibilities
After using this setup for a while, my division of labor looks roughly like this:
CLAUDE.md (stable, lean, loaded every time):
- Build and test commands
- Code style rules
- Hard constraints and landmines
- Architecture decisions (current, not historical)
Auto Memory (let Claude manage it):
- Mistakes and lessons learned
- Debug patterns and solutions
- Project-specific patterns
Session Memory (hands-off, automatic):
- Summary of last session
Git (query when needed):
- "Where did we leave off?" → git log
- "How was that bug fixed?" → git log -S "keyword"
- "Why does this code look like that?" → git blame
- "What changed between v2 and v3?" → git diff
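Spelled out with arguments, those four lookups could be wrapped like this. The function names are hypothetical and only there to label each question; the git commands underneath are the standard ones.

```shell
# "Where did we leave off?"  Last n commits, one line each (default 15).
where_we_left_off() {
  git log --oneline -n "${1:-15}"
}

# "How was that bug fixed?"  -S lists commits that changed how many
# times the given string occurs, i.e. where it was added or removed.
how_was_it_fixed() {
  git log --oneline -S"$1"
}

# "Why does this code look like that?"  Per-line last-change commits.
why_this_code() {
  git blame -- "$1"
}

# "What changed between v2 and v3?"  Takes any two refs (tags,
# branches, hashes) and summarizes the files that differ.
what_changed() {
  git diff --stat "$1" "$2"
}
```

For example, `how_was_it_fixed retry_limit` followed by `git show <hash>` on one of the results gets from a keyword to the exact change and its stated reason.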
The rule of thumb: if you need this information every session, put it higher (CLAUDE.md). If you only need it occasionally, put it lower (Git). The cost of getting it wrong: too high wastes context tokens, too low means you can't find it when you need it.
Is anyone actually researching this?
Yes. And not just one group.
A few worth tracking:
- Lore (arXiv 2603.15566): restructures git commit messages into a knowledge protocol so AI can query "why was A chosen over B" from commit history
- Git Context Controller (arXiv 2508.00031): organizes agent memory using git operations (commit, branch, merge), achieved 48% on SWE-Bench-Lite
- Letta Context Repositories (blog): git-backed filesystem as shared agent memory, sub-agents work in isolated worktrees
- DiffMem (GitHub): manages conversational AI memory as a git repo; current state in editable files, history in the commit graph
The common conclusion across all these systems: separating "current state" from "historical record" is the right call, and Git is a natural fit for the latter.
Closing thought
Git was built in 2005 by Linus Torvalds to manage the Linux kernel. Twenty years later it picked up a new use case: helping a perpetually amnesiac AI remember what it did.
Torvalds probably didn't see that one coming.
But "complete history + precise queries + arbitrary restoration": those three properties happen to be exactly what AI agents need most. Maybe that's what good tools do: they solve one problem at design time, but a well-chosen abstraction gets rediscovered in contexts nobody anticipated.