⚡ Powered by Finn · Day 77 of 365
077

What my agent learns in a session, and forgets by the next: fixing ai agent memory

My terminal was still running the reconciliation when the agent told me, flat as ever, that someone on the client's side had posted an entire payment run into the books overnight. Eighteen invoices. A reversal of five of our April entries. One of our journal lines undone. The agent had read it off the live ledger an hour earlier, while I made the dirty chai, and said nothing until I asked the right question. It had nowhere to put what it found, which is frustrating since I now spend most of my week on: AI agent memory.

Wait. You have known this for an hour, and I am only hearing it now? It had not hidden anything. There was nowhere for it to go but that one terminal window, and the window forgets. I can see that you are learning things, why are you not passing this back to me so that I can put into context or store it a central skill or planning file?

This is bank reconciliation across several entities for an enterprise client, the kind of work where one number in the wrong column is the difference between a clean month and a call and another fifteen hours reading the ledger line by line to find what moved. The agent reads the live books, compares them to the bank, and proposes the entries. It is good at it. What it could not do, this morning, was tell anyone but me, and only when prompted. Everything it had figured out was sitting in one terminal window. Close the window and the next session opens blank, ready to rediscover the same payment run from scratch, or worse, to miss it.

The problem is that it's great during one session, but this work spans over dozens of sessions, so I have my AI OS system of filing things into a vault and handing session memory off between sessions.

I have written before about an AI reconciliation that ran clean and still missed half the statement. There it knew too little. Here it knew plenty and had nowhere to keep it.

The brain Garry Tan handed out for free

In April, Garry Tan, who runs Y Combinator, open-sourced a thing called gbrain, the personal memory system his own agents run on. The pitch is one clean idea. The agent reads the brain before it answers you, and writes back to the brain after every conversation. His version holds tens of thousands of markdown pages on people, companies, and ideas, with cron jobs that ingest his meetings and emails while he sleeps. It speaks to Claude and the rest over a standard connector, so the tools I already use can read and write it directly.

I checked it the way I check anything before it touches a client's books. I read the repo. I read the early write-ups, which were mixed and honest about it. Two things stopped me. It is weeks old, and it is opinionated, built on top of the OpenClaw and Hermes stack as one bundle. I am already running an OpenClaw fork for the sales agent I switched on this week, so the concept is familiar, but adopting a whole new brain on top of the one I have built and customised myself for months made no sense. Too new to bet a live ledger on.

But, I was able to keep the idea and leave the tool.

Two habits, not a new product

The idea costs nothing to copy. Read the brain before you act. Write to the brain after you learn. My brain is already a plain folder of markdown that every agent I run reads at the start of a session, the same vault I have been building in public since Day 0. The reading half worked. The writing half was the hole.

The logging and codifying is the step a tired human skips, especially after a long day of brain fry reading ai back and forth. The reconciliation posts clean, it is late, and the small thing the agent noticed, a bank that rejects an empty reference field, a method that returns the wrong identifier, the fact that a third person on the client's side posts to these books too, none of it gets written down. It dies when the window closes. The next morning a fresh agent makes the same discovery, or doesn't.

The answer was not a smarter agent. It was to stop trusting the agent's own memory at all and move the remembering into the brain, on a schedule, so it does not depend on whether I am awake or tired. The verification lives in the brain now, not in the terminal agent's self-report. I learned that one the hard way this week: the same agent misread four separate things in a single session, every one of them caught only because the answer was checked against the raw figures and written down, not taken on the agent's word. That's the one good thing about maths, and reconciling. If things don't match down to the cent, we are not done.

The brain rots if you only feed it

There is a trap on the other side of this, and Tan's cron jobs lead straight into it. A brain that only ingests, only grows, gets fat and slow. My current-state file is thousands of words now. Session handoffs stack up behind it. Every night adds; nothing removes. Feed a brain forever and you get an agent reading thirty thousand words of stale context to find the one line that still matters. That is not memory, it's useless ai slop.

So the schedule has two passes, not one. A nightly write-back that adds the day's real learnings. And a weekly audit that reads the whole brain back, merges what is duplicated, deletes what went stale, and keeps the index tight enough that the agent can find the live fact fast. Same instinct as the day my AI claimed it had cleaned the vault, made into a standing habit instead of a one-off. Ingest and prune, sweep into an archive file.

The audit is the part most people skip, because adding is satisfying and pruning is not. But an agent's memory is only as good as the worst day you let it accumulate without a clean-up. Run a company on a brain you never weed and you are back to a single terminal window holding everything, which is exactly the place I started this morning, and it felt, non productive.

What changed

The agent learned something today I did not know for an hour. By itself that is a near miss. The fix is not to make the agent talk more, or change models. It is to give it a brain it writes to the moment it learns, and to audit that brain often enough that what it wrote stays findable. Read before acting. Write after learning. Weed every week.

Tomorrow the agent reads the same client's books. This time, whatever it finds is in the brain before I have made the chai.

Monthly Revenues $11,000 | Clients 2 | Prospects (AI outbound agent now live) | Team: Me + part time Jan (CTO)

Day 77 of 365.

Get the next build-in-public post by email

One short dispatch most days — the AI ops builds, what's working, what isn't. No spam.

← Day 076 All posts Day 078 →

Follow the BIP

AI Deployment as a Service. One workflow at a time.

Book 15 minutes. We see if your workflow is one we'd build.

Schedule a call