My AI agent kept forgetting everything. So I gave him a brain
Adding long-term semantic memory to an open-source agent using Pinecone and local embeddings for exactly $0.

Dax is my OpenClaw agent. He runs on my machine, manages my cold outreach, posts on Twitter, builds small websites for clients, and generally does whatever I tell him to. He’s been running for a while now.
For most of that time, he was a goldfish.
Every new session: blank slate. No memory of what we’d built last week. No memory of who I’d emailed. No memory of the rules I’d explained, corrected, and re-explained until I started typing them in ALL CAPS out of desperation.
The em dash thing alone — I had to correct him on this so many times I started keeping count.
If you run agents, you know this feeling. You close a session, you open a new one, you’re explaining something again. Eventually, you write a rules document. Then the rules document is 400 lines. Then you’re reading that 400-line rules document at session start and burning tokens just to get back to baseline.
I did all of that. Then I fixed it.
Flat markdown files (which work, actually)
The boring first step: I made Dax write a markdown file at the end of each session.
Daily logs at ~/.openclaw/workspace/memory/2026-03-26.md: what happened, what was decided, what broke, who he emailed. There is also rules.md: one file, all behavioral rules, no exceptions.
He reads these at session start. It works better than I expected. For a long time, this was the whole system and it was fine.
The crack in it: search. “Did we already contact that guy from the German medtech company?” means reading through weeks of logs. Slow. Token-expensive. Sometimes he’d miss an entry and we’d double-contact someone. The search problem isn’t something you can solve by writing better markdown.
What I wanted was: given this question, find the relevant memories from the past six months. That’s a vector database.
Pinecone + nomic-embed-text running locally
I set up a Pinecone serverless index (768 dimensions, cosine similarity).
The embedding model is nomic-embed-text via Ollama, running on the same machine. I didn’t want agent memory going through a third-party embedding API — it includes references to projects, people, and half-formed strategies. Ollama handles it locally. Data doesn’t leave.
The whole thing is one Python script. No external libraries, just urllib and json from the standard library.
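The script itself isn't shown in the post, but the embedding call is straightforward with just the standard library. A minimal sketch, assuming Ollama's default local endpoint (`/api/embeddings` on port 11434); `build_embed_request` and `embed` are my names, not the script's:

```python
import json
import urllib.request

# Assumed default Ollama endpoint; the post only says nomic-embed-text runs via Ollama.
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_embed_request(text: str) -> urllib.request.Request:
    """Build the POST request for Ollama's local embeddings endpoint."""
    body = json.dumps({"model": "nomic-embed-text", "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )

def embed(text: str) -> list[float]:
    """Return a 768-dim embedding. Requires Ollama running locally."""
    with urllib.request.urlopen(build_embed_request(text)) as resp:
        return json.load(resp)["embedding"]
```

Nothing leaves the machine: the only network hop is to localhost.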
1. Writing a memory
Usage looks like this:
```
python3 ~/.openclaw/skills/dax-memory/scripts/memory.py write \
  "LinkedIn account banned after verification wall" \
  --category event \
  --importance 4
```
Categories are: fact, decision, event, rule, lesson, person, project, credential. Importance is 1-5. That covers basically everything.
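Under the hood, a write presumably embeds the text and upserts one record to the index. Here is a sketch of what that record might look like, assuming Pinecone's standard `id`/`values`/`metadata` upsert shape (`make_record` is a hypothetical helper, not the script's actual code):

```python
import hashlib
import time

CATEGORIES = {"fact", "decision", "event", "rule", "lesson",
              "person", "project", "credential"}

def make_record(text: str, category: str, importance: int,
                vector: list[float]) -> dict:
    """Build one Pinecone upsert record; metadata mirrors the CLI flags."""
    assert category in CATEGORIES
    assert 1 <= importance <= 5
    return {
        # Content-derived id, so writing the same memory twice overwrites
        # rather than duplicating (a design choice, not from the post).
        "id": hashlib.sha256(text.encode("utf-8")).hexdigest()[:16],
        "values": vector,  # 768 floats from nomic-embed-text
        "metadata": {
            "text": text,
            "category": category,
            "importance": importance,
            "ts": int(time.time()),
        },
    }
```

Keeping the raw text in metadata means a query can return readable memories directly, without a second lookup.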
2. Reading at session start
This runs automatically at the start of every session:
```
python3 ~/.openclaw/skills/dax-memory/scripts/memory.py read \
  "current session context rules projects recent events" \
  --top-k 20 --format context
```
It takes about two seconds. Dax gets back the 20 most relevant memories before touching anything else.
3. Bootstrapping existing files
When I first set this up, I had months of logs already. I wasn’t going to write 400 memories by hand, so I built a bootstrap command:
```
python3 ~/.openclaw/skills/dax-memory/scripts/memory.py bootstrap \
  --file ~/.openclaw/workspace/memory/rules.md
```
It reads the file, chunks it, embeds each piece, and upserts to Pinecone. I started with rules.md, did a few daily logs, and ended up with 28 memories in about three minutes.
What actually changed
The double-contact problem went away. Before any outreach, Dax queries memory, gets back relevant history, and knows whether we’ve already talked to someone without reading every log file.
Rules retrieval is better than I expected. I thought semantic search would help with factual recall. Turns out it’s even more useful for rules, which are written in natural language and don’t keyword-match reliably.
“Don’t use em dashes” and “avoid double dashes in output” are the same rule. Semantic search knows that. A grep doesn’t.
The em dash rule is importance 5. It comes back on any writing-related query now. We haven’t had an incident since (well, almost).
The setup today
- Markdown daily logs: still exist as the source of truth for “what happened.”
- rules.md: still the canonical rules file.
- Pinecone: sits on top as the semantic retrieval layer.
- Session start: reads markdown AND queries Pinecone.
- Session end: writes new memories AND appends to the log.
Markdown is the backup. Pinecone is the brain.
Cost: Pinecone free tier covers the first 2GB. nomic-embed-text on Ollama is free. This whole setup costs nothing to run.