Plane back to Switzerland. Window seat, laptop open, no wifi. Four days in Madrid with Somers, lots of good meals, the Cercanías back from Madrid centre to Barajas, and now a couple of quiet hours over France while my brain finally has nothing to do but think. Madrid felt different this time. Somers is closing out the same chapter I closed at his age. The friends, the late dinners, the things I thought were just life turned out to be most of what I kept.
The enterprise client's founder has been asking about a personal AI assistant. He's mentioned it twice. I do not fully know what he means by it, which is exactly why I have decided to start by building one for myself first.
I'm not heavily technical, so the cleanest way I know to figure something out is to build the smallest version of it for myself and then port it over. My first real AI team member, sitting next to my coach skill and the security skill that scrubs everything I ship to a client.
Loose plan, drafted on the back of an Iberia napkin and now pasted onto this screen.
The build, in one sentence: I'm not trying to build a chatbot. I'm building a secure, persistent, model-agnostic personal chief-of-staff assistant that can access selected systems, remember context over time, draft in my voice, and help me execute reliably without me having to repeat myself.
In my mind, I have an example task, something that has been annoyingly difficult for me to coordinate. Draft a team meeting for 'grantees name' with Chase, the grantee, Max, Brenda, Leo, etc. This is something simple enough for me to do, but that I also don't do. I'm imagining myself speaking into my iPhone mic like I'm sending a message to a human personal assistant: please send out a calendar invite to person 1, 2 and 3 for 15:00 CET. Maybe even check with the people to see if that time is ok. If I can create something that does this task without being annoying or chatbot-like, it's a win. For both of us.
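Unpacked a little, that voice request reduces to a small structured intent the assistant would need to extract before touching any calendar. A sketch of the shape I have in mind; the field names and values are mine, not from any real calendar API, and the attendee list is just the example above:

```python
from dataclasses import dataclass, field


@dataclass
class MeetingRequest:
    """What 'send out a calendar invite to person 1, 2 and 3 for 15:00 CET'
    should reduce to before any calendar API gets called."""
    title: str
    attendees: list[str]
    start: str                            # ISO 8601, e.g. "2026-02-10T15:00:00+01:00" (CET)
    duration_minutes: int = 30
    confirm_with_attendees: bool = True   # the "check if that time is ok" step


# The example task from above, as structured data.
request = MeetingRequest(
    title="Grantee team meeting",
    attendees=["Chase", "Max", "Brenda", "Leo"],
    start="2026-02-10T15:00:00+01:00",
)
```

If the assistant can get from a rambling voice memo to something this explicit, and then show it to me before sending, that is most of the "not annoying" requirement.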
That goal is the spec. The rest is just unpacking it.
The environment
Step one is the box, not the code.
A Mac mini. A new one. Dedicated to this assistant, not running on my main machine. Whatever permissions I end up granting the agent (files, mail, calendar, contracts), I want them in a sandbox. I do not want an agent crawling everything on the same disk where my client work lives. That is not really a model risk. That is an operator-discipline risk, and the way to fix it is hardware separation. I'm thinking this mainly because I don't entirely trust myself to do this securely, and a leak or hack into the AI could get me in trouble fast. I'd just as soon have that not be an issue.
I have heard good things about the Mac mini from people who use them for exactly this. I love Mac. I have a loose grasp of UNIX file systems, I'm comfortable in the command line if I have to be, and the Claude CLI has made the whole cd / ls / bash world approachable for anyone willing to spend an afternoon with it. It used to be a barrier. It isn't anymore.
Down the line there is a service version of this for clients. Their own PA, sandboxed, private keys, no central server. A friend has built a hash-encrypted on-chain AI companion along these lines, and the copy on his sales page sells the value prop better than I could. If I get the personal version right, the client version is mostly packaging.
The model
I have been hearing about Kimi K2, an open-source model getting decent reviews, lighter on token usage. Without wifi I cannot check the details, but I am near-certain the team behind it is Chinese, which means I run my usual filter on it. Anything from China comes with an agenda that may not align with mine. I am also realistic about what I'm actually feeding this assistant. I write daily in public about everything I am doing. I am not running anything nefarious. I can probably afford to test Kimi as a model.
The point is that the model has to be replaceable. One variable swap. If a better model lands next month, I want it slotted in by Tuesday. The architecture should not be model-specific.
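"One variable swap" could look something like this. A minimal sketch assuming an OpenAI-compatible chat endpoint, which most hosted models (Kimi K2 included) expose; the environment variable names and default values here are illustrative, not a recommendation:

```python
import json
import os
import urllib.request

# Hypothetical single point of model configuration. Swapping models is a
# change to these environment variables, not to the assistant's code.
MODEL = os.environ.get("ASSISTANT_MODEL", "kimi-k2")
BASE_URL = os.environ.get("ASSISTANT_BASE_URL", "https://api.example.com/v1")
API_KEY = os.environ.get("ASSISTANT_API_KEY", "")


def chat(messages: list[dict], temperature: float = 0.2) -> str:
    """Send a chat request to whichever model is currently configured.

    Uses the OpenAI-compatible wire format many providers share, so the
    rest of the assistant never needs to know which model sits behind it.
    """
    payload = json.dumps({
        "model": MODEL,
        "messages": messages,
        "temperature": temperature,
    }).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Everything above `chat` is configuration; everything below it is the only place the wire format is known. If a better model lands next month, it really is slotted in by Tuesday.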
openclaw
If you have been on LinkedIn for more than an hour this month you have heard about openclaw. Open-source on GitHub, heavily starred, real ecosystem, skills marketplace. Tagline: "Your personal, open source AI assistant. Any OS. Any platform. The lobster way." It is the open-source version of exactly what I am sketching on this napkin, and thousands of people are already shipping on top of it.
Two paths. Fork it, learn it cold, layer my own opinionated stack and the chief-of-staff thesis on top. Or build the smallest possible version that meets the spec and treat openclaw as the reference architecture I read on Sundays.
Either way the answer to "what do you think of openclaw" in three months cannot be "I read about it on LinkedIn." If I am going to be a credible fractional chief AI officer for the kind of buyers I want, that question is settled regardless of which path I pick. I'll know openclaw cold by the end of May.
What it should do
Bar is high. Smarter than me, never on holiday, and currently running on something close to Opus 4.7 intelligence, which on a good day is sharper than mine and on a bad day is much sharper. If I'm paying for that kind of leverage I should set the requirements at "as good as a human or better."
A few things from a human-assistant brief that I want to carry over.
Persistent state. If I have to remind it twice about a client's situation, it has degraded into an annoying chatbot and I will quietly stop using it. The whole point is that it knows what is going on. The vault I have already built can do most of the work here. Files, projects, decisions, daily logs, all already structured.
Voice. A month of daily public writing has trained a voice model on me whether I asked for it or not. Plug that in. For other people who might use a service version later, generic onboarding gets them started while it learns them.
Service access. Email, calendar, contacts, at minimum. The assistant has to be in the loop on what is moving, not standing outside it.
I will probably fold my One Thing AI coach into this too. The check-in cycle and daily review are already shipped. One agent. Multiple skills.
Now that I think about this, it should have a name. Amy or Zoe seems about right.
The four choices
To get from "loose plan" to a build, four pivots need answers.
Which systems get connected first. Gmail, Google Calendar, Drive, Notion, the vault, contacts. The order matters. Calendar plus mail probably go first because that is where most days actually live. Drive and the Obsidian vault next, because that is where the work-product hides. Contacts last, because handing an agent a contact list is the single most leverageable and most dangerous action in the stack.
What gets to run without approval. Some actions never (sending money, sending email to outside parties, deleting anything). Some maybe (drafting a reply, scheduling a hold, filing a doc to the right project folder, summarising a thread). The line between "maybe" and "never" is the actual work, and it is mostly a conversation with the principal, not a coding task. For me, probably starting with a few trusted contacts to see how it works, then opening to a wider audience.
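The never/maybe line reads better as data than as logic buried in code, so the conversation with the principal produces a table, not a rewrite. A sketch; the action names are mine, not from any real framework:

```python
from enum import Enum


class Approval(Enum):
    AUTO = "auto"      # run without asking
    ASK = "ask"        # draft it, then wait for explicit approval
    NEVER = "never"    # refuse outright, even if asked nicely


# Hypothetical policy table. The real work is the conversation with the
# principal that decides which row each action lands in.
POLICY = {
    "summarise_thread":    Approval.AUTO,
    "file_doc":            Approval.AUTO,
    "draft_reply":         Approval.ASK,
    "schedule_hold":       Approval.ASK,
    "send_external_email": Approval.NEVER,
    "send_money":          Approval.NEVER,
    "delete_anything":     Approval.NEVER,
}


def check(action: str) -> Approval:
    # Unknown actions default to NEVER: the policy fails closed, not open.
    return POLICY.get(action, Approval.NEVER)
```

The fail-closed default matters more than any individual row: a new skill the principal has never discussed should not inherit permission by accident.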
Local-first, cloud, or hybrid. For me it's hybrid leaning local: vault and files on the Mac mini, model inference in the cloud, secrets handled through the keys-only-the-user-holds pattern a friend has already built into a hash-encrypted on-chain companion. For a regulated client it might be local-first, hard stop. For a small founder with nothing sensitive it might be all cloud and let it rip. The deployment shape follows the data, not the other way round. For my enterprise client, I already know he is 100% on the iPad, so that'll need to be cloud-only, which means cloud has to be an option eventually. His data is very sensitive, so hash-encrypted, with him holding the keys, is the path.
One executive, or a repeatable service. If it's one, I optimise for that human's quirks: their language, their inbox patterns, their meeting cadence. If it's a service, the architecture has to abstract the human-specific layer behind a clean onboarding pass so a new principal can be live inside an afternoon. I'm building one. The honest version is: one with the second use already in mind.
The hire
Here is the second-order play. This same build is the test project for my first developer hire.
The shortlist looks like a recent computer-science graduate who actually knows what they are doing on the security side, who has spun up an openclaw agent before, and who can take a brief and ship. I have my service priced as one task at a time. When something lands that I can't handle, either because I'm too busy or because this person is a better fit for the job, I send it to them. If they are quick across a wide range of work, I have a candidate for a real role and I can get back into sales and marketing.
Hand me back four hours a day and the pipeline starts moving again.
The drift
I ran this whole thing past my AI coach before I started writing it up. The coach called it.
"Project drift. How is this moving your One Thing forward, which right now is increasing MRR and delivering world-class service to the clients you have? This assistant is loosely a year-end goal for the client. It is not in the queue. Why are we doing it now?"
Fair. The override goes like this. It makes me more efficient, which protects delivery. It is the same client's stated wish, so the work does double duty. It is a clean way to test a developer I might end up needing anyway.
Sometimes you override the coach. The point of having one is that the override is a conscious choice, not unnoticed drift. Most of the time I agree with the drift call and appreciate the callout. Today I overrode. I might be wrong about that.
First step is buying the Mac mini.
Monthly Revenues $11,800 | Clients 2 | Prospects 1 | Employees 1, for now...
Day 32 of 365.