The credit limit email
It arrived mid-session. Netlify had hit the free tier credit limit. The website — the one place where harness.os has a public face — was about to go dark.
I’ve been in this situation before with other projects. The usual response is a scramble: log in to the dashboard, figure out the billing, maybe upgrade the plan, maybe start a migration that takes the rest of the afternoon. Infrastructure emergencies have a way of derailing whatever you were actually trying to accomplish.
This time, the response was different. Within the same session that was already running, we deployed to Cloudflare Pages. New URL: harness-os.pages.dev. Both old Netlify sites deleted. Total elapsed time: under two minutes.
The reason this worked so cleanly is that the website is a static site — HTML, CSS, and JS files with no server-side rendering, no build pipeline dependencies, no platform-specific configuration baked into the code. The hosting layer was genuinely interchangeable. The deployment was a git push equivalent, not a migration.
That’s not an accident. It’s a design choice. And it only pays off on the day you actually need to move.
What actually happened in one session
The credit limit email was just the trigger. The session that followed turned into a broader infrastructure improvement pass — the kind of work that doesn’t produce a new feature but makes everything more solid underneath.
Here’s what got done:
| Action | Time | Impact |
|---|---|---|
| Deployed to Cloudflare Pages | ~2 min | New URL live: harness-os.pages.dev |
| Deleted both Netlify sites | ~1 min | Zero vendor lock-in, clean break |
| Restructured repo layout | ~5 min | Flat harness-os-mcp → clean harness-os/{mcp/, website/, blog/, docs/} |
| Configured project-level MCP | ~2 min | Harness MCP as Claude Code project connection |
| Added artifacts column to handoffs | ~1 min | Sessions can now track created files |
| Updated schema_reference | ~1 min | Future agents bootstrap with current schema |
Six actions, roughly twelve minutes of active work, zero downtime. None of these were planned. They emerged from a single trigger — a billing email — and the recognition that while we were touching infrastructure, we should make it better, not just fix the immediate problem.
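One of those rows is worth a concrete sketch: the project-level MCP connection. Claude Code picks up project-scoped servers from a `.mcp.json` file at the repo root; the server name and entry point below are assumptions based on the new layout, but the shape is roughly this:

```json
{
  "mcpServers": {
    "harness-os": {
      "command": "node",
      "args": ["mcp/index.js"]
    }
  }
}
```

Because the file lives in the repo, any future session that opens the project inherits the harness connection without per-machine setup.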
The best infrastructure sessions aren’t planned. They’re triggered by friction, and the quality of the response depends entirely on whether your systems are portable enough to move when they need to.
Folder structure is architecture
The repo rename sounds cosmetic. It wasn’t.
The old structure was harness-os-mcp — a flat directory where the MCP server code sat alongside the website, the blog posts, and whatever documentation existed. The name itself told you what the project used to be: an MCP server. The website was bolted on. The blog was bolted on top of that.
The new structure is harness-os/ with explicit subdirectories:
harness-os/
  mcp/      # MCP server code (tools, handlers, index)
  website/  # Static site (landing page, styles, assets)
  blog/     # Blog posts organized by series
  docs/     # Documentation (future)
This communicates something fundamentally different. The project is a methodology that happens to include an MCP server, a website, a blog, and documentation. The MCP server is one component, not the whole identity.
Why this matters for agents
When a future agent session opens this repo, the folder structure is the first thing it reads. A flat directory with mixed concerns says “figure out what goes where.” A structured directory says “here’s what this project is and where things live.” That’s fewer orientation tokens, faster onboarding, less chance of an agent putting a blog post in the MCP directory.
The restructure was also surprisingly clean. The imports inside mcp/ didn’t change because they were already relative — ./tools/knowledge.js still resolves to the same file whether the parent directory is called harness-os-mcp or harness-os/mcp. The restructure was purely about how the project presents itself, not how it executes.
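For concreteness, this is the kind of import that survived the move untouched; the exported name here is hypothetical, but the path is the one above:

```js
// Resolved relative to the importing file, so the name of the
// top-level directory never enters into it.
import { knowledgeTools } from './tools/knowledge.js';
```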
Folder structure is the cheapest architectural decision with the highest readability payoff. It costs minutes to change and saves orientation time on every future session that touches the repo.
The artifacts gap
Here’s a pattern I kept running into: a session would do excellent work — create files, modify schemas, deploy things — and then the handoff note would say something like “blog post written” without specifying where. The next session would have to search for the file, or worse, assume a location and end up saving a duplicate somewhere else.
This happened because the session_handoffs table had fields for summary, work_completed, and next_steps — all text — but nothing structured for tracking what was actually created. File paths buried in a prose summary are easy to miss and impossible to query.
The fix was a single column: artifacts JSONB on the session_handoffs table.
Now a handoff can include structured data like:
{
"files_created": [
"website/blog/series-02/06-token-economics-at-scale.html"
],
"schema_changes": [
"ALTER TABLE session_handoffs ADD COLUMN artifacts JSONB"
],
"deployments": [
{ "target": "cloudflare-pages", "url": "harness-os.pages.dev" }
]
}
This is a small schema change. One column, one migration. But the continuity improvement is significant. Future sessions don’t have to grep through prose to find what the previous session produced. They can query artifacts->'files_created' and get an exact list.
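As a sketch of what that looks like from the next session's side, here is a minimal read using the Neon serverless driver. The table and column names match the schema described above; the driver choice and the created_at ordering column are assumptions.

```js
// Minimal sketch: pull the previous session's structured artifacts.
// Assumes @neondatabase/serverless; any Postgres client would do.
import { neon } from '@neondatabase/serverless';

const sql = neon(process.env.DATABASE_URL);

// Most recent handoff; the ordering column is assumed to exist.
const [handoff] = await sql`
  SELECT summary, artifacts
  FROM session_handoffs
  ORDER BY created_at DESC
  LIMIT 1
`;

// Exact file list, no grepping through prose.
const filesCreated = handoff?.artifacts?.files_created ?? [];
console.log(filesCreated);
```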
The compound effect
Every structured field you add to a handoff table reduces the ambiguity tax on the next session. Over dozens of sessions, the difference between “blog post written somewhere” and “file created at website/blog/series-02/06-token-economics-at-scale.html” is the difference between a session that starts working immediately and one that spends its first five minutes re-discovering context.
Unstructured handoffs leak information. If a session creates artifacts but doesn’t record their locations in a queryable format, the next session pays the cost of rediscovery — or worse, creates duplicates.
Gateway MCP: the next pattern
While restructuring the project, I kept thinking about a friction point that comes up in every multi-tool session: connection routing.
The current state
Right now, an agent that needs to work with the harness has two MCP connections to manage. There’s the Neon MCP for raw SQL access — querying tables, running migrations, reading schema. And there’s the harness MCP for structured tools — knowledge retrieval, session management, concern-based lookups. Two connections, each with their own capabilities, and the agent has to know which one to use for each operation.
This works fine when I’m the one directing traffic. But when a sub-agent spins up for a delegated task, it needs to be told explicitly: “use Neon for this, harness MCP for that.” That’s manual routing, and it doesn’t scale.
The gateway pattern
The idea is simple: one MCP endpoint that sits in front of everything else. Instead of connecting to multiple MCPs and routing manually, the agent connects to a single gateway and describes what it needs.
+-------------------+
| Gateway MCP |
| |
Agent ------> | discover(context) |
| route(operation) |
| |
+--------+----------+
|
+-------------+-------------+
| | |
+-----+----+ +----+-----+ +------+------+
| Harness | | Neon | | Future |
| MCP | | MCP | | MCPs |
| (tools) | | (SQL) | | (???) |
+----------+ +----------+ +-------------+
The key tool is discover(task_context). An agent says “I’m doing QA on way2fly” and the gateway returns: which tools are relevant, which knowledge chunks to load, which project IDs matter, which MCP to route each operation through. The agent doesn’t need to know the topology of the backend — it just needs to describe its task.
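A minimal sketch of the routing core makes the shape clearer. Every concern name, tool name, and mapping below is an illustrative assumption, not the harness's actual registry; the point is that the lookup lives in the gateway, not in the agent prompt.

```js
// Sketch of the gateway's routing table and discover() logic.
// All names here are hypothetical; a real registry would be loaded
// from the harness database rather than hard-coded.
const routes = {
  qa: {
    mcp: 'harness',
    tools: ['get_knowledge', 'run_tests'],
    concerns: ['testing'],
  },
  deployment: {
    mcp: 'harness',
    tools: ['get_knowledge', 'deploy_site'],
    concerns: ['infrastructure'],
  },
  schema: {
    mcp: 'neon',
    tools: ['run_sql'],
    concerns: ['data-model'],
  },
};

// The agent describes its task in plain language; the gateway answers
// with which backend MCP to use and what to load before starting.
export function discover(taskContext) {
  const text = taskContext.toLowerCase();
  return Object.entries(routes)
    .filter(([keyword]) => text.includes(keyword))
    .map(([keyword, route]) => ({ matched_on: keyword, ...route }));
}

// discover("I'm doing QA on way2fly")
//   → [{ matched_on: 'qa', mcp: 'harness', tools: [...], concerns: ['testing'] }]
```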
What this solves
- Sub-agent bootstrapping. Instead of embedding connection details and routing logic in every agent prompt, you give sub-agents one gateway URL and a task description. They self-configure.
- Knowledge routing. The gateway knows which concerns map to which tools. A QA task gets testing knowledge and the test runner. A deployment task gets infrastructure knowledge and the deploy tools. No manual mapping.
- Future-proofing. When a third MCP appears — say, a CI/CD MCP or a monitoring MCP — the gateway absorbs it. Existing agents don’t change. They still call discover() and get back whatever tools are now relevant.
Open questions
The transport layer matters here. The current harness MCP runs over stdio, which works perfectly for a single Claude Code session but doesn’t support multiple concurrent connections. If two sub-agents need gateway access simultaneously, stdio won’t work — you need HTTP/SSE or a similar protocol.
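For reference, a sketch of what moving off stdio might look like, using the MCP TypeScript SDK's SSE transport behind Express. The exact wiring varies by SDK version (newer releases favor a streamable HTTP transport), so treat the class names, endpoints, and the imported gateway module as assumptions.

```js
// Sketch: serve the gateway over HTTP/SSE instead of stdio so multiple
// sub-agents can connect at once. Verify class names against the
// installed @modelcontextprotocol/sdk version.
import express from 'express';
import { SSEServerTransport } from '@modelcontextprotocol/sdk/server/sse.js';
import { server } from './gateway.js'; // hypothetical: the gateway MCP server instance

const app = express();
const transports = new Map(); // sessionId → transport, one per connected agent

app.get('/sse', async (req, res) => {
  const transport = new SSEServerTransport('/messages', res);
  transports.set(transport.sessionId, transport);
  res.on('close', () => transports.delete(transport.sessionId));
  // Depending on SDK version, each connection may need its own server instance.
  await server.connect(transport);
});

app.post('/messages', async (req, res) => {
  const transport = transports.get(req.query.sessionId);
  if (!transport) return res.status(404).send('unknown session');
  await transport.handlePostMessage(req, res);
});

app.listen(3000);
```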
There’s also the question of whether this should be a single monolithic gateway or a discovery layer that points agents to the right MCP without proxying through. A thin discovery service is simpler but adds a hop. A full gateway is more powerful but becomes a single point of failure.
The gateway pattern isn’t about adding a layer — it’s about removing routing decisions from agent prompts. Every routing instruction embedded in a prompt is a token cost and a maintenance burden. A gateway externalizes that logic into infrastructure where it belongs.
Continuous improvement isn’t a sprint
This session didn’t build a new feature. There’s no new tool, no new page, no new capability that a user would notice. What it did was make the existing infrastructure more portable, more organized, and more self-aware.
The website can now move between hosting providers in minutes instead of hours. The repo structure tells future agents what the project is instead of making them guess. Session handoffs carry structured artifact data instead of burying file paths in prose. And there’s a design pattern on the table for the next major infrastructure improvement.
This is what continuous improvement actually looks like. Not dramatic rewrites. Not weekend hackathons that produce shiny new features. Just steady, unglamorous work that makes the next session slightly better than the last one. Over time, those increments compound. A repo that’s easier to navigate saves minutes per session. A handoff that carries artifact locations prevents duplicate work. A hosting layer that’s vendor-agnostic turns a billing emergency into a two-minute migration.
None of these improvements are exciting in isolation. But the system that results from hundreds of them — that system is qualitatively different from one that was built in a sprint and left to calcify.