MCP Server Reference
The harness.os MCP server is the Scale 2 implementation of the methodology: a Python MCP server that exposes the CNS schema as tools, enabling any MCP-compatible client to act as an inner harness connecting to the outer harness.
Core Principle: Connecting IS Participating
The outer harness enforces behavior on any client that connects. The client does not choose to participate in session tracking, event logging, or guardrails -- connecting IS participating.
This means Claude Code connecting to harness-os MCP gets the same tracking as build.ai's agent pipeline connecting to the same MCP. The outer harness is the same, so the process enforcement is the same. This is the proof that the inner harness is interchangeable.
If tracking lived in client-side hooks (Claude Code settings, Copilot extensions, custom agent config), swapping to a different inner harness would lose tracking. With tracking in the outer harness (MCP server), any client that connects gets it automatically.
Scale 1: Achieving This With Files
Not everyone has MCP. If you are using Copilot, Cursor, or any AI tool that reads project files, you can implement the same harness.os principles with markdown files and conventions. No server, no database.
File structure
```
your-project/
  CLAUDE.md                    # or .copilot-instructions.md, or AGENTS.md
  .claude/rules/               # or .github/copilot-instructions/
    coding-standards.md
    testing.md
    architecture.md
  docs/
    decisions/
      001-state-management.md
      002-database-choice.md
    domain/
      entities.md
      glossary.md
    specs/
      feature-a.md
  CHANGELOG.md                 # manual session log
```
How the four harness types map to files
| Harness Type | File Location | Content |
|---|---|---|
| Build | CLAUDE.md + .claude/rules/ | Coding standards, workflow rules, architecture constraints |
| Product | docs/specs/ + docs/decisions/ | Feature specs, ADRs, roadmap |
| Operations | docs/domain/ | Domain knowledge, process descriptions, terminology |
| Domain | domain/ or app database | Structured domain data — can be YAML/JSON files or a database |
Domain data as files
The domain harness does not require a database. For small teams or solo developers, structured files work well for tracking domain data that would otherwise live in a database.
```
domain/
  time-tracking/
    marco/
      way2fly.yaml             # hours per project per person
      way2move.yaml
    pedro/
      lakedeck.yaml
  project-health/
    way2fly.yaml               # status, blockers, last deploy
    way2move.yaml
  team/
    contributors.yaml          # who works on what, availability
```
Each file follows a defined schema:
```yaml
# domain/time-tracking/marco/way2fly.yaml
project: way2fly
contributor: marco
entries:
  - date: 2026-05-05
    hours: 3.5
    activity: feature/jump-logbook
  - date: 2026-05-06
    hours: 2.0
    activity: bugfix/voice-recording
```
The rules are the same as database-backed domain harnesses: define the schema (what fields, what types), use predictable paths (one file per entity), and let the agent read and write them. Git provides version history and merge conflict handling for multi-contributor scenarios.
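As a concrete illustration, a minimal loader for the schema above might look like the following sketch. It is not part of harness.os: it assumes PyYAML is installed, and the function name and error handling are illustrative.

```python
# Minimal sketch: load and validate a time-tracking file against the
# schema above. Assumes PyYAML (pip install pyyaml); names illustrative.
from pathlib import Path

import yaml

REQUIRED_ENTRY_FIELDS = {"date", "hours", "activity"}

def load_time_tracking(path: Path) -> dict:
    data = yaml.safe_load(path.read_text())
    # Top-level fields defined by the schema
    for field in ("project", "contributor", "entries"):
        if field not in data:
            raise ValueError(f"{path}: missing required field '{field}'")
    # Every entry must carry the same three fields
    for i, entry in enumerate(data["entries"]):
        missing = REQUIRED_ENTRY_FIELDS - entry.keys()
        if missing:
            raise ValueError(f"{path}: entry {i} is missing {sorted(missing)}")
    return data

# Usage: total hours logged on way2fly
data = load_time_tracking(Path("domain/time-tracking/marco/way2fly.yaml"))
print(sum(e["hours"] for e in data["entries"]))
```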
File-based domain data works when last-write-wins is acceptable (solo dev, small team), you don't need real-time queries across entities, and the data volume fits comfortably in a directory tree. Move to a database when you need concurrent writes, aggregation queries, or the data grows past what files handle well.
How to enforce behavior with files
Files alone cannot force behavior the way an MCP server can. But some AI tools provide hooks that get you closer:
What works without MCP
- Instruction files: CLAUDE.md, .cursorrules, .github/copilot-instructions.md — rules the agent reads at session start. Write in imperative voice: "Always use X. Never use Y."
- Decision records: docs/decisions/001-state-management.md — "Decision: Riverpod. Rationale: testability. Alternatives rejected: Bloc." Prevents the AI from re-suggesting rejected alternatives.
- Claude Code hooks (.claude/hooks.json): Shell commands that run before/after specific tool calls. Can validate outputs, log actions, even reject operations. This is the closest you get to guardrails without MCP.
- Claude Code skills (.claude/skills/): Structured instruction files the agent loads on demand. Like lightweight tools — but they're instructions, not enforced interfaces.
- Session log: A manual CHANGELOG.md or SESSION_LOG.md. The manual version of start_session/end_session.
What does NOT work without MCP
- Universal interceptor: Hooks are Claude Code-specific. Cursor, Copilot, Windsurf, and other agents don't have them. MCP works with any agent that supports the protocol.
- Mediated access: Without MCP, the agent reads and writes files directly. It can skip validation, ignore rules, or write malformed data. MCP puts a server between the agent and the data.
- Automatic session tracking: No hook can reliably track every action across an entire session. MCP's log_tool_call interceptor wraps every tool call automatically.
- Cross-agent consistency: Each AI tool has its own hook/instruction system. MCP is the standard — one server works for Claude Code, Cursor, custom agents, or any MCP client.
Always use MCP when you can. The only choice is what storage backend sits behind it — files or database. MCP is the universal adapter layer. Hooks and instruction files are useful supplements, not replacements.
Limitations at Scale 1 (files without MCP)
- No automatic tracking. You must manually log sessions and decisions.
- No compound learning. Insights do not accumulate automatically across sessions.
- No cross-project queries. You cannot ask "what did I learn about testing across all projects?"
- No enforcement. The AI reads the files but nothing prevents it from ignoring them. There is no interceptor.
- No mesh. Each project is isolated. Cross-domain reasoning requires manual context.
- Agent-specific. Claude Code hooks don't work in Cursor. Cursor rules don't work in Copilot. Each tool has its own conventions.
Scale 1 + MCP: File-Backed Tools
Scale 1 has a key weakness: the agent reads files directly. It can ignore rules, forget to log decisions, or write data in the wrong format. There's no interceptor, no guardrails.
The fix: put an MCP server in front of the same files. The agent gets the same tools as Scale 2 (start_session, log_decision, get_rules, search_knowledge) but the backend reads and writes structured files instead of PostgreSQL.
```
Scale 1 (no MCP):
  Agent --reads--> CLAUDE.md, .claude/rules/, docs/decisions/
  (no mediation, no tracking, agent can ignore rules)

Scale 1 + MCP (file-backed tools):
  Agent --calls--> MCP Server --reads/writes--> harness/ folder
  (same tools as Scale 2, guardrails enforce behavior)
  (agent never touches files directly)

Scale 2 (database-backed tools):
  Agent --calls--> MCP Server --queries--> PostgreSQL
  (same tools, same guardrails, full search + mesh)
```
How it works
The MCP server starts with a HARNESS_PATH instead of a DATABASE_URL. It expects a harness directory with a known structure:
```
harness/
  rules/                        # spine_rules equivalent
    coding-standards.md         # YAML frontmatter: slug, triggers, priority
    testing.md
    architecture.md
  workflows/                    # spine_workflows equivalent
    feature-development.yaml    # steps[], triggers[]
    bug-fix.yaml
  knowledge/                    # cortex_chunks equivalent
    dev-workflow/
      git-conventions.md
      ci-pipeline.md
    design/
      color-system.md
      typography.md
  decisions/                    # decisions table equivalent
    001-state-management.yaml
    002-database-choice.yaml
  learnings/                    # learnings table equivalent
    testing-patterns.yaml
    performance-traps.yaml
  sessions/                     # session_handoffs equivalent
    latest-handoff.yaml         # written by end_session
    log/                        # session history
      2026-05-07T14-30.yaml
  domain/                       # domain data — structured files
    time-tracking/
      marco/way2fly.yaml
    project-health/
      way2fly.yaml
```
Same tools, same workflow
The agent calls the exact same tools. The implementation is different — file I/O instead of SQL — but the interface is identical:
| Tool call | File-backed implementation |
|---|---|
| start_session(project) | Reads sessions/latest-handoff.yaml, scans rules/ for matching triggers, returns rules + handoff |
| end_session(summary, ...) | Writes sessions/latest-handoff.yaml, appends to sessions/log/ |
| get_rules(context) | Scans rules/*.md frontmatter for matching triggers[], returns content |
| log_decision(title, rationale) | Appends numbered YAML to decisions/ |
| log_learning(title, content) | Appends YAML to learnings/ |
| search_knowledge(query) | Keyword search across knowledge/**/*.md content |
| get_knowledge_by_domain(domain) | Reads all files in knowledge/{domain}/ |
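As a sketch of the file-backed side, here is what start_session could look like under the assumptions above (PyYAML available, HARNESS_PATH set); the structure is illustrative, not the actual server code. Trigger matching against frontmatter is sketched in the next section.

```python
# Sketch of a file-backed start_session: read the last handoff,
# collect rules, return both.
import os
from pathlib import Path

import yaml

HARNESS = Path(os.environ["HARNESS_PATH"])

def start_session(project: str) -> dict:
    # Last handoff, if a previous session wrote one
    handoff_file = HARNESS / "sessions" / "latest-handoff.yaml"
    handoff = yaml.safe_load(handoff_file.read_text()) if handoff_file.exists() else None
    # Rules are plain markdown files; trigger matching is sketched below
    rules = [p.read_text() for p in sorted((HARNESS / "rules").glob("*.md"))]
    return {"project": project, "handoff": handoff, "rules": rules}
```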
Rule files with frontmatter
Each rule file has YAML frontmatter that the MCP server reads to match triggers:
harness/rules/testing.md:

```markdown
---
slug: testing-standards
name: Testing Standards
triggers: [testing, tdd, unit-test, integration-test]
priority: 10
---
Always write the test first, then the implementation.
Every feature must have tests before it ships.
Never hit real databases in unit tests — use fakes or the emulator.
```
When the agent calls get_rules("testing"), the MCP server scans all files in rules/, matches the frontmatter triggers array, and returns the full content of matching rules — just like the database-backed version queries spine_rules WHERE triggers && ARRAY['testing'].
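A file-backed implementation of that matching could be as small as the following sketch (PyYAML assumed; sorting by priority is an assumption, since the reference defines the field but not the ordering):

```python
# Sketch: scan rules/*.md, parse YAML frontmatter, return rules whose
# triggers[] contain the context.
from pathlib import Path

import yaml

def get_rules(context: str, rules_dir: Path) -> list[dict]:
    matches = []
    for path in rules_dir.glob("*.md"):
        text = path.read_text()
        if not text.startswith("---"):
            continue  # no frontmatter, nothing to match on
        frontmatter, _, body = text[3:].partition("\n---")
        meta = yaml.safe_load(frontmatter)
        if context in meta.get("triggers", []):
            matches.append({"slug": meta["slug"], "content": body.strip(),
                            "priority": meta.get("priority", 0)})
    # Assumption: higher priority sorts first
    return sorted(matches, key=lambda r: -r["priority"])

rules = get_rules("testing", Path("harness/rules"))
```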
Guardrails still work
The critical difference between Scale 1 and Scale 1 + MCP: the agent never touches files directly. Every read and write goes through MCP tools, which means:
- Session tracking: The interceptor logs every tool call to sessions/log/
- Validation: The server can reject malformed writes (missing required fields, wrong data types)
- Guardrails: Post-tool hooks fire the same way — log_decision auto-creates a decision event
- Instructions: The MCP instructions field tells the agent the session lifecycle — start, work, end
When you outgrow files, swap one environment variable: change HARNESS_PATH=/path/to/harness to DATABASE_URL=postgresql://.... The agent's config doesn't change. The tools don't change. The CLAUDE.md doesn't change. The storage backend is the only thing that moves.
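A sketch of that swap, with illustrative backend classes standing in for the real file and SQL implementations:

```python
# Illustrative only: the tool layer stays the same; the backend is
# chosen by whichever environment variable is set.
import os

class FileBackend:
    """Scale 1 + MCP: reads/writes the harness/ directory."""
    def __init__(self, path: str):
        self.path = path

class PostgresBackend:
    """Scale 2: queries the CNS schema over SQL."""
    def __init__(self, url: str):
        self.url = url

def make_backend():
    if url := os.environ.get("DATABASE_URL"):
        return PostgresBackend(url)
    if path := os.environ.get("HARNESS_PATH"):
        return FileBackend(path)
    raise RuntimeError("Set HARNESS_PATH (files) or DATABASE_URL (database)")
```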
Setup
Option A: Harness inside the project
The harness folder lives in the repo. It gets version-controlled with the project. Good when the harness is project-specific.
```bash
# 1. Create the harness directory in your project
mkdir -p harness/{rules,workflows,knowledge,decisions,learnings,sessions/log,domain}

# 2. Add your first rule
cat > harness/rules/coding-standards.md << 'EOF'
---
slug: coding-standards
name: Coding Standards
triggers: [coding, implementation, refactoring]
priority: 10
---
Use TypeScript strict mode. No `any` types.
Prefer composition over inheritance.
EOF

# 3. Configure MCP (per-project .mcp.json)
cat > .mcp.json << 'EOF'
{
  "mcpServers": {
    "harness": {
      "command": "python",
      "args": ["/path/to/harness-os-mcp/server.py"],
      "env": {
        "HARNESS_PATH": "./harness",
        "PROJECT_SLUG": "my-project"
      }
    }
  }
}
EOF
```
Option B: Harness on the machine (shared across projects)
The harness folder lives outside any project — on the machine itself. Every project on this machine connects to the same harness. Good for shared build rules, personal knowledge, or a team's coding standards.
```bash
# 1. Create a machine-wide harness
mkdir -p ~/.harness/build/{rules,workflows,knowledge,decisions,learnings,sessions/log}

# 2. Add shared rules (apply to all projects on this machine)
cat > ~/.harness/build/rules/testing.md << 'EOF'
---
slug: testing-standards
name: Testing Standards
triggers: [testing, tdd, unit-test, integration-test]
priority: 10
---
Always write the test first. TDD is not optional.
Never mock what you don't own.
EOF
```

~/.claude/settings.json (machine-wide config):

```json
{
  "mcpServers": {
    "build-harness": {
      "command": "python",
      "args": ["/opt/harness-os-mcp/server.py"],
      "env": {
        "HARNESS_PATH": "~/.harness/build",
        "HARNESS_ID": "build-harness"
      }
    }
  }
}
```
Now every Claude Code session on this machine gets the build harness automatically. No per-project config. No .mcp.json in each repo. The agent calls start_session and gets your shared rules. Open a new project, the harness is already there.
Combining both
You can run both. Machine-wide harness for shared rules (build standards, testing patterns) plus a per-project harness for project-specific knowledge (domain data, product decisions):
Machine-wide (~/.claude/settings.json):

```json
// Shared build harness — every project gets this
"build-harness": { "env": { "HARNESS_PATH": "~/.harness/build" } }
```

Per-project (.mcp.json):

```json
// Project-specific product harness — only this project
"product-harness": { "env": { "HARNESS_PATH": "./harness", "PROJECT_SLUG": "way2fly" } }
```
The agent sees both harness instances. It gets build rules from the machine-wide harness and product context from the project harness. This is the file-based equivalent of the mesh — two harness instances, each with their own knowledge, connected through the same agent session.
Move to Scale 2 (database) when you have 3+ projects and need cross-project queries, want semantic search over knowledge, or need the mesh to coordinate multiple harness instances. The upgrade is one environment variable: swap HARNESS_PATH for DATABASE_URL.
Scale 2: Database-Backed MCP Server
At Scale 2, the storage backend is PostgreSQL. Same MCP tools, same guardrails, but with full SQL queries, semantic search via pgvector, and cross-project mesh connectivity.
Scale Comparison
| Concern | Scale 1 (files, no MCP) | Scale 1 + MCP (files) | Scale 2 (database) |
|---|---|---|---|
| Agent interface | Reads files directly | MCP tools | MCP tools (same) |
| Session tracking | Manual changelog | Automatic (file-based log) | Automatic (database) |
| Decision logging | Write to docs/decisions/ | log_decision → YAML file | log_decision → SQL row |
| Rules enforcement | AI reads file, may ignore | Interceptor mediates access | Interceptor mediates access |
| Knowledge search | Manual file browsing | Keyword search over files | Semantic search (pgvector) |
| Cross-project | Copy-paste between repos | Shared harness folder | SQL queries across projects |
| Learning accumulation | Manual notes | log_learning → YAML | log_learning + transferability |
| Mesh connectivity | None | None | Full mesh events + transactions |
| Setup required | None | MCP server + directory | MCP server + PostgreSQL |
Future: Scale 3+
Scale 3 (Remote MCP) and Scale 4 (Federated) are designed but not built yet. They add:
- Remote MCP via Streamable HTTP -- same server, accessible over network
- Authentication and authorization -- JWT + RBAC per harness instance
- Multi-tenant isolation -- per-tenant Neon branches
- Federated learning sync -- high-transferability learnings published across meshes
Architecture
```
MCP Client (inner harness)        MCP Server (outer harness)         Neon PostgreSQL
Claude Code / Copilot / API  -->  server.py (Python, asyncio)   -->  branch per harness
                                      |
                                      +-- tools/state.py         (projects, sessions, roadmap)
                                      +-- tools/spine.py         (rules, workflows, prompts)
                                      +-- tools/cortex.py        (knowledge, search, embeddings)
                                      +-- tools/learnings.py     (accumulated insights)
                                      +-- tools/agents.py        (agent registry)
                                      +-- tools/health.py        (harness health checks)
                                      +-- tools/events.py        (mesh event stream)
                                      +-- tools/transactions.py  (cross-harness operations)
                                      +-- tools/mesh.py          (mesh topology, instances)
                                      +-- tools/concerns.py      (cross-cutting concern queries)
                                      +-- tools/tracking.py      (session/event queries)
                                      +-- tools/logging.py       (auto-tracking interceptor)
                                      +-- tools/guardrails.py    (post-tool event hooks)
```
The server is a thin Python layer over standard PostgreSQL. If MCP evolves, the adapter changes -- the schema, data, and knowledge do not.
Enforcement Mechanisms
Four mechanisms make "connecting IS participating" work:
1. Server-side session IDs
The server generates a UUID per MCP connection. The client never provides a session ID -- the server creates it. Every tool call within that connection shares the same session ID.
```python
# server.py -- generated once per connection (one process = one MCP connection in stdio)
import uuid

_connection_session_id: str = str(uuid.uuid4())
```
This means tracking works identically for Claude Code, Copilot, or any MCP client. The client does not need to know about sessions.
2. Tool call interceptor (tools/logging.py)
Every tool handler is wrapped with log_tool_call(). This is the single choke point -- all tool calls flow through it. No tool can bypass it.
The interceptor does three things on every call:
- Auto-creates a session on the first tool call (upserts a claude_sessions row)
- Records the tool event (inserts into claude_session_events with params, duration, status)
- Increments the tool call counter on the session
```python
# server.py -- every handler gets wrapped
HANDLERS = {
    **{t.name: log_tool_call(state.handle, get_session_id=get_session_id) for t in state.TOOLS},
    **{t.name: log_tool_call(spine.handle, get_session_id=get_session_id) for t in spine.TOOLS},
    # ... every module gets the same wrapping
}
```
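The wrapper itself is not shown in this reference; a minimal sketch of the shape it could take, with record_event standing in for the claude_session_events insert:

```python
# Sketch only -- not the actual tools/logging.py.
import functools
import time

async def record_event(session_id, tool_name, params, duration_ms, status):
    ...  # INSERT INTO claude_session_events in the real server

def log_tool_call(handler, *, get_session_id):
    @functools.wraps(handler)
    async def wrapped(tool_name: str, params: dict):
        session_id = get_session_id()  # server-side UUID, never client-supplied
        started = time.monotonic()
        status = "ok"
        try:
            return await handler(tool_name, params)
        except Exception:
            status = "error"
            raise
        finally:
            duration_ms = int((time.monotonic() - started) * 1000)
            await record_event(session_id, tool_name, params, duration_ms, status)
    return wrapped
```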
3. Connection lifecycle (session end)
When the MCP connection closes (stdin EOF), server.py calls end_tracking_session() in a finally block. The session is marked as completed with ended_at timestamp. The client does not need to call anything.
```python
async def main():
    await get_pool()
    try:
        async with stdio_server() as (read_stream, write_stream):
            await server.run(read_stream, write_stream, server.create_initialization_options())
    finally:
        await end_tracking_session(_connection_session_id)
        await close_pool()
```
4. Post-tool guardrails (tools/guardrails.py)
Specific tool calls trigger additional automatic events. For example, calling log_decision automatically emits a decision event. Calling end_session automatically emits a session_end event. These run non-blocking after the tool call succeeds.
```python
_POST_HOOKS: dict[str, str] = {
    "log_decision": "decision",
    "end_session": "session_end",
    "start_session": "session_start",
}
```
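Dispatch over that table could look like the following sketch; it uses the _POST_HOOKS dict above, emit_event stands in for the mesh event insert, and the fire-and-forget task mirrors the non-blocking behavior described above.

```python
# Sketch only -- not the actual tools/guardrails.py.
import asyncio

async def emit_event(event_type: str, payload: dict):
    ...  # INSERT INTO mesh_events in the real server

def run_post_hooks(tool_name: str, params: dict) -> None:
    event_type = _POST_HOOKS.get(tool_name)
    if event_type is None:
        return  # most tools have no post-hook
    # Non-blocking: the event fires after the tool call succeeds,
    # without delaying the tool response
    asyncio.create_task(emit_event(event_type, {"tool": tool_name, "params": params}))
```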
What MCP enables vs what it does not
| Server can enforce | Server cannot enforce |
|---|---|
| Auto-session creation/teardown (no client cooperation needed) | Forcing the client to call specific tools |
| Universal pre/post interception of every tool call | Preventing the client from ignoring tool responses |
| Rejecting tool calls that violate preconditions | Guaranteeing the client reads instructions |
| Identifying the client (name/version from init handshake) | Cross-server coordination |
| Injecting context into every response | -- |
Tool Categories
State Tools (tools/state.py)
Project state management, roadmap tracking, session lifecycle.
| Tool | Description |
|---|---|
| list_projects | List registered projects, optionally filtered by mode (work/life) |
| get_project_state | Current status: phase, summary, in-flight work, blockers |
| update_project_state | Update state; creates if none exists |
| get_roadmap | Ordered roadmap items, optionally filtered by status |
| add_roadmap_item | Add an item to a project's roadmap |
| update_roadmap_item | Update status/notes on a roadmap item |
| start_session | Begin a harness session -- loads last handoff + rules |
| end_session | End session -- persists decisions and handoff summary |
| get_session | Retrieve a specific session by ID |
| list_sessions | List recent sessions for a project |
| get_decisions | Get recent decisions for a project |
| log_decision | Record a decision with rationale |
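From the client side, the session lifecycle is just three of these tools in order. A sketch using the official MCP Python SDK (pip install mcp); paths, slugs, and argument values are placeholders:

```python
# Sketch: drive the session lifecycle from any MCP client.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    server = StdioServerParameters(
        command="python",
        args=["/path/to/harness-os-mcp/server.py"],
        env={"DATABASE_URL": "postgresql://...", "PROJECT_SLUG": "my-project"},
    )
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Begin: loads the last handoff + matching rules
            await session.call_tool("start_session", {"project": "my-project"})
            # Work: record a decision with its rationale
            await session.call_tool("log_decision", {
                "title": "Use Riverpod",
                "rationale": "testability",
            })
            # End: persists the handoff for the next session
            await session.call_tool("end_session", {"summary": "wired up session tracking"})

asyncio.run(main())
```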
Spine Tools (tools/spine.py)
Rules engine, workflow management, prompt library.
| Tool | Description |
|---|---|
| get_rules | Get rules matching a trigger context (e.g., "testing", "deployment"); supports concern filtering |
| get_workflow | Get a workflow by slug or by matching trigger context; supports concern filtering |
| get_prompt | Get a system prompt by slug or purpose |
| add_rule | Create or update a rule with triggers[] and conditions |
| add_workflow | Create or update a workflow with steps JSONB |
| add_prompt | Create or update a system prompt |
Cortex Tools (tools/cortex.py)
Knowledge storage, semantic search, domain discovery.
| Tool | Description |
|---|---|
| list_domains | List all knowledge domains with chunk counts |
| search_knowledge | Semantic search across knowledge chunks (uses pgvector VECTOR(1536)) |
| get_chunk | Retrieve a specific knowledge chunk by ID |
| add_knowledge | Store a knowledge chunk with domain, tags, and optional embedding |
| bulk_insert | Batch insert multiple knowledge chunks |
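Under the hood, search_knowledge pairs an embedding call with a pgvector distance query. A sketch with asyncpg; the embedding function is a placeholder for whichever 1536-dimension model backs the harness:

```python
# Sketch: semantic search over cortex_chunks with pgvector's cosine
# distance operator (<=>).
import asyncpg

async def embed(text: str) -> list[float]:
    raise NotImplementedError  # plug in any 1536-dim embedding model

async def search_knowledge(pool: asyncpg.Pool, query: str, limit: int = 5):
    vector = await embed(query)
    return await pool.fetch(
        """
        SELECT id, domain, content, tags
        FROM cortex_chunks
        ORDER BY embedding <=> $1::vector
        LIMIT $2
        """,
        "[" + ",".join(map(str, vector)) + "]",
        limit,
    )
```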
Learnings Tools (tools/learnings.py)
Accumulated insights with transferability scoring.
| Tool | Description |
|---|---|
| log_learning | Record a learning with category, insight, context, and transferability_score |
| search_learnings | Search learnings by category, domain, or keyword |
| get_transferable_learnings | Get learnings above a transferability threshold (for cross-mesh flow) |
Agent Tools (tools/agents.py)
Agent registry and capability management.
| Tool | Description |
|---|---|
| list_agents | List registered agents with capabilities and status |
| get_agent | Get agent details including implementations and knowledge |
| register_agent | Register a new agent with type, capabilities, and model preference |
Health Tools (tools/health.py)
Harness diagnostics and status.
| Tool | Description |
|---|---|
| harness_health | Overall health check -- table counts, latest activity timestamps |
| schema_info | List all tables and their row counts in the current harness |
Event Tools (tools/events.py)
Mesh event stream for observability.
| Tool | Description |
|---|---|
| emit_event | Emit a mesh event (event_type, payload JSONB) |
| list_events | Query recent mesh events, optionally filtered by type |
Transaction Tools (tools/transactions.py)
Cross-harness operation tracking.
| Tool | Description |
|---|---|
| start_transaction | Begin a multi-step cross-harness transaction |
| add_transaction_step | Record a step in a running transaction |
| complete_transaction | Finalize a transaction with duration and status |
| get_transaction | Retrieve a transaction by ID |
| list_transactions | List recent transactions |
Mesh Tools (tools/mesh.py)
Mesh topology and instance management.
| Tool | Description |
|---|---|
| list_harness_instances | List registered harness instances with types and connection info |
| get_mesh_topology | Get the full mesh topology -- instances, connections, health |
| register_instance | Register a new harness instance in the mesh |
Concern Tools (tools/concerns.py)
Cross-cutting concern queries.
| Tool | Description |
|---|---|
| query_by_concern | Retrieve knowledge, rules, and workflows tagged with a specific concern |
| tag_concern | Add concern tags to existing knowledge items |
Tracking Tools (tools/tracking.py)
Session and event observability -- query what happened across Claude Code and agent sessions.
| Tool | Description |
|---|---|
| get_claude_sessions | List sessions, optionally filtered by project slug or status |
| get_claude_session_detail | Full detail for a specific session including event count |
| get_claude_session_events | Events for a session (tool calls, decisions, file writes) |
| get_claude_activity_summary | Aggregate stats: sessions, events, tool usage over a time range |
| track_artifact | Record an artifact produced during a session |
Logging Interceptor (tools/logging.py)
Not a tool category -- this is the wrapper that makes auto-tracking work. Every tool handler is wrapped with log_tool_call() which provides:
- Auto-session creation on first tool call per connection
- Event recording for every tool invocation (params, duration, status, result count)
- Tool call counting on the session row
- Structured JSON logging to stderr for external observability
Guardrails (tools/guardrails.py)
Post-tool hooks that fire after specific tool calls. These automatically emit semantic events (e.g., log_decision triggers a decision event) without the agent needing to do anything extra.
Database Schema
The CNS schema is the core data model. Every harness instance uses the same tables.
Core Knowledge Tables
```sql
-- Knowledge store (the cortex)
CREATE TABLE cortex_chunks (
  id           UUID PRIMARY KEY,
  domain       TEXT,
  content      TEXT,
  embedding    VECTOR(1536),
  tags         TEXT[],
  project_slug TEXT,
  chunk_type   TEXT,
  concerns     TEXT[] DEFAULT '{}',
  created_at   TIMESTAMPTZ
);

-- Rules engine (the spine)
CREATE TABLE spine_rules (
  id           UUID PRIMARY KEY,
  slug         TEXT UNIQUE,
  content      TEXT,
  triggers     TEXT[],
  project_slug TEXT,
  conditions   JSONB,
  concerns     TEXT[] DEFAULT '{}',
  created_at   TIMESTAMPTZ
);

-- Process workflows (nervous system)
CREATE TABLE spine_workflows (
  id           UUID PRIMARY KEY,
  slug         TEXT UNIQUE,
  steps        JSONB,
  triggers     TEXT[],
  project_slug TEXT,
  status       TEXT,
  concerns     TEXT[] DEFAULT '{}',
  created_at   TIMESTAMPTZ
);

-- Accumulated insights (memory)
CREATE TABLE learnings (
  id                    UUID PRIMARY KEY,
  category              TEXT,
  insight               TEXT,
  context               JSONB,
  domain                TEXT,
  project_slug          TEXT,
  transferability_score NUMERIC(3,2),
  created_at            TIMESTAMPTZ
);
```
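The trigger-matching lookup quoted earlier in this reference runs against spine_rules. As an asyncpg sketch, with connection handling omitted:

```python
# Sketch: the array-overlap rule lookup (triggers && ARRAY['testing'])
# as an asyncpg query.
import asyncpg

async def get_rules(pool: asyncpg.Pool, context: str):
    return await pool.fetch(
        "SELECT slug, content FROM spine_rules WHERE triggers && ARRAY[$1]",
        context,
    )
```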
Mesh Observability Tables
```sql
-- Event stream
CREATE TABLE mesh_events (
  id         UUID PRIMARY KEY,
  event_type TEXT,
  harness_id TEXT,
  payload    JSONB,
  created_at TIMESTAMPTZ
);

-- Cross-harness operations
CREATE TABLE mesh_transactions (
  id                UUID PRIMARY KEY,
  steps             JSONB,
  total_duration_ms INTEGER,
  harness_ids       TEXT[],
  status            TEXT,
  created_at        TIMESTAMPTZ
);
```
Project and Session Tables
```sql
-- Project registry
projects (id, slug, name, mode, status, ...)

-- Project state snapshots
project_states (id, project_slug, summary, phase, in_flight JSONB, blockers JSONB, ...)

-- Session lifecycle
sessions (id, project_slug, phase_id, input_tokens, output_tokens, cost, duration, output_lines JSONB, ...)

-- Decision log
decisions (id, project_slug, decision, rationale, context, ...)

-- Roadmap items
roadmap_items (id, project_slug, title, status, priority, ...)
```
Running the Server
Prerequisites
- Python 3.11+
- PostgreSQL with pgvector extension (Neon recommended)
- A .env file with DATABASE_URL
Setup
```bash
# Install dependencies
pip install -e .

# Create .env from example
cp .env.example .env
# Edit .env with your DATABASE_URL

# Run migrations (if starting fresh)
# The CNS schema tables are created via Neon branch-from-parent
```
Running
```bash
# Start the MCP server (stdio transport)
python server.py
```
The server uses stdio transport -- it reads MCP messages from stdin and writes responses to stdout. It is designed to be spawned by an MCP client (Claude Code, a mesh manager, etc.), not run standalone.
Environment Variables
| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | PostgreSQL connection string (Neon branch URL) |
| PROJECT_SLUG | No | When set, scopes reads to this project (used by product harness instances) |
| HARNESS_ID | No | Identifier for this harness instance (used in mesh events) |
Connecting an Agent to the Harness
How you connect depends on whether your harness is file-based (Scale 1) or MCP-based (Scale 2).
Scale 1: File-based harness
Point the agent's instruction file at the harness folder. The agent reads these files at session start and follows the rules inside them.
CLAUDE.md:

```markdown
# Point to harness rules
@.claude/rules/coding-standards.md
@.claude/rules/testing.md
@.claude/rules/architecture.md

# Point to domain data
Domain data lives in domain/ — read before making changes.
Decisions are logged in docs/decisions/ — check before proposing alternatives.
```
For Cursor, use .cursorrules or .cursor/rules/. For Copilot, use .github/copilot-instructions.md. The pattern is the same — a file the agent reads automatically.
Scale 2: MCP-based harness — per-project
Add MCP server config to your project. Claude Code reads .mcp.json from the project root:
.mcp.json (in project root):

```json
{
  "mcpServers": {
    "harness": {
      "command": "python",
      "args": ["/path/to/harness-os-mcp/server.py"],
      "env": {
        "DATABASE_URL": "env:HARNESS_DB_URL",
        "PROJECT_SLUG": "my-project"
      }
    }
  }
}
```
Each project gets its own .mcp.json with the PROJECT_SLUG that scopes rules, knowledge, and sessions to that project.
Scale 2: MCP-based harness — machine-wide
Install the harness once on the machine and every project gets it automatically. No per-project config needed.
~/.claude/settings.json (machine-level):

```json
{
  "mcpServers": {
    "build-harness": {
      "command": "python",
      "args": ["/opt/harness-os-mcp/server.py"],
      "env": {
        "DATABASE_URL": "env:BUILD_HARNESS_DB",
        "HARNESS_ID": "build-harness"
      }
    }
  }
}
```
Machine-wide config means any Claude Code session on this machine connects to the build harness — no setup per project. This is the recommended approach when one harness serves all your projects (e.g., a build harness with shared coding standards).
Use per-project config when each project has its own harness instance (e.g., a product harness with a project-specific roadmap), and machine-wide config when the harness is shared (e.g., a build harness with coding standards, or a personal harness that spans all work). You can combine both: machine-wide for shared harnesses, per-project for project-specific ones.
Environment variables
Use env: references instead of plaintext credentials. The MCP client reads the actual value from the user's shell environment:
```json
// env: prefix → resolved from shell environment at spawn time
"DATABASE_URL": "env:BUILD_HARNESS_DB"

// Set in ~/.zshrc or ~/.bashrc
// export BUILD_HARNESS_DB="postgresql://..."
```
Connection Management
In a mesh with multiple harness instances, a mesh manager (e.g., harness-mesh.ts) spawns Python MCP server processes on demand:
- Lazy connect: Instance process spawned on first access
- 30s timeout: Connection timeout for unresponsive instances
- 10min idle eviction: Unused connections cleaned up
- Stale retry: Dead client evicted, reconnect attempted once
- Graceful shutdown: Clean shutdown on SIGTERM/SIGINT
These are config choices -- a different config might use persistent connections or different timeouts.
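The actual manager is TypeScript (harness-mesh.ts), but the policy itself is small enough to sketch in Python; spawn_mcp_server is a hypothetical stand-in for spawning a server process and completing the init handshake:

```python
# Sketch of the lazy-connect + idle-eviction policy; timeouts mirror
# the config listed above.
import time

CONNECT_TIMEOUT_S = 30  # connection timeout for unresponsive instances
IDLE_EVICT_S = 600      # 10 min idle eviction

async def spawn_mcp_server(harness_id: str, timeout: float):
    ...  # hypothetical: spawn the Python MCP process, handshake

class InstancePool:
    def __init__(self):
        self._clients: dict[str, tuple[object, float]] = {}  # id -> (client, last_used)

    async def get(self, harness_id: str):
        client, last_used = self._clients.get(harness_id, (None, 0.0))
        if client is None or time.monotonic() - last_used > IDLE_EVICT_S:
            # Lazy connect: spawn on first access (or after eviction)
            client = await spawn_mcp_server(harness_id, timeout=CONNECT_TIMEOUT_S)
        self._clients[harness_id] = (client, time.monotonic())
        return client
```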
Testing
```bash
# Run tests
pytest tests/

# Tests use a dedicated Neon 'test' branch
```