The Three Layers

harness.os is not one thing. It is three things stacked on top of each other: a methodology (universal principles for structuring AI knowledge), a config (a specific application of those principles), and a mesh (a running instance of connected harnesses following a config). Understanding these three layers is the key to understanding everything else.

[Diagram: the three layers. harness.os — the methodology (typed harnesses, session lifecycle, knowledge flow, CNS schema, rules engine, mesh protocol) → harness config — Marco's choices (8-phase dev workflow, TDD + hexagonal, 6-app ecosystem, FF-gated prototypes, per-domain branches, Neon PostgreSQL) → mesh — running instance (hub: build.ai, cortex.ai, marco.ai, way2fly, way2move, way2save).]

The Methodology

Universal principles for structuring AI knowledge. Four harness types, a standard schema, a session lifecycle, and a protocol for knowledge to flow between contexts. Anyone can apply these principles to build anything.

The Config

A specific application of the methodology. Marco's config includes an 8-phase dev workflow, TDD, hexagonal architecture, Neon PostgreSQL, and 6 apps across fitness, skydiving, finance, and SaaS. Configs are portable and forkable.

The Mesh

A running instance of connected harnesses. Marco's mesh: 6 apps, 18 harness instances, 10 Neon branches, all talking through MCP. A cortex.ai tenant gets their own mesh. One config can spawn many meshes.

Technical deep dive: The three layers in practice

1. harness.os = The Methodology

harness.os is a set of universal principles for structuring AI knowledge. It defines a type system of four harness types (build, product, operations, domain), an internal schema for each harness (knowledge tables, learning tables, rules, workflows), a session lifecycle model, and a protocol for flowing knowledge between contexts through a mesh. The CNS schema -- how tables relate, how slugs scope data, how rules fire through triggers -- is the core intellectual property. This layer exists independently of any specific app or implementation.

Architecture principle

The methodology is an interface contract. Any implementation that follows the harness type system, uses the CNS schema (cortex_chunks, spine_rules, spine_workflows, learnings), and respects the session lifecycle is a valid harness.os implementation. The MCP tools, the Neon branches, the Python server -- those are implementation details of one config. The methodology is the specification.
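If the methodology is an interface contract, it can be sketched as a structural type. The following is an illustrative sketch only — `Harness`, `FileHarness`, and the method signatures are assumptions for this example, not the actual MCP tool names:

```python
from dataclasses import dataclass, field
from typing import Protocol

# The contract: four harness types, the CNS tables, and a session lifecycle.
HARNESS_TYPES = {"build", "product", "operations", "domain"}
CNS_TABLES = {"cortex_chunks", "spine_rules", "spine_workflows", "learnings"}

class Harness(Protocol):
    """Any backend (Postgres, DynamoDB, flat files) satisfying this
    structural contract is a valid harness.os implementation."""
    harness_type: str  # one of HARNESS_TYPES

    def start_session(self) -> dict: ...          # load handoff + rules
    def end_session(self, handoff: dict) -> None: ...  # persist decisions

@dataclass
class FileHarness:
    """Tier-1 implementation backed by plain files (illustrative only)."""
    harness_type: str = "build"
    _handoff: dict = field(default_factory=dict)

    def start_session(self) -> dict:
        return dict(self._handoff)

    def end_session(self, handoff: dict) -> None:
        self._handoff = handoff

def is_valid_type(h: Harness) -> bool:
    return h.harness_type in HARNESS_TYPES
```

The point of the sketch: nothing in the contract names PostgreSQL, Neon, or MCP — those live one layer down, in the config.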

2. Harness Config = A Specific Application

A harness config is a concrete set of choices that apply the methodology. Marco's config includes: an 8-phase development workflow, TDD and hexagonal architecture as build standards, FF-gated prototypes, Neon PostgreSQL as the data layer, MCP as the communication protocol, and specific operations harnesses for skydiving, fitness, and finance. Configs are portable and forkable -- someone else could use the same methodology with entirely different choices (different tech stack, different domains, different workflow).

3. Mesh = A Running Instance

A mesh is a running instance of connected harnesses following a config. A mesh relates to its config the way an object instance relates to its class. Marco's personal mesh connects 6 apps (build.ai, marco.ai, cortex.ai, way2fly, way2move, way2save) through 18 harness instances across 10 Neon branches. A cortex.ai Lake Deck mesh would be a different instance: hospitality processes connected through their harnesses. One config can spawn multiple meshes. Apps are on a mesh, not the mesh.

The class:instance pattern

harness.os (methodology) defines the abstract class. marco-config (config) is a concrete class with specific field values. marco-mesh (mesh) is a running instance of that class. lake-deck-mesh would be another instance of a cortex-hospitality-config. Config defines what harnesses exist and how they connect. Mesh is the actual running system with data flowing through it.

Per-user mesh instances

A mesh instance is per user. Each user gets a mesh that connects only the harnesses relevant to their subscriptions. Consider a way2do consumer (way2do is the consumer hub that bundles way2fly, way2move, and way2save):

  • A user subscribed to way2save + way2fly gets a mesh instance connecting their finance domain harness to their skydive domain harness. Their agent can answer "Can I afford this skydive camp?" because the mesh links budget data to skydive scheduling -- for that specific user.
  • They don't see way2move. Their mesh has no fitness harness. The cross-domain reasoning only spans the harnesses in their mesh.
  • If they later add way2move, their mesh instance expands -- now the agent can factor training schedule into the camp decision.

Marco's mesh is the superset: all 6 apps, all 18 harness instances. Each consumer gets a subset. The mesh instance sits inside the configuration (which apps connect), which sits inside the methodology (how harnesses connect).
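The subscription-to-subset logic above can be sketched in a few lines. The app-to-harness mapping and function names are hypothetical; only the app names come from the text:

```python
# Hypothetical mapping from subscribed apps to the harness instances
# their mesh connects (names follow the examples in the text).
APP_HARNESSES = {
    "way2fly":  {"skydive-ops", "skydive-domain"},
    "way2move": {"fitness-ops", "fitness-domain"},
    "way2save": {"finance-ops", "finance-domain"},
}

def mesh_for(subscriptions):
    """A user's mesh instance connects only the harnesses of their apps."""
    harnesses = set()
    for app in subscriptions:
        harnesses |= APP_HARNESSES[app]
    return harnesses

user_mesh = mesh_for({"way2save", "way2fly"})
assert "finance-ops" in user_mesh and "skydive-ops" in user_mesh
assert "fitness-ops" not in user_mesh  # way2move not subscribed

# Adding a subscription expands the mesh instance:
expanded = mesh_for({"way2save", "way2fly", "way2move"})
assert "fitness-ops" in expanded
```

Cross-domain reasoning spans exactly the set returned here — no more, no less.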

Data architecture deep dive: Schema, topology, and state

The three-layer model maps directly to a data architecture:

1. Methodology = Schema Definition

The harness.os methodology defines the schema contract: every harness instance, regardless of type or purpose, uses the same core tables -- cortex_chunks (knowledge + VECTOR(1536) embeddings), spine_rules (triggers[] + conditions), spine_workflows (steps JSONB), learnings (accumulated insights). This universal schema is what makes harnesses composable -- any MCP tool works against any harness instance because the tables are identical.

2. Config = Branch Topology + Isolation Strategy

Marco's config specifies how the schema maps to physical databases: 10 Neon branches, slug-filtered vs. branch-isolated harnesses, which instances share branches (product-shared) vs. which get dedicated branches (skydive-harness). A different config might use MySQL, or DynamoDB, or even flat files -- as long as the harness schema contract is honored.

3. Mesh = Live Data Topology

The mesh is the running data system: 18 harness instances connected via MCP, with queries flowing between branches, learnings accumulating in each instance, and mesh_transactions tracking cross-harness operations. The mesh is where data lives and flows. A different mesh (e.g., Lake Deck) would have different data but the same schema and query patterns.

Data architecture insight

The three layers create a clean separation: schema (methodology, universal), topology (config, per-deployment), and state (mesh, per-instance). This means you can reason about query patterns at the methodology level, optimize storage at the config level, and monitor performance at the mesh level -- independently.

Strategic deep dive: Three levels of business value

The three layers represent three levels of business value:

1. Methodology = The Playbook (Intellectual Property)

The harness.os methodology is the core IP. It defines how to organize AI knowledge into four composable layers that anyone can apply to build any product. This is what you teach, license, or open-source. It is technology-agnostic and domain-agnostic.

2. Config = Your Strategy (Specific Choices)

Marco's config is a specific business strategy expressed through the methodology: 6 apps across fitness, movement, finance, personal management, B2B SaaS, and developer tools. A different founder applying the same methodology would build a different config for a different market. Configs are the "business plan" layer -- portable and forkable.

3. Mesh = Your Running Business (Live System)

The mesh is the actual deployed system with real users, real data, and real revenue. Marco's mesh serves his products. Lake Deck's mesh (a cortex.ai tenant) serves their hospitality operations. Each mesh is independent but can share learnings with other meshes through the methodology's knowledge flow patterns.

At a glance: 1 methodology · 1 active config · 6 apps on mesh · 18 harness instances

Strategic framing

The methodology creates products (configs). The products create partnerships (meshes). The partnerships create ecosystems. A cortex.ai tenant is someone applying a subset of the methodology (operations + domain config) to their own mesh. They don't need to understand build harnesses or product harnesses -- they only see the config relevant to their use case.

📖

Think of it like cooking. The methodology is the science of cooking itself -- how flavors combine, how heat changes ingredients, what makes a dish balanced. The config is a specific cookbook -- Italian, Japanese, or fusion -- that applies those principles to create recipes. The mesh is an actual restaurant using that cookbook, with real kitchens, real chefs, and real diners.

The Three Layers, Simply

The Methodology = The Playbook

Universal rules for organizing AI knowledge. "Knowledge should be separated into four types. Types should be composable. Knowledge should flow between connected systems." This works for any industry, any technology, any product.

The Config = Your Version

Marco's specific choices: 6 apps (skydiving, fitness, finance, personal, business, developer tools), an 8-step development process, and specific tools (Neon database, MCP protocol). Someone else would make different choices for their industry.

The Mesh = The Running System

The actual live system with all the apps connected and talking to each other. When you ask "Can I afford skydive camp?", it's the mesh that routes the question to the right apps and combines the answers.

The power of this separation: anyone can take the playbook, write their own version, and run their own system. A hotel chain could take the same playbook and create a hospitality version with staff training, compliance, and scheduling -- a completely different system, built on the same principles.

And it works at any budget. A student can start with just text files in their project folder (free). A freelancer can add a database ($20/month). A team can go remote ($200/month). An enterprise can run a federated mesh ($2K+/month). The playbook is the same at every level -- only the tools change.

The Methodology

harness.os is not a product. It is a set of principles for organizing AI knowledge that anyone can apply at any scale. The principles: knowledge should be typed, structured with defined internal organization, and should flow between contexts through a mesh. These principles are universal -- they work for skydiving, for hospitality, for manufacturing, for any domain. And they work at every scale -- from a solo developer with CLAUDE.md files ($0) to a federated enterprise mesh ($2K+/mo). The methodology stays the same. Only the implementation changes.

Technical deep dive: The five core principles

The harness.os methodology defines five core principles that any implementation must follow:

Principle 1: Typed Harnesses

All knowledge falls into exactly four types: build (HOW to create), product (WHY + WHAT — discovery, validation, specification, measurement), operations (HOW the domain works), and domain (WHO — per-user data). Every harness instance has exactly one base type. Products are compositions of these types.

At any scale: A solo dev with files has one build harness (CLAUDE.md) and one product harness (docs/). A team has a shared build harness (coding standards DB) + per-project product harnesses. An enterprise adds operations harnesses per department and domain harnesses per user. The four types hold at every tier.
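The "products are compositions" claim can be made concrete. The compositions below are the ones the text itself names (developer platform, B2B SaaS, full OS); the function is an illustrative sketch:

```python
# Sketch: products as compositions of the four base harness types.
TYPES = frozenset({"build", "product", "operations", "domain"})

def describe(composition):
    """Name the product shape a composition of types yields."""
    assert composition <= TYPES, "only the four base types exist"
    if composition == {"build", "product"}:
        return "developer platform"
    if composition == {"operations", "domain"}:
        return "B2B SaaS tool"
    if composition == TYPES:
        return "full operating system"
    return "custom composition"

assert describe({"build", "product"}) == "developer platform"
assert describe({"operations", "domain"}) == "B2B SaaS tool"
assert describe(set(TYPES)) == "full operating system"
```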

Principle 2: Internal Structure (The CNS Schema)

Every harness instance, regardless of type, contains the same internal structure:

Table            | Purpose              | Key Columns
cortex_chunks    | Knowledge store      | domain, content, embedding VECTOR(1536), tags[], project_slug
spine_rules      | Rules engine         | slug, content, triggers[], project_slug, conditions
spine_workflows  | Process workflows    | slug, steps JSONB, triggers[], project_slug
learnings        | Accumulated insights | category, insight, context, domain, transferability_score

The "CNS" metaphor: cortex_chunks is the brain (knowledge storage), spine_rules is the spine (structural rules that trigger actions), spine_workflows is the nervous system (multi-step processes), and learnings is the memory (accumulated experience).

Principle 3: Session Lifecycle

Every interaction with a harness follows a lifecycle: start_session() loads the handoff from the previous session plus accumulated rules, the agent works within the harness context, and end_session() persists decisions and learnings back to the harness. This lifecycle is what makes knowledge compound over time.

At any scale: At Tier 1 (files), the "session lifecycle" is reading CLAUDE.md at start and updating docs before you close the editor. At Tier 2 (DB), it's explicit start_session()/end_session() MCP calls. At Tier 4 (enterprise), it's automated with audit trails. Same principle, different implementation.
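The lifecycle above can be sketched as a small state machine. Class and field names here are illustrative stand-ins, not the actual MCP tool signatures:

```python
# Illustrative sketch of the session lifecycle: load handoff + rules,
# work, then persist decisions and learnings back to the harness.
class Session:
    def __init__(self, harness_store):
        self.store = harness_store  # dict standing in for the CNS tables

    def start_session(self):
        # Load the previous session's handoff plus accumulated rules.
        return {
            "handoff": self.store.get("handoff", {}),
            "rules": list(self.store.get("spine_rules", [])),
        }

    def end_session(self, decisions, learnings):
        # Persist back, so the next session never starts from scratch.
        self.store["handoff"] = decisions
        self.store.setdefault("learnings", []).extend(learnings)

store = {"spine_rules": ["write tests first"]}
first = Session(store)
ctx = first.start_session()  # empty handoff, rules loaded
first.end_session({"chose": "hexagonal architecture"},
                  ["ports keep adapters swappable"])
ctx2 = Session(store).start_session()  # next session inherits the handoff
```

The compounding effect falls out of the loop: every `end_session` enlarges the context the next `start_session` loads.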

Principle 4: Slug Scoping

The project_slug field on every table is the scoping mechanism. Multiple harness instances can share a physical database by filtering on slug. This enables both branch-level isolation (each instance gets its own database) and slug-level isolation (instances share a database but see only their own data). The methodology supports both -- the config decides which to use.

At any scale: At Tier 1, scoping is folder structure (project-a/CLAUDE.md vs project-b/CLAUDE.md). At Tier 2, it's slug-filtered rows in shared tables. At Tier 3+, it can be separate databases per tenant. The scoping principle adapts to the storage layer.
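Slug-level isolation is just a filter predicate. A minimal sketch, with plain dicts standing in for `cortex_chunks` rows (slugs here are illustrative):

```python
# Many harness instances share one table; each sees only rows
# matching its project_slug.
cortex_chunks = [
    {"project_slug": "w2f-product", "content": "skydive camp spec"},
    {"project_slug": "w2s-product", "content": "budget category model"},
    {"project_slug": "w2f-product", "content": "progression UI notes"},
]

def chunks_for(slug):
    """Equivalent of: SELECT * FROM cortex_chunks WHERE project_slug = $1"""
    return [row for row in cortex_chunks if row["project_slug"] == slug]

assert len(chunks_for("w2f-product")) == 2
assert all(r["project_slug"] == "w2s-product"
           for r in chunks_for("w2s-product"))
```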

Principle 5: Mesh Communication

Harnesses communicate through a defined protocol (in the current config, MCP). Cross-harness queries are logged in mesh_transactions with step tracking. Learnings with high transferability_score can flow between harnesses. The mesh protocol is what makes cross-domain reasoning possible.

At any scale: At Tier 1, the "mesh" is a developer manually copying a learning from one project's docs to another. At Tier 2, it's MCP servers connecting harness databases locally. At Tier 3, it's remote MCP with auth. At Tier 4, it's federated mesh with cross-organization knowledge flow. The principle of knowledge flowing between contexts is the same -- the mechanism scales.
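A cross-harness query with step tracking might look like the following sketch. The function, payloads, and the affordability example are assumptions mirroring the `mesh_transactions (steps JSONB, total_duration_ms)` shape described in the text:

```python
import time

def run_mesh_query(steps):
    """Execute each (harness, fn) step and record it, mimicking the
    mesh_transactions step-tracking shape."""
    tx = {"steps": [], "total_duration_ms": 0.0}
    start = time.perf_counter()
    for harness, fn in steps:
        t0 = time.perf_counter()
        result = fn()
        tx["steps"].append({
            "harness": harness,
            "result": result,
            "duration_ms": (time.perf_counter() - t0) * 1000,
        })
    tx["total_duration_ms"] = (time.perf_counter() - start) * 1000
    return tx

# "Can I afford this skydive camp?" crosses two harnesses:
tx = run_mesh_query([
    ("finance-domain", lambda: {"budget_left": 800}),
    ("skydive-ops",    lambda: {"camp_cost": 650}),
])
affordable = (tx["steps"][0]["result"]["budget_left"]
              >= tx["steps"][1]["result"]["camp_cost"])
```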

Why this matters

These five principles are implementation-agnostic. You could implement harness.os with PostgreSQL + MCP (as the current config does), or with MongoDB + REST APIs, or with DynamoDB + gRPC. The methodology defines the contract. The config implements it. The mesh runs it.

Data deep dive: The universal schema contract

The methodology defines a data contract that every harness implementation must honor:

The Universal Schema

CNS Schema Contract
-- Every harness instance has these tables:
cortex_chunks  (knowledge + embeddings)
  domain TEXT, content TEXT, embedding VECTOR(1536),
  tags TEXT[], project_slug TEXT, chunk_type TEXT

spine_rules  (trigger-based rules)
  slug TEXT UNIQUE, content TEXT, triggers TEXT[],
  project_slug TEXT, conditions JSONB

spine_workflows  (multi-step procedures)
  slug TEXT UNIQUE, steps JSONB, triggers TEXT[],
  project_slug TEXT, status TEXT

learnings  (accumulated insights)
  category TEXT, insight TEXT, context JSONB,
  domain TEXT, project_slug TEXT,
  transferability_score NUMERIC(3,2)

-- Mesh observability:
mesh_events       (event_type, harness_id, payload JSONB)
mesh_transactions (steps JSONB, total_duration_ms)

Scoping Patterns

The methodology defines two isolation strategies that configs can mix:

  • Branch isolation: Each harness instance gets its own physical database/branch. Maximum isolation. Higher infrastructure cost.
  • Slug isolation: Multiple harness instances share a database, scoped by WHERE project_slug = $N. Lower cost. Application-layer isolation.
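A config can mix both strategies per instance. A hypothetical sketch of how a config might resolve an instance to its physical location (branch names are illustrative):

```python
# Each harness instance maps to either its own branch (branch
# isolation) or a shared branch plus a slug filter (slug isolation).
TOPOLOGY = {
    "skydive-harness": {"branch": "skydive-branch"},   # branch isolation
    "w2f-product": {"branch": "product-shared", "slug": "w2f-product"},
    "w2s-product": {"branch": "product-shared", "slug": "w2s-product"},
}

def connection_for(instance):
    """Resolve where an instance's data lives and how it is scoped."""
    entry = TOPOLOGY[instance]
    scope = (f"WHERE project_slug = '{entry['slug']}'"
             if "slug" in entry else "")
    return entry["branch"], scope

branch, scope = connection_for("w2f-product")
assert branch == "product-shared" and "w2f-product" in scope
assert connection_for("skydive-harness") == ("skydive-branch", "")
```

The schema contract is identical either way; only the routing differs.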
Schema inheritance

When a new harness instance is provisioned (e.g., Neon branch-from-parent), it inherits the full schema automatically. Zero DDL needed. The schema IS the methodology -- tables are identical everywhere, and that's what makes any MCP tool work against any harness instance. This is intentional: the data contract is the interface contract.

Strategic deep dive: The IP and licensing model

The methodology is the intellectual property layer -- the principles that make the entire platform work. Here is what it defines:

Four Knowledge Types

All organizational knowledge falls into exactly four categories: how to create (build), what to build (product), how to run domain operations (operations), and actual user data (domain). This is universal -- it works for software companies, hotels, factories, or any other organization.

Composability

Any combination of the four types creates a valid product. Build + Product = a developer platform. Operations + Domain = a B2B SaaS tool. All four = a full operating system. This means new products are new combinations, not new platforms.

Compound Learning

Every interaction with the system generates learnings that persist in the harness. New agents, new products, and new users inherit accumulated knowledge. The system gets smarter with every use -- and learnings that generalize (high transferability) flow across domains automatically.
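The "learnings that generalize flow across domains" mechanism can be sketched as a threshold gate on `transferability_score`. The threshold value and function name are assumptions for illustration:

```python
# Learnings flow between harnesses when their transferability_score
# clears a (hypothetical) threshold.
TRANSFER_THRESHOLD = 0.8

source_learnings = [
    {"insight": "progressive overload applies to skill drills",
     "transferability_score": 0.92},
    {"insight": "this dropzone closes on Mondays",
     "transferability_score": 0.10},
]

def flow(learnings, target):
    """Copy high-transferability learnings into the target harness."""
    moved = [l for l in learnings
             if l["transferability_score"] >= TRANSFER_THRESHOLD]
    target.extend(moved)
    return moved

hospitality_learnings = []
moved = flow(source_learnings, hospitality_learnings)
assert len(moved) == 1  # only the generalizable insight crosses over
```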

Business implication

The methodology is what you teach, license, or open-source. A partner applying the methodology to their industry creates their own config and mesh -- but the principles are shared. This is the franchise model: the methodology is the franchise playbook, configs are franchise locations, and meshes are the running businesses.

🧠

The methodology is like the rules of language. Every language has nouns, verbs, adjectives, and sentences. Those rules are universal -- they work for English, Spanish, Mandarin, or any language. The harness.os methodology works the same way: four types of knowledge, rules for how they connect, and a way for them to learn over time. The specific language you speak (your config) is up to you.

The Five Rules

  1. Sort knowledge into four types: Recipes (how to create things), blueprints (what to build), playbooks (how to run domain operations), and records (your data).
  2. Give each type its own filing cabinet: Keep them separate so they don't get mixed up, but organized so they can be found quickly.
  3. Start each conversation with context: When you talk to the system, it loads what it learned from last time. It never starts from scratch.
  4. Label everything: Each piece of knowledge has a tag that says which project, which domain, and which type it belongs to. This is how the system knows what to show whom.
  5. Let knowledge flow: When something learned in one app is useful in another, it flows automatically. A training insight from your fitness app can help a hotel's staff training system.

Harness Types

The methodology defines four knowledge categories. Your config decides how many instances of each type exist. Your mesh connects them for a specific user. All four types talk to each other through the mesh — build knowledge flows into products, products reference operations, operations govern domain data.

[Diagram: three nested layers. METHODOLOGY (harness.os, universal): four types — build (HOW to create, recipes), product (WHY + WHAT, blueprints), operations (HOW to run a domain, playbooks), domain (per-user data, records) — with cross-cutting concerns (relational · governance · causal/WHY · metacognitive · security/risk) spanning all four via concerns[] on knowledge tables; works at every scale: files ($0) → DB+MCP ($20) → remote ($200) → federated ($2K+). CONFIG (marco-config, one implementation): 1 build + 4 product (w2f, w2m, w2s, cortex) + 3 operations (skydive, fitness, finance) + N per-user domain harnesses scoped by user_id = 18 harness instances on Neon (Tier 2); a different config (e.g., a hotel chain) would have different instances but the same four types. MESH INSTANCE (per user, running connections): Marco's mesh connects all harnesses; a way2do user with only way2save + way2fly gets finance-ops ↔ skydive-ops ↔ their domains ("Can I afford this camp?"). Each user gets a mesh subset matching their subscriptions; the mesh is the per-user runtime.]

Architecture deep dive: Four harness types as components

The methodology defines four harness types. Each type has the same internal schema (CNS tables) but different content and access patterns. In any config, each type maps to one or more harness instances.

Build Harness

HOW to create things

Creation knowledge — not just software. Covers software development (dev workflow, CI/CD, architecture patterns, testing strategies) AND content creation (blog, marketing, design systems). Shared across all projects in a mesh. Through the mesh, build harness knowledge flows into every product: the same coding standards apply whether you're building a finance app or a skydiving app.

Tier 1: CLAUDE.md + .claude/rules/  |  Tier 2+: cortex_chunks + spine_rules in build-harness DB

Product Harness

WHY + WHAT

The full product lifecycle: discovery (validate ideas, hypotheses, experiments, A/B tests), specification (architecture decisions, roadmaps, specs, feature definitions), and measurement (metrics, impact tracking, user feedback). Each product in the mesh gets its own product harness (w2f-product, w2s-product, etc.) with a lifecycle phase: discovery → building → maintaining → archived. Through the mesh, a product harness pulls build knowledge (how to create) and pushes requirements to operations harnesses (how to run).

Tier 1: docs/ folder per project  |  Tier 2+: per-product cortex_chunks + spine_workflows

Operations Harness

HOW to run a domain

Domain-specific operational knowledge: skydive progression rules, fitness programming, finance budgets, hospitality procedures. The only type that changes completely by industry. Through the mesh, operations harnesses connect to domain harnesses (the data they operate on) AND to build harnesses (the tools that maintain them) AND to product harnesses (the apps that expose them).

Tier 1: domain-specific docs  |  Tier 2+: spine_rules with triggers[] + spine_workflows

Domain Harness

WHO — per-user data

User-scoped transactional data: jump logs, workout records, financial transactions. Scoped by user_id. Through the mesh, domain harnesses connect to operations harnesses (the rules that govern them) and to other domain harnesses — enabling cross-domain queries like "Can I afford this skydive camp?" which crosses finance domain ↔ skydive domain for a specific user's mesh instance.

Tier 1: app data (Firebase, local DB)  |  Tier 2+: fed_* tables + learnings per user

All four types connect through the mesh

The mesh is not a layer between two types — it connects all four. marco.ai talks to build harness (for dev standards), product harnesses (for app specs), operations harnesses (for life management rules), and domain harnesses (for personal data). A way2do consumer with way2save + way2fly gets a mesh connecting just those products' operations and domain harnesses. The type system defines what knowledge is. The mesh defines how it flows. The config decides which types exist in your mesh. The methodology makes all of this work at any scale.

Data deep dive: Per-type data topology

Each harness type has the same schema but different data access patterns:

Type       | Primary Table Use                                                                                          | Typical Query Pattern                                                       | Data Volume
Build      | spine_rules (coding standards), cortex_chunks (patterns)                                                   | Read-heavy: agents query patterns before generating code                    | Low (hundreds of rows)
Product    | cortex_chunks (specs, ADRs, hypotheses), spine_workflows (discovery + delivery phases), learnings (validation results) | Read-write: agents query specs, log experiment results, track metrics | Low-medium per product (grows with validation data)
Operations | spine_rules (triggers[]), spine_workflows (operational procedures)                                         | Trigger-driven: rules fire based on events (session_complete, threshold_met) | Medium (rules + accumulated learnings)
Domain     | fed_* tables (user data), learnings (personal insights)                                                    | OLTP: reads and writes of user data scoped by user_id                       | High (grows with user activity)

Isolation strategy per type

The methodology recommends: build and product harnesses tolerate slug-based isolation (shared branches). Operations harnesses benefit from branch isolation (complex trigger logic, domain-specific rules). Domain harnesses require the strongest isolation (user data, privacy). The config decides where on this spectrum each instance falls.

Strategic deep dive: Four types as product building blocks

Understanding the four types is the key to understanding every product in the ecosystem -- regardless of config.

📖

Build Harness

How to build things

Institutional memory of how software gets built. Shared across every engineering project in any config that includes software development.

🏗

Product Harness

WHY + WHAT

The full product lifecycle: discovery and validation (why to build), specification (what to build), and measurement (did it work). Each product has a lifecycle phase: discovery → building → maintaining → archived.

📋

Operations Harness

How to run domain operations

Domain-specific operational knowledge. The brain that makes each domain smart about its specific operations. This is the type that cortex.ai packages for B2B.

💾

Domain Harness

WHO — user data

Per-user, per-app operational data. Each user's experience is personal because their domain harness contains their specific data.

Strategic advantage

The four-type model means any product can be defined as a composition of types. Build + Product = developer platform. Operations + Domain = B2B SaaS. All four = full operating system. New products don't require new platforms -- they require new compositions. This is true in any config, not just Marco's.

📚

Everything in the system is built from four types of knowledge:

🍳

Recipes

Build Harness

How to make things. Like a cookbook that grows over time -- every time a software project is built, the recipes get better. "Use this pattern for user interfaces." "Test this way."

📐

Blueprints

Product Harness

WHY and WHAT for each app. Like a product lab: first you test if an idea works (discovery), then you draw the plans (specification), then you track if it's working (measurement). Each app goes through phases: discovery → building → maintaining → archived.

📓

Playbooks

Operations Harness

How to run each domain. A skydiving playbook knows progression rules and safety checklists. A fitness playbook knows exercise programming. A finance playbook knows budget categories.

📁

Records

Domain Harness

Your actual data. Jump logs, workout history, bank transactions. This is personal -- it belongs to you and is what makes the apps useful for your specific life.

Each type of knowledge is stored separately and securely, but the system can look across all of them when you ask a question. That's the magic -- one question can pull answers from multiple types of knowledge at once.

Type System: Cross-Cutting Concerns

The four harness types (build, product, operations, domain) have held up across 18 harness instances and 6 apps. Stress-testing them against knowledge management frameworks (Porter, TOGAF, Intellectual Capital, Zack's taxonomy) revealed knowledge that doesn't fit neatly into one type. The resolution: five cross-cutting concerns that span all four types. These are not new types -- they are lenses that apply everywhere, implemented as a concerns TEXT[] column on knowledge tables.

What fits cleanly today

Knowledge Example                                                        | Type       | Why It Fits
TDD workflow, CI/CD pipeline, content creation process                   | Build      | How to create things — recipes for making digital things
App roadmap, feature specs, architecture decisions                       | Product    | What to build — blueprints for each product
Skydive progression rules, hospitality onboarding, fitness programming   | Operations | How to run domain ops — playbooks for each industry
User jump logs, workout records, financial transactions                  | Domain     | Per-user data — personal records that make apps useful

The five cross-cutting concerns

Each concern cuts across all four harness types. A single knowledge chunk, rule, or workflow can be tagged with one or more concerns via a concerns TEXT[] column on cortex_chunks, spine_rules, and spine_workflows. Agents query by concern to assemble cross-cutting context regardless of which harness type owns the data.

Relational / Ecosystem

Stakeholders, partners, dependencies, network knowledge. Who are the upstream and downstream entities? How do relationships affect outcomes? Partner ecosystem health, supplier networks, community dynamics, coach-student relationships, DZ partnerships, cortex.ai tenant relationships.

Spans: Build (open-source dependencies), Product (partner integrations), Operations (vendor relationships), Domain (user relationship data).

Governance

Compliance, policies, access control, audit trails. Who approves what? What requires review? What's regulatorily mandated? Governance governs all four types -- it is a layer above, not alongside them.

Spans: Build (CI/CD approval flows), Product (decision authority), Operations (compliance procedures), Domain (data access policies).

Causal / WHY

Root cause analysis, decision rationale, failure chains. While Product now includes discovery and validation (the WHY of what to build), this concern captures deeper causal reasoning across ALL types. Why a market shifted, why a regulation exists, why users churn, why an architecture was chosen, why a process failed.

Spans: Build (architecture decision rationale), Product (strategic reasoning), Operations (failure post-mortems), Domain (user behavior patterns).

Metacognitive

Learning-about-learning, process effectiveness, knowledge quality. How the system improves its own knowledge practices. Which curation approaches work? Which learnings transfer well? The transferability_score is already a metacognitive signal.

Spans: Build (which dev practices improve quality), Product (which specs lead to good outcomes), Operations (which rules fire effectively), Domain (which data improves predictions).

Security / Risk

Threat models, vulnerability tracking, risk assessments. Applies to all four types equally -- secure Build practices, Product security requirements, Operations compliance, Domain data protection.

Spans: Build (secure coding rules), Product (security architecture), Operations (compliance risk), Domain (data protection, PII handling).

Implementation: concerns as tags, not types

Cross-cutting concerns are implemented as a concerns TEXT[] column on cortex_chunks, spine_rules, and spine_workflows. A chunk tagged concerns = ['governance', 'security'] is discoverable by any agent querying either concern, regardless of which harness type stores it. This avoids the combinatorial explosion of creating separate harness types for each concern while ensuring nothing falls through the cracks.

Schema Addition
-- Added to existing CNS knowledge tables:
ALTER TABLE cortex_chunks ADD COLUMN concerns TEXT[] DEFAULT '{}';
ALTER TABLE spine_rules ADD COLUMN concerns TEXT[] DEFAULT '{}';
ALTER TABLE spine_workflows ADD COLUMN concerns TEXT[] DEFAULT '{}';

-- Query by concern (cross-type):
SELECT * FROM cortex_chunks WHERE 'governance' = ANY(concerns);

-- Tag a chunk with multiple concerns:
INSERT INTO cortex_chunks (domain, content, concerns)
VALUES ('compliance', 'All document changes require lawyer review...',
        ARRAY['governance', 'security']);
AI Knowledge Engineering

Cross-cutting concerns structure knowledge not just by type but by concern, so AI agents can reason across dimensions. The four harness types organize knowledge by what it is (creation, specification, operation, data). The five concerns organize knowledge by what it addresses (relationships, governance, causation, learning, risk). Together, they form a two-dimensional knowledge classification that gives agents richer context for cross-domain reasoning.
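Two-dimensional retrieval can be sketched directly. Plain dicts stand in for `cortex_chunks` rows with a `concerns[]` column; the sample contents are illustrative:

```python
# Knowledge classified by harness type (where it lives) and by
# concerns (what it addresses) — retrieval can filter on either axis.
chunks = [
    {"type": "build", "content": "CI approval flow",
     "concerns": ["governance"]},
    {"type": "operations", "content": "PII handling procedure",
     "concerns": ["security", "governance"]},
    {"type": "domain", "content": "user churn pattern",
     "concerns": ["causal"]},
]

def by_concern(concern, harness_type=None):
    """Equivalent of: SELECT * FROM cortex_chunks
       WHERE %s = ANY(concerns), optionally narrowed to one type."""
    return [c for c in chunks
            if concern in c["concerns"]
            and (harness_type is None or c["type"] == harness_type)]

# A compliance audit assembles governance context across ALL types:
assert len(by_concern("governance")) == 2
# ...or narrows to a single type when needed:
assert by_concern("governance", "build")[0]["content"] == "CI approval flow"
```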

When cross-cutting concerns matter most

Scenario | Concern | How the agent uses it
Apply at legal tech company | Governance | Agent queries WHERE 'governance' = ANY(concerns) to assemble compliance context across all harness types before generating legal documents.
cortex.ai grows past 5 tenants | Relational/Ecosystem | Agent queries the ecosystem concern to understand tenant relationships, partner networks, and cross-tenant patterns when making provisioning decisions.
Second person uses the methodology | Metacognitive | Agent queries the metacognitive concern to surface knowledge curation best practices and which learning approaches have been most effective.
Compliance audit of any product | Security/Risk + Governance | Agent queries both concerns simultaneously to assemble a cross-type audit trail: who approved what, what security controls are in place.
Strategic decisions need justification | Causal/WHY | Agent queries the causal concern to retrieve decision rationale, root cause analyses, and reasoning chains beyond what the Product spec captures.

The verdict

The four types are correct at this level of abstraction. They've held across 18 instances and 6 apps without needing a fifth type. The five cross-cutting concerns handle what initially looked like gaps: governance and relational/ecosystem knowledge are not new types but concerns that span all types. Tagging knowledge with concerns TEXT[] gives agents a second dimension for retrieval without complicating the type system.

This two-dimensional model (4 types x 5 concerns) organizes knowledge so AI agents can reason effectively across both type boundaries and concern boundaries. The type tells you where knowledge lives. The concern tells you what dimension it addresses.

What is a Harness Config

The methodology is universal. A config is what makes it yours. It is the set of concrete choices -- which tech stack, which domains, which workflows, which products -- that turn the abstract methodology into a real system. Configs are portable and forkable: someone else could take the same methodology and build an entirely different config for an entirely different industry.

Technical deep dive: Config as implementation spec

A harness config specifies:

  • Data layer: Which database system, branch/isolation strategy, connection management.
  • Communication protocol: How harness instances talk to each other (MCP, REST, gRPC, etc.).
  • Build harness contents: Which development workflow, testing standards, architecture patterns.
  • Product harnesses: Which products, per-product specs, roadmaps.
  • Operations harnesses: Which domains, domain-specific rules and workflows.
  • Domain harnesses: Per-user data schema, data ownership model.
  • Mesh topology: How harness instances connect, which products compose which types.

A config is like a docker-compose.yml for knowledge architecture: it defines all the services (harness instances), their connections, and their configuration. The methodology is the base image; the config is the composition file.
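The docker-compose analogy can be made concrete. A minimal sketch of a config as structured data and of forking it for another industry -- all field names here are illustrative assumptions, not the actual config format:

```python
# Hypothetical config structure -- field names are illustrative only.
marco_config = {
    "data_layer": {"system": "neon-postgresql", "isolation": "branch-per-harness"},
    "protocol": "mcp",
    "build": {"workflow": "8-phase", "testing": "tdd", "architecture": "hexagonal"},
    "products": ["way2fly", "way2move", "way2save", "cortex.ai", "marco.ai", "build.ai"],
    "operations_domains": ["skydiving", "fitness", "finance", "life", "personal"],
}

def fork(config: dict, drop_domains: list[str], add_domains: list[str]) -> dict:
    """Forking a config: same methodology, different domain choices."""
    kept = [d for d in config["operations_domains"] if d not in drop_domains]
    return {**config, "operations_domains": kept + add_domains}

# A hotel chain forks the config: drops Marco's domains, adds its own.
hotel_config = fork(marco_config,
                    drop_domains=["skydiving", "fitness"],
                    add_domains=["hospitality", "compliance"])
```

The methodology (schema, types, lifecycle) is the base image; only the composition file changes between forks.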

Portability

Because the methodology defines the schema contract, configs are portable. You could export an operations harness from one config (e.g., a skydive progression harness) and import it into another config that needs it. The tables are the same; only the content differs. This is what makes cortex.ai possible -- each tenant effectively runs their own mini-config within the broader platform config.

Data deep dive: Branch strategy and isolation

A config maps the abstract methodology to a concrete data topology:

Methodology Concept | Config Decides | Example (Marco's Config)
Data layer | Database system, branching model | Neon PostgreSQL, 10 branches, copy-on-write
Knowledge tables | Vector dimension, index type | VECTOR(1536), pgvector HNSW index
Instance isolation | Branch vs. slug per type | Operations: branch-isolated. Product: slug-filtered on shared branch
Mesh communication | Protocol, connection management | MCP (Python), lazy connect, 30s timeout, 10min idle eviction
Observability | Event schema, metrics pipeline | mesh_events + mesh_transactions tables
Config as data architecture document

The config IS the data architecture document. It specifies branch topology, isolation strategy, connection patterns, and scaling model. A data engineer joining the team reads the config to understand the entire data layout. A new config for a different use case would specify a different topology but the same schema -- ensuring tooling compatibility.

Strategic deep dive: Config as business strategy

A config translates the methodology into a business strategy:

Build Config

Which development practices to encode. TDD? Hexagonal architecture? 8-phase workflow? These are choices that shape how all products in the config get built.

Product Config

Which products to build and how they compose harness types. This is the product portfolio strategy expressed as a composition matrix.

Operations Config

Which operational domains to support. Skydiving? Hospitality? Manufacturing? Each domain gets its own operations harness with domain-specific rules.

Domain Config

Per-user data model. What data each app collects, how it's stored, and how it flows between products through the mesh.

Forkability

Configs are forkable. A partner could fork Marco's config, remove the domains irrelevant to them (skydiving, fitness), add their own domains (healthcare, logistics), and have a working platform in weeks instead of months. The methodology stays the same; the config adapts to the business.

📖

The methodology is the rulebook. The config is your game plan. Two football teams play by the same rules (the methodology), but each team has its own playbook (the config) -- different formations, different strategies, different strengths. Marco's config is his playbook: 6 apps, skydiving + fitness + finance domains, specific development practices. Someone else would write a completely different playbook for their business.

The important thing about a config is that it's shareable and adaptable. If someone likes how Marco organized his fitness domain, they could copy that part and adapt it for their own use -- while keeping everything else different. It's like sharing a recipe from one cookbook to another.

Marco's Config

This section shows one specific application of the methodology. Marco's config includes: an 8-phase development workflow, TDD and hexagonal architecture, FF-gated prototypes, 6 apps across 5 domains, and Neon PostgreSQL as the data layer. Every choice here is a config decision -- someone else applying harness.os would make different choices for their context.

Technical deep dive: Architecture choices and stack

Build Harness Config

Marco's build harness encodes specific development practices:

  • 8-phase dev workflow: Discovery, Design, Specs, Domain, UI Wiring, Adapters, E2E, Deploy
  • Model routing: Opus for Discovery/Specs/Planning, Sonnet for Domain/Wiring/Adapters/E2E, Haiku for Deploy/verification
  • TDD-first: Write failing test first, then implementation
  • Hexagonal architecture: domain/ and ports/ must have zero framework imports
  • FF-gated prototypes: New features behind feature flags until validated

Instances: build-harness (slug-filtered: build + soft-eng + content), marco-builder (dedicated branch)

Product Harness Config

6 product harnesses, all sharing one Neon branch (product-shared), isolated by project_slug:

Instances: w2f-product, w2m-product, w2s-product, cortex-product, marco-ai-product, build-ai-product

Operations Harness Config

5 operations harnesses, each on its own dedicated Neon branch:

Branches: skydive-harness, fitness-harness, finance-harness, life-management, marco-personal

Domain Harness Config

Currently federated in main DB via fed_* tables. 5 domain harnesses planned, migrating to per-app Neon branches.

Current tables: fed_jumps, fed_activities, fed_transactions, fed_routines, fed_books

Infrastructure Config

Neon Branch Map (Marco's Config)
main -------------------------------- Platform DB (37 tables, request pipeline)
+-- build-harness -------------- Build + soft-eng + content (slug-filtered)
+-- marco-builder -------------- Personal dev workflow (own branch)
+-- marco-personal ------------- Cross-domain hub (own branch)
+-- skydive-harness ------------ Skydive operations rules (own branch)
+-- fitness-harness ------------ Training operations rules (own branch)
+-- finance-harness ------------ Finance operations rules (own branch)
+-- life-management ------------ Life routines/goals (own branch)
+-- product-shared ------------- All 6 product harnesses (slug-filtered)
+-- test ----------------------- Server integration tests
Connection management (config choice)

Each harness instance is registered in harness_instances with database_url, base_type, and optional tool_filter. The mesh manager (harness-mesh.ts) spawns Python MCP server processes on demand: lazy connect on first access, 30s timeout, 10min idle eviction, stale retry (evicts dead client, reconnects once), graceful shutdown on SIGTERM/SIGINT. These are config choices -- a different config might use persistent connections or a different timeout strategy.
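The lazy-connect and idle-eviction policy can be sketched in a few lines. This is an illustrative Python rendering of the behavior described above, not the actual harness-mesh.ts implementation; class and method names are assumptions, and the 30s connect timeout is shown as a constant but not enforced in this sketch:

```python
import time

class MeshManager:
    CONNECT_TIMEOUT_S = 30   # config choice (not enforced in this sketch)
    IDLE_EVICT_S = 600       # config choice: evict after 10 min idle

    def __init__(self, connect_fn):
        self._connect = connect_fn   # stand-in for spawning an MCP server process
        self._clients = {}           # harness_id -> (client, last_used)

    def get(self, harness_id: str):
        """Lazy connect: spawn on first access, reuse afterwards."""
        self._evict_idle()
        entry = self._clients.get(harness_id)
        if entry is None:
            entry = (self._connect(harness_id), time.monotonic())
        # refresh last-used timestamp on every access
        self._clients[harness_id] = (entry[0], time.monotonic())
        return entry[0]

    def _evict_idle(self):
        now = time.monotonic()
        for hid, (_, last) in list(self._clients.items()):
            if now - last > self.IDLE_EVICT_S:
                del self._clients[hid]  # idle client dropped; next get() reconnects
```

A stale client would be evicted the same way, with the next get() performing the one-shot reconnect.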

Data deep dive: Branch topology and storage

Branch Budget (10/10 Free Tier)


Per-Type Topology in This Config

Type | Instances | Branch Strategy | Isolation
Build (2) | build-harness, marco-builder | Dedicated branches | Slug-filtered / dedicated
Product (6) | w2f, w2m, w2s, cortex, marco-ai, build-ai | Shared branch (product-shared) | Slug-filtered: WHERE project_slug = $N
Operations (5) | skydive, fitness, finance, life-mgmt, marco-personal | 5 dedicated branches | Full branch isolation
Domain (5 planned) | w2f-domain, w2m-domain, w2s-domain, cortex per-tenant, marco-domain | Planned (currently fed_* in main) | Per-app branches (planned)
Branch budget: 10/10 used

Free tier is maxed out. Future domain harnesses (w2f-domain, w2m-domain, w2s-domain, per-tenant cortex branches) require upgrading to Neon Pro. The test branch could be freed if CI migrates, recovering 1 slot. This is a config constraint, not a methodology constraint -- a different config could use a different database with no branch limits.
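The two isolation strategies reduce to how a query is constructed: product harnesses filter by slug on a shared branch, while operations harnesses resolve to a dedicated branch connection string. A minimal sketch under assumed names (the SQL mirrors the schema described above; the helper functions are illustrative):

```python
def product_query(project_slug: str) -> tuple[str, list]:
    """Product harnesses share one branch, isolated by project_slug filtering."""
    return ("SELECT * FROM cortex_chunks WHERE project_slug = $1", [project_slug])

def operations_dsn(branch_urls: dict, domain: str) -> str:
    """Operations harnesses are branch-isolated: each domain has its own DSN."""
    return branch_urls[domain]

sql, params = product_query("w2f")
dsn = operations_dsn({"skydive": "postgres://host/skydive-harness"}, "skydive")
```

The trade-off: slug filtering spends one branch for six instances; branch isolation spends one branch per instance but removes any chance of cross-domain leakage.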

Strategic deep dive: The product portfolio

Marco's config creates a 6-app ecosystem from the four harness types. Each app is a specific composition:

6 Apps · 18 Harness Instances · 10 Neon Branches · 37 Database Tables

Build Config

8-phase dev workflow, TDD, hexagonal architecture, FF-gated prototypes. These are Marco's specific engineering choices -- encoded in the build harness so every project follows them automatically.

Operations Config

5 domains: skydiving (progression, safety), fitness (training programming), finance (budgets, transactions), life management (routines, goals), and personal (cross-domain hub).

Config as strategy

The config IS the business strategy. Marco chose to build across fitness, finance, and skydiving because those are his domains of expertise. A different founder might configure the same methodology around healthcare, logistics, and education. The methodology doesn't care which domains you choose -- it provides the structure for any combination.

Marco's version of the harness.os playbook includes:

How he builds

An 8-step development process, writing tests before code, and protecting new features behind switches until they're ready. These are his recipes.

What he's building

6 apps: a skydiving app (way2fly), a fitness app (way2move), a finance app (way2save), a business tool (cortex.ai), a personal assistant (marco.ai), and the command center that builds everything else (build.ai).

How each domain works

Skydiving has progression rules and safety checklists. Fitness has training programs and exercise knowledge. Finance has budget categories and spending rules. Each domain has its own playbook.

His data

Jump logs, workout history, bank transactions, personal goals, and routines. This is Marco's data -- personal to him and stored separately from the rules and recipes.

🏭

Someone else using harness.os would write their own config. A hotel chain might configure: build harness (same development practices), operations harnesses (hospitality onboarding, compliance, scheduling), and domain harnesses (staff records, guest data). Different industry, same playbook, different game plan.

Products as Compositions

Every product in Marco's config is a specific composition of harness types. build.ai = Build + Product. cortex.ai = Operations + Domain. marco.ai = All four. New products don't require new platforms -- they require new compositions. And the factory pattern means any combination of the four types is a valid product.

[Diagram: Product-harness composition map -- build.ai (Developer Platform, audience: developers), cortex.ai (B2B SaaS, audience: B2B companies), way2do.ai (B2C Consumer Hub, audience: consumers), marco.ai (Everything, 4 modes, audience: owner), each composed from Build / Product / Operations / Domain.]
Architecture deep dive: Product composition model

Product Abstraction Map

Product | Harness Layers | Stack | Agent Types | Status
build.ai | Build + Product | React 19 + Express + WS | Code, review, content, design, research | Built
cortex.ai | Operations + Domain | Flutter (Dart) | Debrief, compliance, onboarding (per-tenant) | Planned
way2do.ai | Domain | Flutter + Web | Cross-app assistant (subscription-gated) | Planned
marco.ai | Build + Product + Operations + Domain | Flutter + Web | Cross-domain assistant, full harness control | Planned
Composition as architecture

build.ai = Build + Product harness abstraction. cortex.ai = Operations + Domain for B2B. way2do.ai = Domain layer as subscription hub. marco.ai = All four layers, full mesh access. Each product is a different slice of the same config, composed through the methodology's type system.

Data deep dive: Composition as data wiring

Data Ownership per Product

Product | Harness Layers | Owns | Data Types
build.ai | Build + Product | Platform data | Requests, sessions, agents, experiments (37 tables in main branch)
cortex.ai | Operations + Domain | Tenant data | Per-tenant knowledge, workflows, learnings, domain data (branch-per-tenant)
way2do.ai | Domain | Subscription state | User subscriptions, cross-app access grants (reads way2* domains)
marco.ai | All four | Full mesh | Authenticated read/write to ALL harnesses
Data ownership rule

Data ALWAYS lives with its home app. The mesh READS across apps. WRITES go through a pending_changes approval queue. way2fly owns jump data. way2save owns transactions. build.ai orchestrates everything but never bypasses ownership. This is a config policy decision enforced at the mesh level.
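The ownership rule can be sketched as a gate at the mesh write path: writes from the home app apply directly, writes from any other app land in the approval queue. An illustrative sketch -- function and field names are assumptions, not the actual mesh code:

```python
pending_changes: list[dict] = []  # stand-in for the pending_changes table

def mesh_write(source_app: str, owner_app: str, table: str, row: dict) -> str:
    """Enforce data ownership: foreign writes are queued, never applied directly."""
    if source_app == owner_app:
        return "applied"               # home app writes its own data directly
    pending_changes.append({           # any other app's write awaits approval
        "source": source_app, "owner": owner_app,
        "table": table, "row": row, "status": "pending",
    })
    return "queued"

mesh_write("way2fly", "way2fly", "fed_jumps", {"altitude": 13500})    # applied
mesh_write("build.ai", "way2save", "fed_transactions", {"amt": 10})   # queued
```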

Strategic deep dive: Product factory economics

Each product in Marco's config serves a different market through a different composition:

build.ai Platform

Build + Product. The developer hub and SaaS factory. Manages all engineering work.

cortex.ai B2B SaaS

Operations + Domain. Packages operational workflows for business customers. Each tenant gets their own harness composition. Full deep-dive →

way2do.ai Consumer Hub

Domain. Subscription hub bundling way2fly + way2move + way2save with cross-domain assistant. Full deep-dive →

marco.ai Everything

All four harness types. Mobile: domain + operations. Web: full platform control. The owner's view.

Portfolio synergy

Personal apps (way2fly, way2move, way2save) prove the model per vertical. cortex.ai packages it for businesses. way2do.ai bundles it for consumers. build.ai is the factory that builds everything. Each product strengthens the whole -- new products are new compositions, not new rewrites.

Each app uses different combinations of the four knowledge types:

build.ai

Uses Recipes + Blueprints. The factory that builds all the other apps.

cortex.ai

Uses Playbooks + Records. Gives businesses their own AI assistant with operational rules and data.

way2do.ai

Uses Records. Your personal hub connecting data from all your apps behind one smart assistant.

marco.ai

Uses all four types. The master control room -- sees recipes, blueprints, playbooks, and records.

🔗

The magic is in the combinations. A restaurant needs recipes + records. A construction firm needs blueprints + playbooks. Each business uses a different combination. harness.os works the same way -- each app picks the knowledge types it needs, and the platform provides them.

What is a Mesh

A mesh is a running instance of connected harnesses following a config. Think of config as the blueprint, mesh as the building. Marco's personal mesh has 6 apps connected through 18 harness instances. A Lake Deck mesh (cortex.ai tenant) has hospitality processes connected through their harnesses. One config can spawn multiple meshes. Apps are on a mesh, not the mesh.

[Diagram: Mesh topology (MCP) -- apps on mesh (build.ai, cortex.ai, way2do.ai, marco.ai) connect to 18 harness instances (Build 2, Product 6, Operations 5, Domain 5) backed by Neon Postgres: build-harness + marco-builder, product-shared, skydive / fitness / finance / life / personal, domain (planned). MCP protocol, 27 tools per instance, 10 Neon branches, Python MCP server.]
Technical deep dive: Mesh topology and connections

The mesh is the runtime. It is managed by harness-mesh.ts -- a multi-client MCP connection manager. Each harness instance is registered in harness_instances with its database_url, base_type, and tool_filter. When an agent queries a harness, the mesh manager spawns (or reuses) a Python MCP server process connected to the correct Neon branch.

Instance Registry

Type | Instances | Neon Branch | Isolation
Build | build-harness, marco-builder | build-harness, marco-builder | Slug-filtered / dedicated
Product | w2f, w2m, w2s, cortex, marco-ai, build-ai | product-shared | Slug-filtered (6 on 1 branch)
Operations | skydive, fitness, finance, life-mgmt, marco-personal | 5 dedicated branches | Branch-isolated
Domain | w2f-domain, w2m-domain, w2s-domain, cortex per-tenant, marco-domain | Planned | Per-app branches (planned)

Multiple Meshes from One Config

The same config can spawn multiple meshes. Marco's personal mesh is one instance. A cortex.ai Lake Deck tenant mesh is another -- it uses a subset of the config (operations + domain types) with Lake Deck-specific harness instances. The config defines the architecture; each mesh is a deployment of that architecture with its own data.

class:instance analogy

Config = class definition. Mesh = object instance. marco-config defines harness types, connection patterns, and tooling. marco-mesh is the running system with real data. lake-deck-mesh is another instance of a cortex-tenant-config subset. Each mesh has its own data, its own learnings, its own accumulated intelligence -- but they share the same structural patterns because they follow the same methodology.
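The class:instance analogy holds up when written literally. An illustrative sketch (the classes and fields are assumptions for exposition, not real platform code):

```python
class Config:
    """Class-like template: defines structure, spawns meshes."""
    def __init__(self, harness_types: list[str], domains: list[str]):
        self.harness_types = harness_types
        self.domains = domains

    def spawn_mesh(self, name: str) -> "Mesh":
        return Mesh(name, self)

class Mesh:
    """Object-like instance: shares the config's structure, owns its data."""
    def __init__(self, name: str, config: Config):
        self.name = name
        self.config = config     # same structural patterns across meshes
        self.learnings = []      # per-mesh accumulated intelligence

# One config subset, many meshes: each tenant gets its own instance.
cortex_tenant_config = Config(["operations", "domain"], ["hospitality"])
lake_deck = cortex_tenant_config.spawn_mesh("lake-deck")
```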

Data deep dive: Live data topology and flow

A mesh is the live data topology -- the actual running system where data flows between harness instances.

Mesh-Level Observability

Table | Purpose | Key Columns
mesh_events | Event stream | event_type, harness_id, payload JSONB, timestamp
mesh_transactions | Cross-harness operations | steps JSONB, total_duration_ms, harness_ids[]

Multiple Meshes, Same Schema

Because the methodology defines the schema, every mesh -- regardless of which config spawned it -- uses the same tables. This means monitoring tools, query patterns, and analytics pipelines work across all meshes. A dashboard built for Marco's mesh works for Lake Deck's mesh. This is the data engineering payoff of the three-layer model.

Mesh transaction model

Cross-harness operations are logged in mesh_transactions with a steps JSONB array tracking each harness queried, response time, and data returned. This enables cost attribution per query fan-out and identification of slow harness instances. The pattern is the same across all meshes -- methodology-level observability.
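The transaction record described above can be sketched as a small builder. The field names mirror the mesh_transactions columns; the recorder function itself is an illustrative assumption:

```python
import json

def record_transaction(steps: list[dict]) -> dict:
    """Build a mesh_transactions row from per-harness step timings."""
    return {
        "steps": json.dumps(steps),  # steps JSONB: one entry per harness queried
        "total_duration_ms": sum(s["duration_ms"] for s in steps),
        "harness_ids": [s["harness_id"] for s in steps],
    }

tx = record_transaction([
    {"harness_id": "finance-harness", "duration_ms": 42},
    {"harness_id": "skydive-harness", "duration_ms": 57},
])
```

Cost attribution then becomes a GROUP BY over harness_ids; slow instances surface as outliers in the per-step durations.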

Strategic deep dive: Mesh as running business

The mesh is the running business. Each mesh is independent but can share learnings with other meshes through the methodology's knowledge flow patterns.

18 Harness instances · 10 Neon branches · 27 MCP tools each · 4 Harness types
Multiple meshes = multiple businesses

Marco's personal mesh is one business. Each cortex.ai tenant mesh is another business running on the same config subset. One config can power many meshes -- each with its own data, its own learnings, and its own revenue. This is the scaling model: configs spawn meshes, meshes generate value.

🌐

Think of the mesh as a nervous system. Your brain (the config) defines how your nervous system works. The mesh is the actual nervous system -- real nerves carrying real signals between real organs. When you ask "Can I afford skydive camp?", the mesh carries the question to three different apps (finance, skydiving, fitness), gets answers from each, and brings them back to you as one unified response.

Different people can have different nervous systems. Marco's mesh connects skydiving, fitness, and finance apps. A hotel's mesh would connect staff training, guest management, and compliance apps. Different organs, same nervous system design -- all following the harness.os methodology.

Cross-Domain Flows

The mesh makes cross-app reasoning possible. When you ask a question that spans fitness and finance, the mesh knows which harness instances to consult, routes the queries, and combines the answers. No single app can do this alone -- it's the mesh that makes the whole greater than the sum of its parts.

Cross-App Scenario
"Can I afford skydive camp?"
User asks marco.ai a question that spans three domains. The mesh orchestrates queries across way2save, way2fly, and way2move -- returning a unified answer from a single prompt.
marco.ai: "Can I afford the April skydive camp?" -> Checking 3 apps... (budget, camps, readiness)
Query -> harness mesh -> Fan-out:
  way2save -- Camp budget: $1,200 | Available: $1,450 -> Budget OK
  way2fly -- Next camp: Apr 12 | Prereq: B-License -> Qualified
  way2move -- Readiness: 92% | Last session: Today -> Ready
Under the hood: marco.ai's assistant sends one prompt. The mesh fans out to 3 MCP servers: finance-harness (budget check via spine_rules), skydive-harness (camp schedule + prereqs from cortex_chunks), and fitness-harness (readiness score from fed_activities). Results merge in the agent's context window. Total mesh latency: ~200ms fan-out + per-branch query time.
Data flow: Three parallel MCP queries hit three Neon branches. finance-harness returns budget row from fed_transactions. skydive-harness returns camp dates + progression rules from cortex_chunks + spine_rules. fitness-harness returns readiness metric from fed_activities. All tracked in mesh_transactions.steps[] with per-branch latency.
Why this matters: No single app can answer "Can I afford skydive camp?" alone. way2save knows the budget but not the prereqs. way2fly knows the camps but not the budget. way2move knows readiness but not the schedule. The mesh makes cross-domain reasoning possible from a single question -- the core competitive advantage.
What just happened: You asked one question to your personal assistant. Behind the scenes, it checked your bank account (way2save), found upcoming camps and checked your qualifications (way2fly), and verified you're physically ready (way2move). Three apps, one answer: "Yes, you can afford it, you're qualified, and you're in great shape. Book it."
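The fan-out pattern in this scenario can be sketched as three parallel queries merged into one result set. This is an illustrative asyncio rendering; query_harness is a stand-in for a real MCP call, not the actual mesh API:

```python
import asyncio

async def query_harness(harness: str, question: str) -> dict:
    await asyncio.sleep(0)  # placeholder for an MCP round-trip to a Neon branch
    return {"harness": harness, "answer": f"{harness} result"}

async def fan_out(question: str, harnesses: list[str]) -> list[dict]:
    # All three queries run concurrently; results come back in request order.
    return list(await asyncio.gather(
        *(query_harness(h, question) for h in harnesses)
    ))

results = asyncio.run(fan_out(
    "Can I afford skydive camp?",
    ["finance-harness", "skydive-harness", "fitness-harness"],
))
```

Because the queries are parallel, mesh latency is dominated by the slowest branch plus the ~200ms fan-out overhead, not the sum of all three.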
Inter-App Data Flow
"Morning training flow"
User completes a way2move mobility session. Compensation data flows through the mesh to way2fly, updating skill prerequisites and adapting the next suggested tunnel session.
way2move -- Session Complete: Type: Mobility | Duration: 25 min | Hip ROM: +12% | Spine flex: Good
Compensation data -- Body arch readiness: 94%
Mobility data -> harness mesh -> Skill update
way2fly -- Skill Tree Updated: Body arch: Unlocked | Prereq met: Hip flex
Next suggested -- Tunnel: arch-to-backfly drill (adapted from mobility gains)
Under the hood: way2move writes the completed session to fed_activities in its domain harness. A spine_rules trigger in the fitness-harness fires on "mobility_session_complete", pushing compensation metrics to the mesh. The skydive-harness spine_workflows("skill-prerequisites") evaluates the new data and updates the skill tree node status. way2fly's UI polls or receives a WebSocket push of the updated prerequisite state.
Data flow: Write path: fed_activities INSERT in fitness-harness branch. Read path: skydive-harness queries fed_activities via cross-harness MCP call for compensation metrics, evaluates against spine_rules where triggers[] @> '{skill-prereq-check}', and writes updated skill status to learnings table. Logged in mesh_transactions as a 2-step cross-harness operation.
Why this matters: Training in one app automatically unlocks progression in another. The user does not need to manually update anything -- the mesh connects the products intelligently. This is the cross-domain reasoning that makes the mesh valuable.
What just happened: You finished a 25-minute stretching session on your training app (way2move). Because your hip flexibility improved, your skydiving app (way2fly) noticed you now meet the requirements for a new aerial skill. It automatically suggests the right tunnel drill to practice. Two apps, zero manual updates, one connected experience.
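The trigger-driven step in this flow can be sketched as rule matching plus evaluation. The trigger check mirrors the triggers[] containment query described above; the rule list, threshold, and skill update are illustrative assumptions:

```python
# One rule, mirroring spine_rules with a triggers[] array.
spine_rules = [
    {"rule": "skill-prereq-check", "triggers": ["mobility_session_complete"]},
]

def fire_rules(event: str, payload: dict) -> list[dict]:
    """Fire any rule whose triggers[] contains the event, then evaluate it."""
    updates = []
    for rule in spine_rules:
        if event in rule["triggers"]:              # triggers[] @> containment
            if payload.get("hip_rom_gain", 0) >= 10:  # illustrative threshold
                updates.append({"skill": "body-arch", "status": "unlocked"})
    return updates

updates = fire_rules("mobility_session_complete", {"hip_rom_gain": 12})
```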

From Products to Partnerships

The three layers create three levels of composability. The methodology lets anyone build products. The config's products are themselves platforms (cortex.ai lets companies compose, way2do.ai lets consumers compose). Each platform product spawns its own meshes. Methodology creates products. Products create partnerships. Partnerships create ecosystems -- each a running mesh.

Partnership Pipeline
"cortex.ai tenant onboarding"
Admin creates a request in build.ai. Agents provision a new cortex.ai tenant -- a new mesh spawned from the config -- seeding the knowledge base, configuring operations harnesses, and spinning up the tenant dashboard.
build.ai -- New Request: "Onboard: Lake Deck" | Type: tenant-provision | Industry: hospitality | Mesh type: operations+domain
Phase 1/3: Provisioning mesh -> Phase 2/3: Seed knowledge base -> Phase 3/3: Configure agents
Agents provision -> harness mesh -> New mesh created
cortex.ai -- Lake Deck Mesh: 3 processes | 12 rules | 2 agents
Onboarding: Active | Compliance: Active | Scheduling: Active
Under the hood: The build.ai request pipeline activates a 3-phase tenant provisioning template. Phase 1: neon_create_branch(parent="main", name="lake-deck") + register in harness_instances. Phase 2: MCP bulk_insert seeds cortex_chunks with hospitality domain knowledge. Phase 3: agent_process_assignments configured with tenant-scoped agents. The result: a new mesh instance spawned from the cortex.ai config subset.
Data flow: New Neon branch created (copy-on-write from parent). harness_instances row inserted with base_type='operations'. Knowledge seeded: 12 spine_rules (hospitality), 3 spine_workflows. Cross-tenant learnings with transferability_score > 0.7 pre-loaded. A new mesh (data topology) is now live.
Why this matters: A new partner goes from "signed contract" to "live mesh with AI agents" through a single build.ai request. Each tenant mesh is independent but shares learnings through the methodology's transferability model. The second tenant setup was faster because the process templates were already refined.
What just happened: A new hotel company (Lake Deck) signed up. An admin clicked "onboard" and the system automatically set up everything the hotel needs: its own running system (mesh) with staff training workflows, compliance checklists, and scheduling rules -- pre-loaded with hospitality knowledge. Ready to use from day one.
Technical deep dive: Partnership architecture

Mesh Separation (Future Architecture)

As a partnership mesh grows, it can graduate to a fully independent mesh:

Mesh-to-mesh communication (planned)
// Today: single Neon project, partition-scoped meshes
harness-os-mcp -> single Neon project -> branch-per-harness

// Future: federated meshes
lake-deck-mesh -> own Neon project -> own branches -> own agents
  <-> MCP bridge (cross-mesh queries via authenticated endpoints)
marco-mesh -> own Neon project -> retains full visibility

// The MCP protocol already supports this: each mesh is a set of MCP servers.
// Federation = routing queries to the right mesh's MCP endpoint.
Strategic deep dive: Partnership and revenue model
The three-layer business model

Methodology = teachable/licensable. Config products = platform products that each spawn meshes. Meshes = running businesses. cortex.ai turns the methodology into a SaaS product where each tenant is a new mesh. way2do.ai turns it into a consumer product where each subscriber connects domain meshes. The methodology creates products, products create partnerships, partnerships create meshes.

🤝

The franchise model: The methodology is the franchise playbook. Marco's config is the first franchise location. Each cortex.ai tenant (Lake Deck, Aluminex) is a new franchise location with its own running system (mesh). They share the playbook, benefit from each other's learnings, but operate independently. If a location grows big enough, it can become its own franchise chain -- its own complete system that still trades knowledge with the original.

Agent Architecture

Every agent has two halves: the outer harness (knowledge AND process — what it knows AND how work should be done) and the inner harness (the thin runtime connector — just enough to read the outer harness and execute). Most of "how it runs" lives in the outer harness as rules, workflows, and process definitions. The inner harness is deliberately minimal — accept context, call model, route tools — so it's trivially swappable. An agent's full intelligence survives even if you replace the AI model powering it.
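The "deliberately minimal" claim can be made concrete: an inner harness reduces to accept context, call model, route tools. This sketch is illustrative -- call_model and the tool registry are stand-ins, not a real agent runtime -- and the point is how little logic lives here versus in the outer harness:

```python
def run_agent(context: str, task: str, call_model, tools: dict) -> str:
    """Thin inner harness: outer-harness knowledge arrives as `context`."""
    prompt = f"{context}\n\nTask: {task}"
    result = call_model(prompt)              # swappable model call
    if result.get("tool"):                   # route one round of tool use
        tool_out = tools[result["tool"]](result["args"])
        result = call_model(f"{prompt}\nTool output: {tool_out}")
    return result["text"]
```

Swapping the model means swapping call_model; the context, rules, and workflows that make the agent intelligent are untouched.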

Architecture deep dive: Agent execution model

Outer Harness (Knowledge from the Mesh)

From Build Harness

Dev patterns, coding standards, CI/CD workflows. Sourced from cortex_chunks + spine_rules in build branches of the mesh.

From Product Harness

Architecture decisions, feature specs, roadmaps. Sourced from product-shared branch with slug filtering on the mesh.

From Operations Harness

Operational workflows, domain rules, triggers. Sourced from dedicated operations branches on the mesh.

From Domain Harness

User data context -- recent jumps, current training plan, account balance. Makes agent responses personal.

Inner Harness (Execution -- Config Choice)

Type | Implementation | Best For | Status
cli-spawned | claude --print --stream-json -p "<prompt>" | Code tasks, file creation, review | Built
api-built | @anthropic-ai/sdk direct API calls | Assistant chat, data queries | Built
third-party | GitHub Copilot, OpenAI Codex, webhooks | Specialized external tools | Planned

The inner harness type is a config choice. A different config might use only API-built agents, or only third-party tools. The outer harness (knowledge from the mesh) stays the same regardless of execution method.

Data deep dive: Agent data model

Agent Data Model

Schema
agents (id, name, type, capabilities TEXT[], status, system_prompt, model_preference)
agent_implementations (id, agent_id, impl_type, model, active, stats JSONB)
agent_knowledge (id, agent_id, knowledge_type, content, domain)
agent_process_assignments (agent_id, phase_template, role)
Compound learning

Agent knowledge compounds in the mesh over time. After 100 sessions, the learnings table in each harness contains patterns that improve future prompt construction. Build harness learnings improve all code agents on the mesh. Operations harness learnings improve all operational agents. Compound effects are per-harness-type, not per-agent -- a new agent on the mesh inherits accumulated domain wisdom immediately.

Strategic deep dive: Agent capabilities

Agents come in three roles, each drawing knowledge from different harness types on the mesh:

Orchestrator

Decomposes complex requests into phases. Routes to specialist agents. One per pipeline.

Lead

Owns a pipeline phase end-to-end. Can delegate sub-tasks to workers.

Worker

Executes specific, scoped tasks. Reports up to the lead.

Agent factory model

Agents are developed once in build.ai, then deployed across meshes. A "debrief coach" agent developed for way2fly can be activated in a cortex.ai tenant mesh for hospitality debriefs -- same behavior, different operations harness knowledge from a different mesh. Build once, deploy across meshes.

👷

Think of agents as smart workers. Each agent gets two things from the harness: the knowledge it needs (recipes, playbooks, records) AND the instructions for how work should be done (step-by-step procedures, quality rules, handoff checklists). All of that lives in the mesh. The worker itself — the actual person or tool doing the job — just needs to be capable enough to follow instructions and use knowledge. You can swap out the worker without losing any intelligence, because the intelligence is in the harness, not in the worker.

Agents come in three levels: managers (break big tasks into smaller ones), leads (own one part of the work), and workers (do specific tasks). Every time an agent completes a task, the mesh learns from the experience.

The Inner Harness is Solved

The key realization

The inner harness — the execution engine, the agent that runs tasks — is a solved problem. Claude Code exists. Copilot exists. Custom API agents are straightforward to build. build.ai creates inner harnesses as part of its pipeline. These tools keep getting better every month as new models and tools ship.

Inner Harness (Thin Connector)

The minimal runtime. Deliberately thin so it's trivially swappable.

  • Claude Code (CLI-spawned agents)
  • Copilot, Cursor, Windsurf
  • Custom API-built agents (Anthropic, OpenAI)
  • Third-party tools (Codex, Devin, etc.)
  • Future: whatever ships next quarter

The inner harness only needs to do three things: read context from the outer harness, call a model, and route tool calls back. That's a standard MCP interface — any tool that speaks MCP can be the inner harness.
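Those three responsibilities fit in a few lines. The loop below is a sketch under stated assumptions: the callables stand in for an MCP context read, a model call, and tool routing, and the `tool_call` / `text` result shape is invented for illustration:

```python
from typing import Callable

def run_inner_harness(
    read_context: Callable[[], str],
    call_model: Callable[[str], dict],
    route_tool: Callable[[str, dict], str],
) -> str:
    """Minimal inner-harness loop: read context, call model, route tools.

    All intelligence arrives through read_context (the outer harness);
    the loop itself is deliberately dumb, which is why it is swappable.
    """
    context = read_context()
    result = call_model(context)
    # Route tool calls back, feeding each result into a follow-up turn.
    while result.get("tool_call"):
        name, args = result["tool_call"]
        tool_output = route_tool(name, args)
        result = call_model(context + "\n" + tool_output)
    return result["text"]

# Usage with fakes: one tool call, then a final answer.
replies = iter([
    {"tool_call": ("get_rules", {"activity": "coding"})},
    {"text": "done: tests added"},
])
answer = run_inner_harness(
    read_context=lambda: "outer-harness context",
    call_model=lambda prompt: next(replies),
    route_tool=lambda name, args: f"{name}: ok",
)
```

Replacing the model or the tool vendor means replacing a callable; the loop, and everything the outer harness feeds it, stays the same.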

Outer Harness (The Full Intelligence)

Knowledge AND process. What it knows AND how work should be done.

  • Structured knowledge (four types, CNS schema)
  • Process definitions — rules, workflows, phase templates
  • Accumulated learnings (compound over time)
  • Session lifecycle (start → context → work → learn → handoff)
  • Cross-domain reasoning across the mesh

Most of "how it runs" lives here — as rules and workflows, not as code. This is what makes the inner harness swappable: the intelligence is in the harness, not in the tool.

What this means for the methodology

Software exists to improve processes. AI is a new element participating in that improvement. The outer harness is data organization for this new element — structuring knowledge so AI can read it, use it, learn from it, and improve the processes it participates in.

The inner harness only needs the minimum interface to connect: read knowledge, receive rules and workflows, write learnings back. That's it — a standard MCP connection. By defining all process logic in the outer harness, the inner harness stays thin enough to be swapped without losing anything.

This is why creating an "autoharness" (a new inner harness from scratch) is less valuable than creating a good outer harness that any inner harness can connect to. The energy should go into the knowledge and process layer, not the execution connector.

Technical deep dive: Why the outer harness is the moat

The harness-os-mcp server exposes 27+ tools per harness instance. An inner harness (any MCP-compatible client) connects and gets:

MCP Tool | What It Provides | Why It Matters
start_session | Last handoff, current state, project context | Agent starts where the last one left off — not blank
get_rules | Applicable rules for current activity | Agent follows established patterns without being told
search_knowledge | Relevant knowledge chunks | Agent has domain expertise it was never trained on
get_workflow | Step-by-step procedures | Agent follows consistent processes
log_learning | Persist insights for future sessions | Every session makes the next one better
end_session | Handoff summary for next agent/session | Continuity across agents and time

The inner harness doesn't need to know about Neon branches, PostgreSQL schemas, or the four-type system. It just calls MCP tools. The outer harness handles all the knowledge architecture — including the process definitions that tell the agent HOW to work. The inner harness is deliberately thin: it's a generic connector, not a brain. The brain is the outer harness.
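A full session built from these tools might look like the sketch below. `FakeHarnessClient` is a hypothetical stand-in for the MCP connection; the method names mirror the tool table above, and the return shapes are invented for illustration:

```python
class FakeHarnessClient:
    """Stand-in for an MCP connection to one harness instance."""

    def __init__(self):
        self.learnings = []

    def start_session(self):
        return {"handoff": "resume phase 3", "state": "build"}

    def get_rules(self, activity):
        return [f"rule-for-{activity}"]

    def search_knowledge(self, query):
        return [f"chunk matching '{query}'"]

    def log_learning(self, insight):
        self.learnings.append(insight)

    def end_session(self, summary):
        return {"handoff": summary}

mesh = FakeHarnessClient()
ctx = mesh.start_session()                        # resume where the last agent stopped
rules = mesh.get_rules("coding")                  # follow established patterns
chunks = mesh.search_knowledge("auth flow")       # pull domain expertise
mesh.log_learning("retry flaky e2e tests once")   # make the next session better
handoff = mesh.end_session("auth flow wired")     # continuity for the next agent
```

The sequence is the point: context in at the start, learnings and a handoff out at the end, and everything in between driven by rules and knowledge the agent never had to be trained on.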

The replacement test

If Claude Code disappears tomorrow, the outer harness — every knowledge chunk, every rule, every learning, every workflow — survives intact. Connect a different inner harness (Copilot, a custom agent, a future tool that doesn't exist yet) and it picks up where Claude left off. That's the proof that the value is in the outer harness.

📚

Think of it like a franchise manual and a new hire. The outer harness is the complete franchise system — the recipes, the quality standards, the step-by-step procedures, the lessons from every previous shift. The inner harness is the new employee — someone who comes in, follows the manual, and adds their own notes about what they learned. You can replace the employee (different AI tool, different model) and the franchise system stays the same. The intelligence is in the system, not in the individual worker. The worker just needs to be capable enough to follow instructions.

Request Pipeline

When someone asks the platform to do something, the request gets broken into phases, each assigned to the right agent, executed in sequence with real-time streaming output. Each phase queries the relevant harness types on the mesh for context. The pipeline is the execution layer that connects requests to the mesh.

Inbox (new request) → Activate (select template + mesh scope) → Pipeline Phases: Discovery → Design → Specs → Build → Wiring → Adapters → E2E → Deploy → Done (artifacts ready)

Each phase queries mesh harness types for context. Build mode: build + product harnesses · Operations mode: operations + domain harnesses
Technical deep dive: Request lifecycle

Requests enter from multiple sources and flow through a template-driven pipeline. The pipeline's mesh scope depends on the mode: build mode activates build + product harnesses; operations mode activates operations + domain harnesses.

Phase State Machine

pending → active → ┌ completed
                   ├ skipped
                   └ failed → retry → pending
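The state machine above can be expressed as a transition table, a minimal sketch (event names like "start" and "retry" are illustrative, not from the schema):

```python
# Legal phase transitions, keyed by current status then event.
TRANSITIONS = {
    "pending": {"start": "active"},
    "active":  {"complete": "completed", "skip": "skipped", "fail": "failed"},
    "failed":  {"retry": "pending"},
}

def advance(status: str, event: str) -> str:
    """Apply one event to a phase, rejecting transitions the diagram forbids."""
    try:
        return TRANSITIONS[status][event]
    except KeyError:
        raise ValueError(f"illegal transition: {status} --{event}-->") from None

status = advance("pending", "start")   # "active"
status = advance(status, "fail")       # "failed"
status = advance(status, "retry")      # back to "pending" for another attempt
```

Encoding the diagram as data means a failed phase can only re-enter the queue through retry, and terminal states (completed, skipped) accept no further events.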

Participation Modes

Mode | Behavior
human-in-loop | Pause after each phase. User reviews, approves next.
ai-in-loop | Auto-start next phase on complete. Full pipeline runs unattended.
prototype | Simulated. No real agent calls. Tests pipeline design.
Data deep dive: Pipeline data flow

Pipeline Data Tables

Table | Role | Key Fields
requests | Work items | title, priority, status, business_model, project_id
request_phases | Pipeline phases | request_id, template_phase, status, agent_id, session_id
sessions | Execution runs | phase_id, input_tokens, output_tokens, cost, duration, output_lines JSONB
process_templates | Pipeline definitions | business_model, request_type, phases JSONB
Session output structure

Session output stored as JSONB arrays of {type, content, timestamp} objects. Enables post-hoc analysis, cost attribution per phase, and learning extraction. Learnings are written back to the originating harness type on the mesh.
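The structure described above can be sketched as follows; `log_line` and `cost_per_phase` are hypothetical helpers illustrating the {type, content, timestamp} record shape and per-phase cost attribution, not the real pipeline code:

```python
from datetime import datetime, timezone

def log_line(output_lines: list, type_: str, content: str) -> None:
    """Append one record in the shape stored in sessions.output_lines (JSONB)."""
    output_lines.append({
        "type": type_,
        "content": content,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

def cost_per_phase(sessions: list[dict]) -> dict[str, float]:
    """Post-hoc cost attribution: sum session cost per phase."""
    totals: dict[str, float] = {}
    for s in sessions:
        totals[s["phase_id"]] = totals.get(s["phase_id"], 0.0) + s["cost"]
    return totals

lines: list = []
log_line(lines, "stdout", "Running tests...")

sessions = [
    {"phase_id": "build", "cost": 0.12},
    {"phase_id": "build", "cost": 0.08},
    {"phase_id": "e2e",   "cost": 0.30},
]
totals = cost_per_phase(sessions)   # cost rolled up per pipeline phase
```

Because every output line carries its own type and timestamp, the same JSONB array serves streaming display, cost review, and later learning extraction without separate logs.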

Strategic deep dive: Workflow as product feature

How It Works

  1. Request arrives -- from any source (web UI, mobile, CLI, API)
  2. Template selected -- based on business model + request type, selects relevant harness types on the mesh
  3. Phases execute -- each phase gets an assigned agent that queries the mesh for context
  4. Artifacts generated -- each phase produces deliverables
  5. Review and done -- user reviews output, request marked complete
Three ways to run

Human-in-the-loop: Review after each phase (critical work). AI-in-the-loop: Phases auto-advance (routine tasks). Prototype: Simulates without running agents (testing new templates).

🏭

Think of it as an assembly line for work. When you ask the system to do something, it breaks the work into steps. Each step has a specialist worker (agent) assigned to it. The worker consults the right knowledge types on the mesh to do its job well.

You can choose how much control you want: review each step (the system pauses and asks you to approve), or let it run (the system completes all steps automatically).

Scale Tiers

The methodology stays the same at every scale. What changes is the implementation — how knowledge is stored, how harnesses communicate, and how the mesh is managed. Files work for one person. Databases work for a team. Federated APIs work for an enterprise. The four types (build, product, operations, domain), the internal structure (knowledge, rules, workflows, learnings), and the session lifecycle are identical at every tier.

SCALE TIERS — SAME METHODOLOGY, DIFFERENT IMPLEMENTATION

Tier | Scope | Stack | Communication | Cost
Tier 1 (Solo / Local) | 1 person, 1-3 projects | CLAUDE.md + rules/, Markdown files, Git for versioning | File imports (@rules), copy-paste between repos | $0 (just files)
Tier 2 (Solo / Database) | 1 person, 3-10 projects | PostgreSQL (Neon branches), CNS schema (37 tables), MCP server per harness | MCP (local processes), mesh manager (lazy connect) | ~$20/mo (Neon Pro)
Tier 3 (Team / Production) | 2-20 people, multi-team | Hosted DB + remote MCP, auth + access control, multi-tenant isolation | Remote MCP (Streamable HTTP), API gateway + service mesh | ~$200/mo (infra + DB)
Tier 4 (Enterprise / Federated) | 20+ people, departments | Federated DB clusters, SSO + RBAC + audit logs, cross-org mesh federation | Federated APIs + event bus, cross-mesh learning sync | $2K+/mo (platform)

Tier 1 — Files & Conventions (Solo / Local)

When: 1 person, 1-3 projects, starting out or experimenting.

Storage: Markdown files in the repo. CLAUDE.md at the root, .claude/rules/ for domain rules, docs/ for decisions and specs. Knowledge lives in files, organized by convention.

Harness types in practice:

  • Build: CLAUDE.md + .claude/rules/ — coding standards, workflow, architecture
  • Product: docs/ARCHITECTURE.md, docs/phases/ — specs, roadmap, decisions
  • Operations: docs/domain/ — domain knowledge, process descriptions
  • Domain: App databases (Firebase, Postgres) — runtime user data, not in files

Communication: File imports (@.claude/rules/testing.md), copy shared rules between repos manually.
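A Tier 1 root file might look like the fragment below. This is an illustrative sketch: the @path import lines follow the convention mentioned above, and `architecture.md` and `docs/DECISIONS.md` are hypothetical file names:

```
# CLAUDE.md — Tier 1 build harness (illustrative)

@.claude/rules/testing.md
@.claude/rules/architecture.md

## Workflow
- Red → green → refactor (TDD)
- Record decisions in docs/DECISIONS.md before writing code
```

Everything a Tier 2 database stores as rows starts life here as a file section, which is why the move up a tier is a storage change, not a methodology change.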

Session lifecycle: Implicit. The agent reads files on startup, you update files manually after decisions. No formal start/end session.

Mesh: No mesh. Each project is isolated. Cross-project knowledge is manual copy-paste.

Move to Tier 2 when: You have 3+ projects and find yourself copying rules between repos, or you need knowledge to compound across sessions (learnings from project A should help project B automatically).

Tier 2 — Database & MCP (Solo / Power User)

When: 1 person, 3-10 projects, knowledge needs to compound and flow between contexts.

Storage: PostgreSQL with the CNS schema. Each harness instance gets its own database or branch (Neon copy-on-write branches are ideal). Knowledge is structured, queryable, and embeddable (VECTOR(1536) for semantic search).

Harness types in practice:

  • Build: cortex_chunks (coding standards, patterns), spine_rules (triggers for workflow enforcement), spine_workflows (8-phase dev process)
  • Product: cortex_chunks (architecture, specs), scoped by project_slug per product
  • Operations: Dedicated branches per domain (skydive-harness, fitness-harness) with domain-specific rules and workflows
  • Domain: fed_* tables or per-app databases (Firebase Firestore, etc.)

Communication: MCP (Model Context Protocol). One Python server per harness instance, spawned on demand. Mesh manager handles lazy connect, idle eviction, stale retry.
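The lazy-connect and idle-eviction behavior can be sketched in a few lines. `MeshManager` here is a hypothetical simplification (the real manager also handles stale retry); `connect_fn` stands in for spawning the per-harness MCP server process:

```python
import time

class MeshManager:
    """Lazy-connect cache of per-harness connections with idle eviction."""

    def __init__(self, connect_fn, idle_ttl: float = 300.0, clock=time.monotonic):
        self.connect_fn = connect_fn        # spawns a server for a harness name
        self.idle_ttl = idle_ttl            # seconds before an idle conn is dropped
        self.clock = clock
        self.connections: dict[str, tuple[object, float]] = {}

    def get(self, harness: str):
        """Return a live connection, spawning one only on first use."""
        conn, _ = self.connections.get(harness, (None, 0.0))
        if conn is None:
            conn = self.connect_fn(harness)
        self.connections[harness] = (conn, self.clock())   # refresh last-used
        return conn

    def evict_idle(self):
        """Drop connections unused for longer than idle_ttl."""
        now = self.clock()
        for name, (_, last) in list(self.connections.items()):
            if now - last > self.idle_ttl:
                del self.connections[name]

# Usage with a fake clock to show eviction deterministically.
clock = [0.0]
mgr = MeshManager(lambda name: f"conn:{name}", idle_ttl=10.0, clock=lambda: clock[0])
conn = mgr.get("skydive-harness")   # spawned on first use, cached after
clock[0] = 25.0
mgr.evict_idle()                    # idle 25s > 10s TTL, so the connection is dropped
```

Lazy spawning is what lets one machine serve 18 harness instances: only the handful of harnesses a session actually touches ever hold a live process.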

Session lifecycle: Explicit. start_session() loads handoff + rules. end_session() persists decisions and learnings. Knowledge compounds across sessions.

Mesh: Local mesh. Harness instances connected via MCP on the same machine. Cross-harness queries logged in mesh_transactions.

This is Marco's current tier. 6 apps, 18 harness instances, 10 Neon branches, local MCP mesh.

Move to Tier 3 when: A second person needs access to the mesh, or you're deploying harness-backed agents in production (real users, not just dev-time).

Tier 3 — Remote MCP & Multi-Tenant (Team / Production)

When: 2-20 people, multiple teams, production agents serving real users.

Storage: Hosted PostgreSQL (Neon, Supabase, RDS) with per-tenant branch isolation. Connection pooling (PgBouncer). Automated backups and point-in-time recovery.

Harness types in practice: Same schema as Tier 2, but with access control. Each team member has scoped access. Build harness is shared (company-wide standards). Product harnesses are team-scoped. Operations harnesses are department-scoped. Domain harnesses are user-scoped with row-level security.

Communication: Remote MCP servers via Streamable HTTP transport. API gateway for auth and rate limiting. Service discovery so agents can find harness instances.

Session lifecycle: Same protocol, but with auth context. Sessions carry user identity. Learnings are attributed. Conflict resolution for concurrent sessions on the same harness.

Mesh: Remote mesh. Harness instances are services, not local processes. Multiple agents can query the same mesh simultaneously. This is what cortex.ai tenants use.

New requirements:

  • Authentication — who is querying which harness?
  • Authorization — role-based access to harness instances
  • Observability — dashboards for mesh health, query latency, learning accumulation
  • Versioning — harness schema migrations across tenants

Move to Tier 4 when: Multiple departments need independent meshes that share learnings, or you're federating across organizations (partner meshes).

Tier 4 — Federated Mesh (Enterprise / Multi-Org)

When: 20+ people, multiple departments or organizations, cross-mesh learning is a competitive advantage.

Storage: Federated database clusters. Each department or partner gets their own database cluster. Schema is enforced by the methodology but storage is independent. Data sovereignty respected — no cross-org data copying without explicit consent.

Harness types in practice: Same four types, but meshes are independently operated. A company mesh has build + product + operations + domain. A partner mesh has operations + domain. Shared learnings flow through a federation protocol, not direct database access.

Communication: Federated APIs + event bus. Cross-mesh learning sync via pub/sub (learnings with high transferability_score are published to a shared topic). Each mesh subscribes to relevant topics. MCP is still used within a mesh; APIs are used between meshes.
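The pub/sub sync described above can be sketched as follows. This is a minimal illustration: `FederationBus` and `maybe_federate` are hypothetical names, topic strings are invented, and the 0.7 threshold follows the queries mentioned later in this document:

```python
class FederationBus:
    """Minimal in-process pub/sub stand-in for the cross-mesh event bus."""

    def __init__(self):
        self.subscribers: dict[str, list] = {}

    def subscribe(self, topic: str, handler) -> None:
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, learning: dict) -> None:
        for handler in self.subscribers.get(topic, []):
            handler(learning)

def maybe_federate(bus: FederationBus, learning: dict, threshold: float = 0.7) -> None:
    """Only highly transferable learnings leave their home mesh."""
    if learning["transferability_score"] > threshold:
        bus.publish(learning["topic"], learning)

bus = FederationBus()
received = []
bus.subscribe("training", received.append)   # a partner mesh subscribes to one topic

maybe_federate(bus, {"topic": "training", "insight": "warm-ups improve scores",
                     "transferability_score": 0.92})   # published
maybe_federate(bus, {"topic": "training", "insight": "local quirk",
                     "transferability_score": 0.30})   # filtered out, stays mesh-local
```

The filtering happens on the publishing side, which is what keeps federation consent-driven: a mesh decides what leaves, subscribers only decide what they listen to.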

Session lifecycle: Same protocol. Sessions are mesh-local. Cross-mesh interactions are async (learning sync, not real-time queries).

Mesh: Federated mesh. Multiple independent meshes that share learnings through a controlled protocol. Each mesh is autonomous. The federation layer adds cross-mesh intelligence without coupling.

New requirements:

  • SSO + RBAC — enterprise identity, department-level permissions
  • Audit logging — who accessed what, when, for compliance
  • Federation protocol — how learnings flow between meshes (consent, filtering, attribution)
  • Schema governance — methodology evolution across independent meshes
  • Multi-region — data residency requirements per mesh

Status: Designed but not built. This is the target architecture for cortex.ai at scale — each tenant is a Tier 3 mesh, federation between tenants is Tier 4.

What Never Changes Across Tiers

Four Types

Build, Product, Operations, Domain. At every scale. A file-based build harness and a database-backed build harness serve the same purpose — they store creation knowledge.

Internal Structure

Knowledge, rules, workflows, learnings. At Tier 1 it's markdown sections. At Tier 2+ it's database tables. Same structure, different storage.

Session Lifecycle

Start (load context) → work → end (persist learnings). At Tier 1 it's reading files. At Tier 2+ it's start_session() / end_session(). Same pattern.

Tier mapping: what tools at each scale
Component | Tier 1 (Files) | Tier 2 (DB+MCP) | Tier 3 (Team) | Tier 4 (Enterprise)
Knowledge store | Markdown files | cortex_chunks + pgvector | Hosted Postgres + pgvector | Federated Postgres clusters
Rules engine | .claude/rules/ files | spine_rules table | spine_rules + triggers | spine_rules + policy engine
Workflows | docs/ markdown | spine_workflows JSONB | spine_workflows + scheduler | spine_workflows + orchestrator
Learnings | Manual notes | learnings table | learnings + scoring | learnings + cross-mesh sync
Communication | File imports | Local MCP (stdio) | Remote MCP (HTTP) | MCP + federated APIs
Auth | None (local) | None (single user) | JWT + RBAC | SSO + RBAC + audit
Isolation | Repos/folders | DB branches + slugs | Tenant branches | Separate clusters
Agents connect via | Reading files | MCP tools | Remote MCP tools | API + MCP

Real-World Adoption Path

Where I am right now — honestly

I'm at Tier 2 personally (database + MCP, 6 apps, 18 harness instances). I'm beginning to apply this at my company (legal tech — wills, trusts, POA automation) through a series of workshops. The goal: move the dev team from Tier 1 to Tier 2, then expand to company-wide Tier 3. Here's the actual plan and current progress.

Current Position on the Scale

Context | Current Tier | What Exists | What's Next
Marco's personal projects | Tier 2 | 6 apps, 18 harness instances, 10 Neon branches, local MCP mesh, build.ai orchestrating agents | Compound learning metrics, cross-mesh query patterns
cortex.ai tenants | Tier 2→3 | 2 tenants (Lake Deck hospitality, Aluminex manufacturing), per-tenant isolation | Remote MCP, more tenants, cross-tenant learning
Company (legal tech) | Tier 1 | Developers using AI tools with basic prompts; no structured outer harness; no shared knowledge | Workshop series → team Build harness → company-wide mesh

The Workshop Sequence — How You Actually Adopt This

This is the sequence being used at the legal tech company. It's designed to be repeatable for any team.

Workshop 1 — Inner Harness Basics (COMPLETED)

Teach the team what the inner harness IS. How AI tools work as thin runtime connectors — accept context, call model, route tools. Demo Claude Code, Copilot, and custom agents. Key takeaway: the inner harness is a solved problem. These tools exist, they're getting better every month, and the team should use them.

Outcome: Team understands that using AI tools effectively isn't about which tool — it's about what you give the tool to work with.

Workshop 2 — Outer Harness Concepts (IN PROGRESS)

Teach structured knowledge AND process definitions that make agents effective. Different kinds of outer harness content: knowledge, rules, workflows, learnings, process definitions. Key takeaway: the differentiation is in the outer harness — the full intelligence, not the thin connector. A well-organized outer harness makes ANY inner harness dramatically more effective.

Outcome: Team sees the gap between "using Claude with no context" and "using Claude with structured knowledge." The difference is visible and compelling.

Workshop 3 — My Outer Harness as Demo (PLANNED)

Show my actual personal dev workflow harness to the team. Live demo: session lifecycle, knowledge persistence, how agents improve over time. Then two critical exercises:

  • Identify what applies to the team — Which parts of my personal outer harness would make the team more effective? Coding standards, architecture patterns, testing rules, CI/CD workflows.
  • Identify what needs to change — What's personal vs team? What needs multi-user support? What's missing for the legal domain?

Outcome: Team sees a working Tier 2 system and identifies what they want for themselves.

Workshop 4+ — Team Development Harness (PLANNED)

Build the team's outer harness together. Start with a Build harness (dev workflow, coding standards, architecture rules). Then Product harnesses per project. The team co-creates this — it's not imposed from above, it's built from what they agree makes them more effective.

Outcome: Team has their own Tier 1 outer harness (files/conventions). Foundation for moving to Tier 2.

Expansion — Company-Wide harness.os (FUTURE)

Scale from dev team harness to department harnesses to company mesh. The legal domain is a strong fit for Operations harnesses: structured processes, compliance requirements, document workflows, approval chains. Eventually: the company's legal processes become Operations harnesses that any inner harness can plug into.

End state: Dev team at Tier 2 (DB + MCP), company at Tier 3 (remote MCP, multi-team). harness.os methodology validated at real company scale.

Why This Sequence Works

Show, don't pitch

Workshop 3 shows a working system, not slides. People believe what they see.

Co-create, don't impose

The team builds their own harness. Adoption happens because they chose it, not because it was mandated.

Start at Tier 1

Files first. No infrastructure needed. The team can start tomorrow with CLAUDE.md files. Move to database when the need is obvious.

Dev team first

Prove it where you have control before expanding to departments where you need buy-in.

What I'm NOT claiming

I'm not claiming this is finished, or that it works for everyone, or that the company needs to adopt the entire methodology. I'm saying: here's what I built for myself (Tier 2), here's what I learned, and here's what I think could help us as a team. The methodology scales down to "just use CLAUDE.md files effectively" and scales up from there. You take what's useful.

The Journey

Why this section exists

harness.os was not designed top-down. It was discovered bottom-up, through months of building real products with AI. Every concept emerged from a real problem. This is the trajectory — the sequence of realizations that turned scattered files into a methodology.

Phase 1 — Three Projects, Files Everywhere

Got an AI coding subscription. Started building three personal projects simultaneously — a skydiving app, a fitness app, and a finance app. Discovered that the AI tool needed structured context to be useful. Started organizing with files: CLAUDE.md at the root, rules folders, decision docs. Tier 1 of the methodology, discovered by necessity.

Phase 2 — The Artificial Anatomy

The files started to have structure: knowledge storage (cortex), structural rules (spine), workflows (nervous system), learnings (memory). Named it the "Artificial Anatomy of AI" — a CNS (Central Nervous System) metaphor. The internal harness structure was born.

Phase 3 — Dashboards & Personal Assistant

Built dashboards (build.ai) to control all projects from one place. Created a personal assistant (marco.ai) to read and edit those files. The files were now connected — decisions in one project could reference learnings from another. The mesh concept started forming, even without the name.

Phase 4 — Cross-App Data & Multi-Tenant Vision

Realized the data across all three apps should be structured so an assistant could read and write across all of them. Also realized that apps need more dynamic, feedback-driven development — metrics, usage data, and user requests through assistants should drive improvement. Conceived cortex.ai: the same brain, packaged for any company. The config/mesh distinction emerged — one methodology, many deployments.

Phase 5 — The Outer Harness Wins

Started building custom agents. Tested them against existing agents (Claude Code, Copilot). Found that the custom agent kept getting smaller — the runtime connector (inner harness) mattered less and less. Most of "how it runs" belonged in the outer harness as rules and workflows. What mattered was the full intelligence (outer harness) — knowledge AND process definitions — that any thin connector could plug into. Key insight: the real value is persistent, structured intelligence that outlives any specific AI model.

Phase 6 — Process Improvement Is the Real Game

Software exists to improve processes. AI is a new element that helps with what we've always done — but now we need to structure software better for this new element to participate. It needs to store things, retrieve things, learn, and improve processes from the inside. Continuous process improvement is now real, and AI accelerates it. harness.os is not a product — it's a methodology for organizing AI knowledge so processes continuously improve.

Phase 7 — The Four Types Crystallize

All knowledge falls into four categories of process: creation (build), discovery and specification (product — WHY + WHAT), domain operations (operations), and user data (domain — WHO). Product includes the full lifecycle: discovery → building → maintaining → archived, with continuous validation throughout. These four cover most of what people and companies need. They're customizable at each level, plug-and-play. New products are just new compositions of these types. The type system was the last piece — making harness.os a complete, composable methodology.

Now — Validation at Scale

The methodology works at Tier 2 (database + MCP, single user, 6 apps). Next: apply it at a real company (legal tech, 20+ people), validate Tier 3, and prove that the four types hold across industries. The hypothesis: the methodology is universal. The implementation scales. The types are complete. This is being tested, not assumed.

The Compound Effect

The three-layer model creates a compounding flywheel. The methodology enables configs. Configs spawn meshes. Meshes generate learnings. Learnings with high transferability flow back through the methodology's knowledge flow patterns, making every other mesh smarter. More meshes = more data = more intelligence = more meshes.

Compound Intelligence
"Cross-domain learning"
A pattern discovered in way2fly -- "athletes who warm up with way2move score 30% higher on progression checks" -- gets surfaced to cortex.ai tenant meshes as a cross-domain insight, and marco.ai uses it to suggest pre-session routines. Knowledge flows across meshes.
The flow, step by step:

  1. way2fly — pattern detected: "Athletes who warm up with way2move score 30% higher on progression checks." Sample size: 47 sessions. Confidence: 94%.
  2. New learning — written to the harness, tagged cross-domain, and propagated through the mesh.
  3. cortex.ai — cross-mesh insight received: "Pre-activity warm-ups improve performance 30%." Marked transferable.
  4. marco.ai — suggestion surfaced: "Do a 15-min way2move warm-up before your tunnel session tomorrow," based on the 47-session study.
Under the hood: The skydive-harness learnings table accumulates session outcomes correlated with pre-session activities via mesh_transactions. When the pattern reaches statistical significance, it's written with transferability_score=0.92. cortex.ai tenant meshes with process_type='training' inherit the generalized form. marco.ai's assistant queries learnings WHERE transferability_score > 0.7 for proactive suggestions.
Data flow: Learning record INSERT in skydive-harness: {category: 'cross-domain', insight: '...', transferability_score: 0.92}. Cross-mesh query: cortex.ai operations harness reads learnings WHERE transferability_score > 0.7. marco-personal-harness reads same learnings for proactive suggestions. No data copying -- federated reads across mesh branches.
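The read side of this flow can be sketched as a simple filter; `transferable_learnings` is a hypothetical assistant-side helper mirroring the `learnings WHERE transferability_score > 0.7` query over federated reads:

```python
def transferable_learnings(rows: list[dict], min_score: float = 0.7) -> list[dict]:
    """Keep only learnings transferable enough to act on outside their home mesh.

    Mirrors: SELECT * FROM learnings WHERE transferability_score > 0.7
    """
    return [r for r in rows if r["transferability_score"] > min_score]

rows = [
    {"category": "cross-domain", "insight": "pre-activity warm-ups improve performance",
     "transferability_score": 0.92},
    {"category": "app-specific", "insight": "tunnel camera angle preference",
     "transferability_score": 0.20},
]
suggestions = transferable_learnings(rows)   # only the cross-domain insight survives
```

Both consumers (the cortex.ai operations harness and marco.ai's assistant) apply the same threshold to the same rows, which is how one discovery feeds two very different surfaces without any data copying.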
Why this matters: A pattern discovered in one personal app (way2fly) becomes intelligence that benefits business customer meshes (cortex.ai tenants) and the personal assistant (marco.ai). Knowledge flows across the entire ecosystem of meshes. This is the compound learning moat.
What just happened: Your skydiving app noticed that people who stretch before flying do much better. That insight was shared with businesses using the system (a hotel could apply it: "staff who do morning briefings perform 30% better"). And your personal assistant now reminds you to warm up before your next session. One discovery, multiple running systems benefit.
Technical deep dive: Compound learning mechanics

Today

  • 4 harness types, 18 instances across 10 Neon branches
  • build.ai web UI -- Build + Product mesh scope
  • Federated domain data in main DB
  • cli-spawned + api-built agents
  • Single-user (Marco) mesh

Tomorrow

  • + cortex.ai: Operations + Domain as tenant meshes
  • + way2do.ai: Domain as consumer mesh subscriptions
  • + marco.ai: All 4 types (mobile + web meshes)
  • + Per-app domain harnesses (Neon Pro)
  • + Cross-mesh learning federation
  • + Anyone can define their own config and spawn a mesh
The composition thesis through the three-layer lens

Today: one config, one primary mesh. Tomorrow: the config itself becomes configurable. cortex.ai already demonstrates this -- each tenant runs their own mesh. The end state: the methodology is the specification, configs are published and forkable, and meshes are provisioned on demand. Anyone can define which harness types they need, and the platform provisions a mesh automatically.

Strategic deep dive: The compound moat
The strategic bet

The three-layer model is the moat. The methodology is the IP. The config creates the product portfolio. The meshes generate compound value. Personal app meshes prove each vertical. cortex.ai meshes package it for B2B. way2do.ai meshes bundle it for B2C. build.ai is the factory that creates configs and provisions meshes. New products are new compositions in the config. New customers are new meshes. The factory feeds itself.

The long game

Today, configs are predefined. Tomorrow, the config becomes a product feature. Imagine: a cortex.ai customer choosing which harness types to activate for their mesh. Or a way2do.ai user selecting which domain harnesses to subscribe to. The three-layer model makes this inevitable -- methodology provides the rules, config provides the options, mesh provides the running system.

🚀

Where this is headed: Today, one person runs one system with 6 connected apps. Tomorrow, anyone can create their own version. Businesses can pick which operational knowledge they need and get a running system (mesh) in days. Consumers can pick which apps to connect and get a personal assistant that understands their whole life. The playbook (methodology) stays the same. The game plans (configs) multiply. The running systems (meshes) compound in value.

Why This Works

Every claim in this documentation has been tested against the hardest question: 'Is this actually different, or just complicated?' The three-layer model (methodology, config, mesh) is the answer -- it separates the universal from the specific from the running, making the system both principled and practical.

How harness.os compares to existing AI orchestration platforms -- and why the three-layer separation matters.

Competitive Landscape

Platform | What It Does | What It Lacks
CrewAI | Multi-agent orchestration with role-based agents and task delegation | No persistent knowledge layer. No methodology/config/mesh separation. Agents start blank every run.
LangChain / LangGraph | LLM application framework with chains, agents, memory, and graph-based workflows | Toolkit, not a methodology. Knowledge embedded in code, not composable. Memory is per-conversation, not per-domain.
AutoGen (Microsoft) | Multi-agent conversation framework for collaborative task solving | Research-oriented. No persistent learning. No config/mesh distinction. No multi-product composition.
Dify / FlowiseAI | Visual AI workflow builders with drag-and-drop pipeline design | Single-tenant, single-app. No knowledge mesh. No cross-domain reasoning. No compound learning.
harness.os | Three-layer AI knowledge platform: universal methodology, portable configs, running mesh instances with compound learning | Early stage. Single-user origin. Economics thesis unproven at scale.
Technical deep dive: What's architecturally original

What's Architecturally Original

Three-layer separation

No other platform separates methodology from config from runtime mesh. This separation means the universal principles (CNS schema, type system, session lifecycle) are independent of any specific tech stack, domain, or deployment. Configs are portable. Meshes are independent. This is the architectural innovation that makes everything else possible.

Persistent knowledge mesh vs. stateless agents

Every competitor treats agents as stateless executors. harness.os inverts this -- each harness instance on the mesh is a persistent knowledge store. Agents don't start blank. They start with everything the mesh knows. A manufacturing safety review agent in a cortex.ai mesh has access to every safety learning from every previous review -- automatically.

Branch-level isolation vs. namespace tricks

Multi-tenancy in most AI platforms means filtering by tenant ID. harness.os uses Neon branch-level isolation -- each harness instance can be a physically separate database branch. This is PostgreSQL-native isolation, not application-layer filtering. Each tenant mesh is genuinely independent.
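The difference between the two isolation models can be sketched in a few lines. This is a minimal illustration, not the actual mesh connection manager: the branch naming scheme, tenant slugs, and DSN format are all hypothetical. The point is structural -- isolation here is a separate connection string per physical branch, not a `WHERE tenant_id = ?` filter on shared tables.

```python
# Sketch: routing each tenant mesh to its own Neon branch DSN.
# All names here are illustrative assumptions, not real endpoints.

BRANCH_DSNS = {
    # one physically separate database branch per tenant mesh
    "lake-deck": "postgresql://app@ep-lake-deck.neon.tech/cns",
    "aluminex": "postgresql://app@ep-aluminex.neon.tech/cns",
}

def dsn_for_tenant(tenant_slug: str) -> str:
    """Resolve a tenant mesh to its dedicated branch; fail loudly if unknown."""
    try:
        return BRANCH_DSNS[tenant_slug]
    except KeyError:
        raise LookupError(f"no branch provisioned for tenant {tenant_slug!r}")

print(dsn_for_tenant("aluminex"))
```

A query against one DSN physically cannot see another tenant's rows, which is the "genuinely independent" claim above.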

Technical Risks -- Acknowledged

Risk: "One person built this"

The three-layer model is the answer. The methodology reduces complexity because products are compositions of the same config, and meshes are instances of the same patterns. A new SWE works on one MCP server that powers everything, not 6 separate apps.

Risk: "Neon free tier limits (10 branches)"

Config constraint, not methodology constraint. Neon Pro ($19/mo) unlocks unlimited branches. The mesh connection management was designed for hundreds of branches from day one.

Risk: "MCP is young"

The harness-os-mcp server is a thin Python layer over standard PostgreSQL. If MCP evolves, the mesh adapter changes -- the methodology's schema, data, and knowledge don't. The 37 database tables are the real asset.
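The "thin adapter" claim can be made concrete with a sketch: the protocol layer only translates tool calls into plain storage operations, so swapping protocols leaves the data layer untouched. Every function, class, and tool name below is illustrative, not the real harness-os-mcp API.

```python
# Sketch of the thin-adapter claim: two different protocol front-ends
# over the same storage operation. Names are hypothetical.

def save_learning(store: dict, harness: str, text: str) -> int:
    """Storage-layer operation: append a learning row, return its count."""
    rows = store.setdefault(harness, [])
    rows.append(text)
    return len(rows)

class McpAdapter:
    """Translates an MCP-style tool call into a storage call."""
    def __init__(self, store): self.store = store
    def call_tool(self, name, args):
        if name == "log_learning":
            return save_learning(self.store, args["harness"], args["text"])
        raise ValueError(name)

class HttpAdapter:
    """A hypothetical replacement protocol; same storage underneath."""
    def __init__(self, store): self.store = store
    def post(self, path, body):
        if path == "/learnings":
            return save_learning(self.store, body["harness"], body["text"])
        raise ValueError(path)

store = {}
McpAdapter(store).call_tool("log_learning", {"harness": "way2fly", "text": "x"})
HttpAdapter(store).post("/learnings", {"harness": "way2fly", "text": "y"})
print(store)  # both protocols wrote to the same data layer
```

If MCP's transport changes again, only the adapter class is rewritten; the stored knowledge is never migrated.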

Data deep dive: What's original about the data layer

What's Original About the Data Architecture

Knowledge as first-class relational data

In every other AI platform, knowledge is embedded in prompts or stored as opaque embeddings. harness.os treats knowledge as first-class relational data in normalized PostgreSQL tables. You can run SQL analytics on the knowledge itself: "Which harnesses on this mesh have the most learnings? Which rules are referenced most?" This works across all meshes because the methodology defines the schema.
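The "SQL analytics on knowledge" claim can be sketched directly, here with SQLite standing in for PostgreSQL so the snippet is self-contained. The table and column names (`learnings`, `harness_slug`) are assumptions about the CNS schema, not its actual definition.

```python
# "Which harnesses on this mesh have the most learnings?" as plain SQL,
# using SQLite as a stand-in for PostgreSQL. Schema is hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE learnings (harness_slug TEXT, content TEXT)")
db.executemany(
    "INSERT INTO learnings VALUES (?, ?)",
    [("way2fly", "a"), ("way2fly", "b"), ("build-ai", "c")],
)

rows = db.execute(
    """
    SELECT harness_slug, COUNT(*) AS n
    FROM learnings
    GROUP BY harness_slug
    ORDER BY n DESC
    """
).fetchall()
print(rows)  # [('way2fly', 2), ('build-ai', 1)]
```

Because knowledge is rows, not embeddings, the same aggregate runs unchanged on any mesh that follows the methodology's schema.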

Cross-mesh compound data effects

When a way2fly mesh produces a learning, and that learning generalizes (high transferability_score), it becomes available to other meshes following the same methodology. This isn't RAG -- it's structured relational data across isolated branches with explicit cross-mesh query patterns. The data compounds because it's structured, not because it's embedded.
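The transfer rule described above can be sketched as a simple filter: a learning is eligible to flow to sibling meshes when its transferability_score clears a threshold. The threshold value and record shape here are assumptions for illustration, not the production rule.

```python
# Sketch of cross-mesh eligibility: high-transferability learnings
# generalize; mesh-specific ones stay local. Threshold is hypothetical.

TRANSFER_THRESHOLD = 0.8  # assumed cutoff, not the real value

def transferable(learnings: list[dict]) -> list[dict]:
    """Learnings eligible to flow from one mesh to sibling meshes."""
    return [l for l in learnings if l["transferability_score"] >= TRANSFER_THRESHOLD]

mesh_learnings = [
    {"id": 1, "text": "airport-specific gate quirk", "transferability_score": 0.2},
    {"id": 2, "text": "onboarding checklists cut errors", "transferability_score": 0.9},
]
print([l["id"] for l in transferable(mesh_learnings)])  # [2]
```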

Data Risks -- Acknowledged

Risk: "Branch-per-harness doesn't scale"

Neon branches are copy-on-write with shared storage. 100 cortex.ai tenant meshes share most storage cost. Marginal cost per mesh drops, not rises.

Risk: "No proven compound learning metrics yet"

True. The mesh_events and harness_budgets tables are in the schema but the metrics pipeline isn't built yet. This is the next major data engineering initiative -- greenfield work on a novel three-layer architecture.

Strategic deep dive: Competitive differentiation

What's Genuinely Differentiated

The three-layer business model

The methodology is IP (teachable, licensable). Configs are strategies (forkable, portable). Meshes are running businesses (scalable, independent). No other AI platform has this separation. CrewAI, LangChain, and AutoGen are toolkits -- they don't separate universal principles from specific implementations from running instances.

Configs spawn meshes spawn value

cortex.ai demonstrates this: one config (operations + domain for B2B), multiple meshes (Lake Deck = hospitality, Aluminex = manufacturing). Each mesh is independent but shares learnings through the methodology. More meshes = more compound intelligence = more value per mesh. The three-layer model makes this scaling pattern natural.

Strategic Risks -- Acknowledged

Risk: "The economics thesis is unproven"

The first 2-3 cortex.ai tenant meshes will prove or break it. If the second mesh takes less than 50% of the first mesh's setup effort, the thesis holds.

Risk: "Too complex to pitch"

The 60-second version: "AI agents that learn from every task and share that knowledge across all our products. A hotel's AI learned something about onboarding? Now the factory's AI is better at onboarding too." Lead with the outcome, not the three layers.

The Recruiting Pitch

For a Data Engineer

"I've built a three-layer AI knowledge platform. You'd own the mesh data architecture: branch strategy, schema evolution, cross-mesh queries, learning accumulation metrics. Greenfield data engineering on a novel architecture."

For a Product Director

"I've built a methodology that ships a new AI product config in days. I need someone to turn configs into businesses -- prioritize which meshes to launch, define go-to-market, run customer discovery."

For a Software Engineer

"I've built an AI orchestration mesh that's architecturally different from anything in the market. Three-layer separation, persistent knowledge, branch-level isolation. Real systems engineering, not prompt wrangling."

For a Partner

"I've built the methodology and the factory. You bring the industry. We create a config, spawn a mesh, and your customers get AI-powered operations. You own the customer relationship. Knowledge flows both ways."

What Makes This Different

💡

Most AI tools are like hiring a consultant who forgets everything after each meeting. You explain your business, they help, they leave, and next time you start from scratch. harness.os is like building a team that remembers everything -- every insight, every decision, every lesson learned. And when you bring in a new team member (add a new app to the mesh), they start with everything the team already knows.

🏭

Real example: A hotel (Lake Deck) started using cortex.ai -- their own running system (mesh). The AI learned what makes good hospitality onboarding. Then a manufacturing company (Aluminex) got their own mesh. Their onboarding AI was already better because of what the hotel taught it. That's the three-layer compound effect: the playbook (methodology) enables multiple game plans (configs), each running a system (mesh) that makes every other system smarter.

Honest About What's Not Done Yet

It's early

The platform works and powers real products. But it's been built by one person. The architecture (three layers, four knowledge types) is solid. The team needs to grow.

The business model is being proven

Two real businesses are using cortex.ai -- each running their own mesh. The theory is that each new mesh makes the platform cheaper and smarter. The trend so far is positive.

Why honesty is a strength

This platform shows you everything -- the playbook, the game plan, the running system, what's built, what's not. That transparency means anyone who joins knows exactly what impact they'll have.

Honest Assessment

Why this section exists

Every framework sounds brilliant when it's only described by its creator. This section deliberately separates what's validated (proven, working, real), what's genuinely novel (no one else is doing this specific thing), and what's unproven (thesis only, needs more evidence). Read this before making any commitment.

What's Validated — It Works

The architecture runs real products

6 apps connected through the harness mesh. build.ai orchestrates agents. cortex.ai serves 2 real business tenants (Lake Deck, Aluminex). way2fly, way2move, way2save are functional consumer apps. This isn't a whiteboard idea — it's deployed code with real data flowing through it.

Evidence: Running Neon databases, 37-table schema, MCP server handling 27+ tools per harness, real sessions with real output.

MCP as the mesh protocol

MCP (Model Context Protocol) has become the industry standard for AI tool integration. 97M+ monthly SDK downloads. Adopted by OpenAI, Google, Microsoft, AWS, and every major AI lab. The bet on MCP as the mesh communication layer was correct — it's now the de facto protocol.

Evidence: MCP Streamable HTTP transport is production-ready. Major IDE integrations ship with MCP support. The ecosystem is growing faster than GraphQL did at the same stage.

Knowledge persistence across sessions

The CNS schema (knowledge tables, learning tables, rules, workflows) demonstrably makes agents better across sessions. Agents don't start blank — they start with accumulated knowledge. This is observable: session quality improves as harness knowledge grows.

Evidence: Every Claude Code session in this ecosystem starts with harness context. The difference between a first session and a 50th session on the same harness is dramatic.
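The "agents don't start blank" mechanic amounts to assembling slug-scoped context at session start. A minimal sketch, assuming record shapes that are illustrative rather than the real CNS tables:

```python
# Sketch: assemble the accumulated knowledge, rules, and learnings for
# one harness slug into the context a new session starts with.

def harness_context(slug: str, tables: dict) -> dict:
    """Everything a new session begins with, scoped by harness slug."""
    return {
        kind: [row for row in rows if row["slug"] == slug]
        for kind, rows in tables.items()
    }

tables = {
    "knowledge": [{"slug": "way2fly", "text": "domain glossary"}],
    "rules": [{"slug": "way2fly", "text": "always confirm drop zone"}],
    "learnings": [{"slug": "way2move", "text": "belongs to another harness"}],
}
ctx = harness_context("way2fly", tables)
print({k: len(v) for k, v in ctx.items()})  # {'knowledge': 1, 'rules': 1, 'learnings': 0}
```

The 50th session differs from the first only because these lists have grown; the retrieval step itself never changes.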

The four harness types are natural

Build (creation), Product (management), Operations (domain ops), Domain (user data) — these four categories emerged organically from real development, not from theory. Every piece of knowledge encountered across 6 apps fits cleanly into one of these four types. No artificial forcing required.

Evidence: 18 harness instances across 10 Neon branches, all cleanly typed. No "miscellaneous" category needed.

Outer harness > inner harness

The insight that the full intelligence layer (outer harness — knowledge AND process) matters more than the thin connector (inner harness) has been validated repeatedly. Claude Code, Copilot, custom API agents — all perform dramatically better when connected to the same outer harness. The intelligence survives model changes, tool changes, even complete agent rewrites.

Evidence: Same harness.os-mcp server used by CLI-spawned, API-built, and third-party agents. Knowledge persists regardless of which model powers the agent.

Process improvement is the real game

Software exists to improve processes. AI is a new participant in that improvement. Structuring knowledge so AI can participate effectively in process improvement — this framing maps to decades of BPM (Business Process Management) literature. It's not a new idea, but applying it specifically to AI knowledge organization is timely and valid.

Evidence: BPM is a $16B+ market. Process mining (Celonis, etc.) proves companies pay for structured process improvement. AI participation is the obvious next step.

What's Genuinely Novel — No One Else Is Doing This

Three-layer separation (methodology / config / mesh)

No existing AI platform separates universal principles from specific implementations from running instances. CrewAI, LangChain, AutoGen, Dify — they're all single-layer: code is the config is the runtime. harness.os explicitly separates what's universal (the methodology), what's a specific strategy (the config), and what's running (the mesh). This is the core architectural innovation.

Closest parallel: Kubernetes separates specification (YAML) from runtime (cluster), but has no methodology layer. Docker Compose is similar. The three-layer idea extends this to AI knowledge.

Four-category type system for AI knowledge

The four-category decomposition (how to create, why + what to create, how to run domain operations, per-user data) emerged from practice building 7 products in 6 weeks, not from theory. It's the structure I use daily — not a claim that it's the only way to organize AI knowledge.

Risk: It may be too reductive. Some knowledge may not fit cleanly. But 18 instances across 6 apps haven't surfaced a fifth type yet.

Configs as portable AI strategies

The concept that your AI development workflow, architecture decisions, and domain knowledge can be packaged as a "config" — forkable, shareable, versionable, separate from both the principles that guide it and the runtime that executes it — doesn't exist in any current platform. cortex.ai's onboarding funnel is literally "create a new config, spawn a mesh."

Potential: If configs become a marketplace, this is the product-market fit moment.
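For a config to be forkable and shareable, it has to serialize to a plain artifact. A minimal sketch of what that might look like; every field name below is hypothetical, not the actual cortex.ai config format.

```python
# Sketch: a config as a serializable, forkable document. All keys are
# invented for illustration.
import json

config = {
    "name": "cortex-b2b",
    "harness_types": ["operations", "domain"],  # which of the four types are active
    "workflow": "8-phase",
    "database": {"provider": "neon", "branch_per_harness": True},
}

forked = json.loads(json.dumps(config))  # fork = deep copy, then edit
forked["name"] = "cortex-b2b-hospitality"
print(forked["name"], forked["harness_types"])
```

The original stays intact while the fork diverges, which is the minimum property a config marketplace would need.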

Scale tiers with methodology invariance

The explicit design that the same methodology works at every scale — from local files ($0) to federated enterprise mesh ($2K+/mo) — with only the implementation changing, not the principles. Most frameworks either target small teams or enterprises. harness.os claims to be both, with clear tier transitions and triggers for when to move up.

What's Unproven — Thesis Only, Needs Evidence

Compound learning across meshes

The thesis that learnings from one mesh make other meshes smarter — and that this compounds over time — is the biggest unproven claim. The mesh_events and harness_budgets tables are in the schema, but the metrics pipeline doesn't exist yet. There's no data showing learning transfer rates, no measurement of compound effects.

What would validate it: Measure time-to-productive for the 3rd cortex.ai tenant vs the 1st. If the 3rd takes <50% of the 1st's setup time, the thesis holds.

The economics at scale

The claim that marginal cost per mesh drops while value per mesh grows — the compound economics thesis — has only 2 data points (Lake Deck + Aluminex). That's not enough to prove a trend. Infrastructure costs, support burden, and knowledge curation effort at 50+ tenants are completely unknown.

What would validate it: 5-10 cortex.ai tenants with tracked per-tenant cost/revenue. If the marginal cost curve bends down, it's real.

Methodology portability to other people

harness.os has been developed and used by one person. The methodology has never been applied by someone else independently. The biggest risk: what seems "natural" and "clean" to the creator may be opaque to others. The four types, the CNS schema, the session lifecycle — all of this needs to survive contact with a second user.

What would validate it: One external developer follows the methodology without hand-holding. If they produce a working config and mesh, the methodology is real. If they can't, it's just one person's system.

The "AI Knowledge Engineering" category claim

The hypothesis that harness.os represents a specialization of what is emerging as "AI Knowledge Engineering" — specifically that it contributes a process categorization framework to the practice — is early-stage. The practice is being named independently by different players: KPMG calls it "knowledge engineering", Anthropic calls it "context engineering", Martin Fowler calls it "harness engineering". The convergence validates the need; the vocabulary hasn't settled yet. This claim should be held lightly.

Honest take: It's more likely that harness.os contributes ideas to the emerging practice than that it defines the category. That's still valuable — but frame it modestly.

37-table schema appropriateness

A 37-table schema for a single-user platform with 2 tenants may be over-engineered. The schema was designed for a future that hasn't arrived. If growth doesn't materialize, this is technical debt, not foresight. An engineer reviewing this would ask: "Do you use all 37 tables actively, or are half of them aspirational?"

Counter-argument: The schema defines the methodology's data model. Removing tables would mean removing methodology concepts. But some concepts may not earn their keep.

The Honest Scorecard

Claim | Status | Evidence
Persistent knowledge makes agents better | Validated | Observable in every session. Measurable quality difference.
Four harness types cover all AI knowledge | Validated | 18 instances, 6 apps, no fifth type needed.
Three-layer separation is novel | Validated | No competitor has this. Verified against CrewAI, LangChain, AutoGen, Dify.
MCP as mesh protocol | Validated | Industry standard. 97M+ monthly downloads. Universal adoption.
Outer harness outlives inner harness | Validated | Same knowledge, different agents. Proven across 3 agent types.
Configs are portable and forkable | Novel | Demonstrated via cortex.ai tenants. Not yet tested externally.
Compound learning across meshes | Unproven | Schema exists. Pipeline not built. Zero measurements.
Economics improve with scale | Unproven | 2 data points. Positive trend. Not statistically meaningful.
Methodology works for other people | Unproven | Single creator. No external validation yet.
This is "AI engineering" | Speculative | Category doesn't exist yet. May contribute ideas, unlikely to define it.

Research & Landscape

What exists in the market, what's adjacent, and where harness.os fits — based on actual research, not just claims.

Industry Validation: Others See the Same Problem

KPMG — "Knowledge Engineering for AI"

KPMG has built an entire consulting practice around "knowledge engineering" — structuring organizational knowledge so AI agents can use it effectively. Their thesis: the companies that structure their knowledge best will get the most value from AI. This directly validates harness.os's core insight.

The difference: KPMG sells consulting hours to do this manually for enterprises. harness.os is a methodology I created for myself, with a schema that makes the structuring systematic and repeatable. I'm sharing it because the approach might work for others too.

What this means: The problem is real and large enough for a Big 4 firm to build a practice around it. harness.os's approach (structured methodology instead of consulting) is differentiated but unproven at enterprise scale.

MCP Ecosystem — Protocol Validation

MCP (Model Context Protocol) has exploded since its launch. Key numbers:

  • 97M+ monthly SDK downloads (npm + pip combined)
  • Adopted by OpenAI, Google DeepMind, Microsoft, AWS, Meta
  • Streamable HTTP transport now production-ready (replaces SSE)
  • Every major IDE (VS Code, JetBrains, Cursor) supports MCP natively
  • Growing faster than GraphQL at the same stage of its lifecycle

What this means: Building on MCP was the right bet. The protocol will be around for years. harness.os's mesh can connect to any MCP-compatible client — which is becoming everything.

BPM & Process Mining — $16B+ Market

Business Process Management and process mining (Celonis, UiPath, Bizagi) represent a massive existing market built on the exact premise that structuring and improving processes creates business value. harness.os's "process improvement" framing isn't new — it's proven.

What IS new: applying it specifically to AI agent knowledge. Traditional BPM structures workflows for humans. harness.os structures knowledge so AI agents can participate in process improvement.

What this means: The market validates the problem and willingness to pay. harness.os extends the concept into AI — a natural evolution that BPM vendors will likely pursue too.

Memento-Skills — Persistent Agent Learning

Research into persistent agent memory (Memento-type approaches, skills-based agent learning) shows the AI research community is converging on the same insight: agents need persistent, structured knowledge to improve over time. Not just conversation memory — structured learnings that transfer across contexts.

What this means: harness.os's learning tables and transferability scores are aligned with where the field is heading. The specific implementation (relational tables with transferability_score) may be ahead of the research.

Detailed Competitive Landscape

Platform | What It Does | Overlap with harness.os | What harness.os Does Differently | Threat Level
CrewAI | Multi-agent orchestration. Role-based agents with task delegation and sequential/parallel workflows. | Agent roles, task decomposition, multi-agent coordination. Similar to harness.os's pipeline + phase system. | No persistent knowledge. No methodology layer. No type system. Agents start blank every run. Code IS the config. | Medium — could add persistence. But would need to invent the methodology layer.
LangChain / LangGraph | LLM framework. Chains, agents, memory, tools, RAG. LangGraph adds graph-based workflows. | Tool integration, memory, workflow orchestration. RAG is a form of knowledge retrieval. | Toolkit, not methodology. Knowledge in code, not composable. Memory per-conversation, not per-domain. No type system. | Medium — massive community. Could absorb similar ideas. But it's a toolkit philosophy, not a methodology.
AutoGen (Microsoft) | Multi-agent conversation framework. Agents talk to each other to solve tasks collaboratively. | Multi-agent collaboration, role specialization. | Research-oriented, conversation-first. No persistent learning. No config/mesh distinction. No multi-product composition. | Low — different paradigm (conversation vs knowledge mesh). Microsoft could pivot, but hasn't.
Dify / FlowiseAI | Visual AI workflow builders. Drag-and-drop pipeline design. RAG, agents, tools. | Pipeline design, tool integration, knowledge bases. | Single-tenant, single-app. No knowledge mesh. No cross-domain reasoning. No compound learning. Visual-first vs methodology-first. | Low — targets different user (no-code AI builders vs systems architects).
Notion AI / Guru / Glean | Knowledge management with AI. Search across company docs. AI-assisted writing and Q&A. | Knowledge structuring, cross-domain search, persistent knowledge stores. | These are knowledge RETRIEVAL. harness.os is knowledge STRUCTURE + REASONING + LEARNING. Documents vs relational schema. Read-only vs read-write-learn. | Low — different layer. Could be data sources TO a harness, not competitors.
Celonis / Process Mining | Extract process patterns from system logs. Visualize, optimize, automate business processes. | Process improvement, learning from execution, continuous optimization. | Process mining is retrospective analysis. harness.os is prospective: agents USE knowledge during execution, LEARN from it, and IMPROVE future executions. Different time orientation. | Medium — if they add AI agents that learn and execute, they'd have budget + customers + data. Watch this space.
harness.os | Three-layer AI knowledge platform. Methodology + portable configs + running mesh instances with persistent learning. | n/a | The combination: typed knowledge + persistent learning + three-layer separation + scale-invariant methodology. No one does all four. | Self — the biggest threat is not building fast enough.

What Could Kill This

AI models get so good they don't need knowledge

If future models have perfect memory, perfect reasoning, and perfect context windows — the outer harness becomes less valuable. Context windows have grown from 4K to 1M+ tokens in 2 years.

Counter: Even with unlimited context, structured knowledge outperforms raw context. A database query is always faster than "find this in 1M tokens." And domain-specific LEARNED knowledge doesn't exist in training data.

A well-funded competitor builds this better

CrewAI ($18M Series A) or a new startup could build the three-layer model with a real team, better DX, and marketing budget. The methodology isn't patentable.

Counter: The methodology's value is in the accumulated knowledge and configurations, not the code. First-mover advantage in knowledge accumulation IS the moat. But only if you move fast enough.

It's too complex to explain

Three layers, four types, configs, meshes, branches, CNS schemas... If you can't explain it in 60 seconds, most people won't try it. Complexity killed many good frameworks.

Counter: Kubernetes is complex too. The 60-second pitch works: "AI agents that remember everything and share knowledge across apps." Lead with outcome, not architecture.

Solo developer can't maintain 6 apps + methodology

One person building 6 consumer apps, a SaaS platform, a personal assistant, AND a methodology framework. Something will break. The question is what breaks first and whether it matters.

Counter: The three-layer model is the answer. Products are compositions — they share the same config and methodology. But the counter needs to be proven at the app quality level.

Engineer's Take

What a senior engineer reviewing this system would validate, what they'd question, and what they'd want to see next.

What They'd Validate

"The architecture choices are sound"

PostgreSQL + MCP + Neon branching is a defensible stack. Not exotic, not over-engineered at the infrastructure level. Any senior backend engineer can understand and contribute to this immediately. The hexagonal architecture in build.ai follows well-established patterns.

"The type system makes intuitive sense"

Build / Product / Operations / Domain maps cleanly to how real organizations work. An engineer hearing this for the first time would nod, not argue. The renaming from "process" to "operations" was the right call — it eliminates the genus-as-species confusion.

"The session lifecycle is well-designed"

Start session → get context → work → log decisions/learnings → end session with handoff. This is clean, stateless, and composable. It works for CLI-spawned agents, API agents, and human users. The lifecycle is the methodology's strongest implementation detail.
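The lifecycle above can be sketched as plain calls. Method names and the handoff shape are illustrative, not the real harness-os-mcp tool names.

```python
# Sketch of the session lifecycle: start with context, log as you work,
# end with a handoff the next session can start from. Names hypothetical.

class Session:
    def __init__(self, harness_context: dict):
        # start session: the agent begins with accumulated context, not blank
        self.context = harness_context
        self.decisions, self.learnings = [], []

    def log_decision(self, text): self.decisions.append(text)
    def log_learning(self, text): self.learnings.append(text)

    def end(self) -> dict:
        """End session with a stateless handoff for the next session."""
        return {
            "context": self.context,
            "decisions": self.decisions,
            "learnings": self.learnings,
        }

s = Session({"rules": ["write tests first"]})
s.log_decision("chose hexagonal ports for the payment gateway")
s.log_learning("branch reset is faster than truncate")
handoff = s.end()
print(sorted(handoff))  # ['context', 'decisions', 'learnings']
```

Because the handoff is data rather than in-process state, the same flow works for CLI agents, API agents, and humans alike.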

"The inner/outer separation is the right abstraction"

Decoupling knowledge (outer) from execution (inner) means you're not locked into any model or agent framework. This is good engineering — it's the same principle as separating data from presentation. Any engineer who's survived a framework migration would appreciate this.

What They'd Question

"Do you actively use all 37 tables?"

A senior engineer would immediately grep for which tables have recent writes. If half are empty or aspirational, the schema is over-designed. The right answer is honest: some tables are load-bearing (knowledge_chunks, rules, learnings, sessions), some are placeholders for future features (mesh_events, harness_budgets). Acknowledge which is which.
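That audit is a few lines of SQL. Sketched here with SQLite standing in so the snippet runs anywhere; on real PostgreSQL the same question is answered from the `pg_stat_user_tables` statistics view. The table names are examples, not the full 37-table schema.

```python
# Sketch of the table-usage audit: which tables actually hold rows?
# SQLite stands in for PostgreSQL; table names are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE learnings (t TEXT)")
db.execute("CREATE TABLE mesh_events (t TEXT)")  # placeholder table, no writes yet
db.execute("INSERT INTO learnings VALUES ('x')")

audit = {
    name: db.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    for (name,) in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
}
empty = sorted(t for t, n in audit.items() if n == 0)
print(audit, empty)  # {'learnings': 1, 'mesh_events': 0} ['mesh_events']
```

Publishing the `empty` list is exactly the "acknowledge which is which" answer the question demands.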

"Is this a methodology or a personal workflow?"

The hardest question. A methodology should be teachable, repeatable, and produce similar results for different practitioners. harness.os has only been used by its creator. Until someone else follows it independently and succeeds, it's technically a personal system — a very well-organized one, but still personal.

"Where are the metrics?"

Claims about "compound learning," "agents getting better," "knowledge accumulating" — engineers want graphs, not adjectives. Time-to-first-useful-output per session over time. Learning count per harness per month. Agent success rate by harness maturity. Without these, the compound claims are anecdotal.
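One of the asked-for metrics (learning count per harness per month) is a one-liner over the learnings data. A sketch over in-memory records; the record shape is hypothetical.

```python
# Sketch: learnings per harness per month, grouped from raw records.
# Field names are assumed, not the real schema.
from collections import Counter

learnings = [
    {"harness": "way2fly", "created_at": "2025-01-14"},
    {"harness": "way2fly", "created_at": "2025-01-29"},
    {"harness": "way2fly", "created_at": "2025-02-03"},
    {"harness": "build-ai", "created_at": "2025-02-11"},
]

per_month = Counter(
    (l["harness"], l["created_at"][:7])  # (harness, YYYY-MM)
    for l in learnings
)
print(dict(per_month))
```

Plotted over time, this is the graph that would replace the adjectives.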

"What's the DX like?"

Developer experience. Setting up a new harness from scratch — how long does it take? Is it documented? Are there CLI tools? Templates? Or do you need to understand the whole methodology first? The gap between "I get it conceptually" and "I can do it myself" is where frameworks die.

"What happens when MCP changes?"

MCP is young and evolving rapidly. Streamable HTTP replaced SSE. The protocol could change again. How much of the mesh depends on MCP specifics vs generic tool-calling? The answer (thin MCP layer over standard Postgres) is good, but the migration path should be clearer.

"Is three layers genuinely different from any other layered architecture?"

A skeptical architect would say: "Every mature system has principles, configs, and runtime. You've just named them." The answer needs to be sharper than "we separated them" — it needs to show what becomes POSSIBLE because of the separation that ISN'T possible without it. Configs as forkable AI strategies is the strongest example.

What They'd Want to See Next

To believe the methodology:

  • A "Getting Started" guide someone else can follow
  • A second independent user successfully creating a config + mesh
  • Documentation that doesn't require reading this entire document first
  • Clear answer to "why not just use CLAUDE.md files?"

To believe the economics:

  • 5+ cortex.ai tenants with tracked setup-time per tenant
  • Cost-per-mesh declining curve with real data
  • A dashboard showing compound learning metrics
  • Evidence that cross-mesh learning produces measurably better outcomes

To believe the architecture:

  • Table usage audit (which of the 37 tables have >0 rows)
  • Latency numbers for harness context retrieval
  • Branch scaling test (what happens with 50 Neon branches?)
  • Clear migration path if MCP evolves incompatibly

To join the team:

  • Evidence of user traction beyond the creator
  • A clear 12-month roadmap with milestones
  • Understanding of which pieces to build vs buy vs partner
  • Honest assessment of what one person can realistically ship

The Bottom Line

harness.os is architecturally sound, genuinely novel in its three-layer separation, and aligned with where the industry is heading (knowledge engineering, persistent agent learning, MCP as protocol). The four-type knowledge system is clean and practical. The core insight — outer harness matters more than inner harness — is correct and defensible.

The biggest risks are execution speed (one person, six apps), methodology portability (untested by others), and compound economics (unproven at scale). These are all solvable problems — but they need to be solved with evidence, not claims. Build the metrics pipeline. Get a second user. Ship one app to high quality rather than six to prototype quality.

The question isn't whether the ideas are good — they are. The question is whether one person can execute on them fast enough before the market catches up. The three-layer model is the answer to that question too: it's designed to make one person effective. Now prove it.