Architecture overview

Derived from the honeycomb knowledge base, captured 2026-06. Written for an external practitioner. Confirm any version-specific detail against your installed version.

#The concept

honeycomb gives your AI coding assistants one shared, lasting memory. A small program (the daemon) runs on your machine, watches what happens as you work across the assistants you already use, distills it into clean notes, and serves those notes back to any assistant that asks. The design problem has two halves: coding assistants share almost nothing at their integration layer, yet a real memory system needs a single place to run its pipeline, its knowledge graph, and its maintenance loops. honeycomb answers both: the memory logic is written once inside the daemon, and each assistant gets a thin shim that is a client of that daemon.

#The four planes

honeycomb is daemon-centric. Everything points at the daemon, and only the daemon points at storage.

flowchart TB
    subgraph surfaces[Surfaces]
        cli[CLI]
        dash[Dashboard]
        cursorExt[Cursor extension]
    end
    subgraph integrations[Assistant integrations]
        connectors[Connectors, install-time]
        hooks[Lifecycle hooks]
        mcp[MCP server]
        sdk[SDK]
    end
    subgraph runtime[honeycomb daemon, port 3850]
        capture[Capture intake]
        pipeline[Pipeline]
        retrieval[Hybrid retrieval and browse]
        ontology[Knowledge graph]
        pollinating[Maintenance loop]
        router[Model and provider router]
        workers[Skillify, summaries, codebase graph]
    end
    store[(Storage: GPU-backed SQL plus vector)]

    cli --> runtime
    dash --> runtime
    cursorExt --> integrations
    connectors --> runtime
    hooks --> runtime
    mcp --> runtime
    sdk --> runtime
    capture --> store
    pipeline --> store
    retrieval --> store

Surfaces are how a person drives honeycomb: the CLI, the local dashboard, and the Cursor extension.
Integrations are how external assistants reach it: install-time connectors, lifecycle hooks, the MCP server, and the SDK.
The daemon is the runtime where all logic lives, on port 3850 by default.
Storage is the substrate: a GPU-backed SQL and vector store where all durable state lives, isolated by organization and workspace.

#The shape of the loop

Capture, distill, recall, compound. An assistant hook captures every prompt, tool call, and response as a raw event. The daemon's pipeline distills those events into facts, entities, and skills, each with provenance back to the source. Recall serves the right context before the next turn through hybrid search and a browsable virtual filesystem. Over time, a maintenance loop and a skill miner consolidate what was learned, so memory gets sharper instead of noisier.
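The data flow of the loop can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: `RawEvent`, `Fact`, `ToyMemory`, and the "remember:" rule are hypothetical names invented for illustration, the distill step is a string rule standing in for the real pipeline, and recall is a substring match standing in for hybrid search.

```typescript
// Toy model of capture -> distill -> recall -> compound.
// Every name here is hypothetical; none of it is honeycomb's actual API.

type RawEvent = {
  id: string;
  kind: "prompt" | "tool_call" | "response";
  text: string;
};

type Fact = {
  statement: string;
  sourceEventIds: string[]; // provenance back to the raw events
};

class ToyMemory {
  private events: RawEvent[] = [];
  private facts: Fact[] = [];

  // Capture: keep every event exactly as it arrived.
  capture(event: RawEvent): void {
    this.events.push(event);
  }

  // Distill: a stand-in rule that turns "remember: X" text into a fact.
  distill(): void {
    const prefix = "remember:";
    for (const e of this.events) {
      if (!e.text.toLowerCase().startsWith(prefix)) continue;
      const statement = e.text.slice(prefix.length).trim();
      const existing = this.facts.find((f) => f.statement === statement);
      if (existing) {
        // Compound: merge a repeat into the existing fact instead of adding noise.
        if (existing.sourceEventIds.indexOf(e.id) === -1) {
          existing.sourceEventIds.push(e.id);
        }
      } else {
        this.facts.push({ statement, sourceEventIds: [e.id] });
      }
    }
  }

  // Recall: case-insensitive substring match over distilled facts.
  recall(query: string): Fact[] {
    const q = query.toLowerCase();
    return this.facts.filter((f) => f.statement.toLowerCase().includes(q));
  }
}
```

Capturing the same instruction twice and then calling `distill()` yields one fact with two source event ids, which is the "sharper instead of noisier" property in miniature.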

#The daemon as the only storage client

The single most important property of the architecture is that no process other than the daemon talks to storage. Hooks, the CLI, the SDK, and MCP tools assemble a request, hand it to the daemon over a local loopback connection, and render the response. They never open a storage connection themselves. This collapses the storage-facing surface to one process, which is where scoping, SQL construction and escaping, encryption, and schema healing all live. Adding a new assistant means writing a new thin shim, not a new engine; fixing the engine means editing the daemon, and every assistant inherits the fix.
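A thin shim of this kind might look like the sketch below. The `/recall` path, the JSON body shape, the bearer-token header, and the names `buildRecallRequest`, `recall`, and `Transport` are all assumptions made for illustration; only the loopback address and the default port 3850 come from the text. Check the actual daemon API before relying on any of it.

```typescript
// Sketch of a thin client: assemble a request, hand it to the daemon over
// loopback, return the parsed response. It never opens a storage connection.
// Endpoint path, body shape, and auth header are assumptions.

type DaemonRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Minimal transport shape so the sketch is not tied to one HTTP library.
type Transport = (
  url: string,
  init: DaemonRequest["init"],
) => Promise<{ status: number; json(): Promise<unknown> }>;

const DAEMON_PORT = 3850; // default port stated in this document

function buildRecallRequest(
  query: string,
  token: string,
  port: number = DAEMON_PORT,
): DaemonRequest {
  return {
    url: `http://127.0.0.1:${port}/recall`, // hypothetical path
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${token}`, // hypothetical auth scheme
      },
      // No organization or workspace is sent: scope is the daemon's job.
      body: JSON.stringify({ query }),
    },
  };
}

async function recall(
  transport: Transport,
  query: string,
  token: string,
): Promise<unknown> {
  const { url, init } = buildRecallRequest(query, token);
  const res = await transport(url, init);
  if (res.status !== 200) throw new Error(`daemon returned ${res.status}`);
  return res.json();
}
```

The shim carries no SQL, no scoping logic, and no storage handle, so a fix in the daemon reaches every client built this way without changing the client.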

#Surfaces to reach the daemon

A consuming assistant or application can reach the daemon through four integration surfaces, all thin clients:

| Surface | Used by | Nature |
| --- | --- | --- |
| Connectors | Install time | Patch the assistant's config, write hook handlers, link skills, register MCP. Run once, never at session time. |
| Lifecycle hooks | Every session | Map native lifecycle events to capture and recall calls on the daemon. |
| MCP server | MCP-speaking assistants | On-demand tool surface in the assistant's native tool list. |
| SDK | Applications and custom agents | A typed HTTP client (`@honeycomb/sdk`) over the daemon API. |
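To make the lifecycle-hook row concrete, here is a minimal sketch of mapping an assistant's native events to capture and recall calls. The event names (`user_prompt`, `tool_result`, `assistant_response`), the `DaemonCall` shape, and `mapHookEvent` are hypothetical; each real assistant exposes its own event names, and the daemon's actual call shapes may differ.

```typescript
// Hypothetical mapping from native lifecycle events to daemon calls.
// Event names and call shapes are illustrative assumptions.

type DaemonCall =
  | { op: "capture"; sessionId: string; kind: string; payload: string }
  | { op: "recall"; sessionId: string; query: string };

type NativeEvent = { name: string; sessionId: string; text: string };

function mapHookEvent(ev: NativeEvent): DaemonCall[] {
  switch (ev.name) {
    case "user_prompt":
      // Capture the prompt and request context before the turn runs.
      return [
        { op: "capture", sessionId: ev.sessionId, kind: "prompt", payload: ev.text },
        { op: "recall", sessionId: ev.sessionId, query: ev.text },
      ];
    case "tool_result":
      return [
        { op: "capture", sessionId: ev.sessionId, kind: "tool_call", payload: ev.text },
      ];
    case "assistant_response":
      return [
        { op: "capture", sessionId: ev.sessionId, kind: "response", payload: ev.text },
      ];
    default:
      // Unknown events produce no calls rather than a guess.
      return [];
  }
}
```

The handler only translates; it decides nothing about storage or scope, which keeps a per-assistant shim down to a small lookup like this one.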

#Contracts that keep the planes apart

Three contracts hold the system together:

One storage client. The daemon is the only process with a storage handle. A compromised or buggy client cannot reach storage directly or cross a tenant boundary, because the daemon re-derives scope from the validated token on every request.
One active runtime path per session. A session can be reachable through more than one integration surface. The first path to touch a session claims it; a request from any other path on that session is rejected with HTTP 409 (Conflict). Stale claims expire and are swept, so a crashed assistant never locks a session forever.
Three-level tenancy. Organization, then workspace, then project. Organization and workspace isolation is enforced at the storage layer, so two workspaces never share a row, partition, or index. Project is the soft inner ring that scopes recall to the repository the agent is working in without ever dropping a capture. Within a workspace, an `agent_id` and a visibility setting separate agents.
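The session-claim contract above can be sketched as a small in-memory table. This is an assumption-laden illustration, not the daemon's implementation: the `SessionClaims` class, the `Path` names, the TTL value, and the choice to refresh a claim on every touch are all hypothetical.

```typescript
// In-memory sketch of "one active runtime path per session".
// Class name, path names, and TTL semantics are assumptions.

type Path = "hooks" | "mcp" | "sdk";
type Claim = { path: Path; lastSeenMs: number };

class SessionClaims {
  private claims = new Map<string, Claim>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns 200 if this path holds (or takes) the claim, 409 otherwise.
  touch(sessionId: string, path: Path, nowMs: number): 200 | 409 {
    const current = this.claims.get(sessionId);
    const stale =
      current !== undefined && nowMs - current.lastSeenMs > this.ttlMs;
    if (current === undefined || stale || current.path === path) {
      // First touch, expired claim, or the owner refreshing its claim.
      this.claims.set(sessionId, { path, lastSeenMs: nowMs });
      return 200;
    }
    return 409;
  }

  // Remove expired claims so a crashed assistant cannot hold a session forever.
  sweep(nowMs: number): number {
    let removed = 0;
    this.claims.forEach((claim, id) => {
      if (nowMs - claim.lastSeenMs > this.ttlMs) {
        this.claims.delete(id);
        removed++;
      }
    });
    return removed;
  }
}
```

With a TTL of 1000 ms, a hooks path that touches a session at t=0 blocks an MCP request at t=500 with 409, while the same MCP request succeeds once more than 1000 ms have passed since the last hooks touch.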

#Getting in

Onboarding is one command. The installer detects and sets up a Node runtime, installs the global package, brings the daemon up, and lands you on the dashboard, with sign-in driven from the dashboard rather than the terminal. The daemon does not need credentials to boot; it serves a guided-setup state until you sign in, then serves the authenticated views on the next request with no restart. See the getting started guide and the CLI reference.
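The "no restart" behavior follows naturally if the view is chosen per request from the current sign-in state rather than once at boot. A minimal sketch of that idea, with hypothetical names (`DaemonState`, `viewForRequest`) and a plain token standing in for whatever credentials the daemon actually stores:

```typescript
// Sketch: pick the view on every request from current sign-in state,
// so signing in takes effect on the next request without a restart.
// All names are hypothetical.

type View = "guided-setup" | "authenticated";

class DaemonState {
  // The daemon can boot with no credentials at all.
  private token: string | null = null;

  signIn(token: string): void {
    this.token = token;
  }

  // Evaluated per request, not cached at startup.
  viewForRequest(): View {
    return this.token === null ? "guided-setup" : "authenticated";
  }
}
```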

#Where to read next

Capture and memory: how a raw event becomes a structured fact.
Recall and retrieval: how the right context is found and shaped.
The knowledge graph: the ontology and the codebase graph.
Harness integrations: how honeycomb plugs underneath six assistants.
Data and storage: the table catalog and the storage patterns.
Security model: trust boundaries, scoping, secrets, and telemetry.