Architecture
Loom is single-binary by design. The web UI, the API gateway, the MCP runtime, and the CLI all live in one repository and build into one process.
            ┌────────────────────────────────────────┐
            │  Browser (React)                       │
            │  ┌──────────┐  ┌───────────────────┐   │
            │  │ Admin UI │  │ Playground (chat, │   │
            │  │ /logs    │  │ embedding, …)     │   │
            │  └────┬─────┘  └──────┬────────────┘   │
            └───────┼───────────────┼────────────────┘
                    │ TanStack Query│
            ┌───────▼───────────────▼────────────────┐
            │  Next.js Route Handlers                │
            │  (one process; node runtime)           │
            │                                        │
            │  /api/v1/*   OpenAI-compatible gateway │
            │  /api/*      CRUD + Playground backend │
            │  /api/mcp/*  MCP server registry       │
            └───────────────┬────────────────────────┘
                            │
      ┌─────────────────────┼─────────────────────┐
      │                     │                     │
      ▼                     ▼                     ▼
┌────────────┐        ┌────────────┐        ┌────────────┐
│ SQLite     │        │ Upstream   │        │ MCP        │
│ (WAL mode) │        │ providers  │        │ servers    │
└────────────┘        └────────────┘        └────────────┘
 conversations         OpenAI / Azure /      stdio / HTTP
 messages              Foundry / vLLM / …    transports
 generation_logs
 users / api_keys
 providers / models
 mcp_servers
Stack
- Next.js 16 App Router + React 19 + TypeScript strict for the UI and API
- SQLite (WAL) via Drizzle ORM + better-sqlite3 for storage; migrations run automatically on boot
- citty for the CLI subcommand router
- @clack/prompts for the interactive loom init wizard
- TanStack Query + Zustand for client state
- shadcn/ui + Tailwind for the visual layer
There is no separate backend service. There is no Redis. There is no Postgres. State lives in one SQLite file; copy that file to move your installation.
Layering
| Layer | Lives in | Responsibility |
|---|---|---|
| HTTP / SSE | app/api/**/route.ts | OpenAI-compatible gateway, admin CRUD, Playground backend |
| Services | lib/server/<domain>/ | Business logic, one folder per domain |
| Adapters | lib/server/adapters/ | Per-protocol-variant URL / header / field rules |
| Capabilities | lib/server/capabilities/ | One file per modality (chat, embed, rerank, image, …) |
| Schemas | lib/schemas/*.ts | Zod wire types (single source of truth) |
| MCP runtime | lib/server/mcp/ | Client pool, tool routing, transport lifecycle |
| Storage | lib/server/db/ | SQLite + WAL + auto-migrations on boot |
| CLI | bin/loom.ts + lib/cli/ | Entry shim + citty subcommand tree |
| Web UI | app/(dashboard)/** + components/** | shadcn/ui + TanStack Query |
Adapter layer
The gateway core never branches on provider type. Instead, each upstream protocol variant is one adapter:
registerAdapter({
  id: "azure-foundry",
  // Claim providers whose base URL contains "foundry".
  matches: (provider) => provider.base_url.includes("foundry"),
  // Map each modality to its upstream path.
  selectUpstreamApi: (modality) =>
    modality === "chat" ? "/v1/chat/completions" : "/v1/...",
  // Drop fields this upstream does not accept.
  transformRequest: (body, modality) => stripUnsupportedFields(body),
  acceptedFields: { /* whitelist */ },
});
A new protocol = one new file in lib/server/adapters/<id>.ts + one line of registration. The gateway, capability handlers, and admin UI dropdown all pick it up automatically.
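To make the registry idea concrete, here is a minimal sketch of how such a lookup could work. Only `registerAdapter` and its five field names come from the snippet above; the `Adapter` and `Provider` types, `resolveAdapter`, `pickFields`, the per-modality shape of `acceptedFields`, and the `example-openai` adapter are assumptions made for illustration and may not match Loom's actual code.

```typescript
// Hypothetical shapes: only registerAdapter and its field names come from the docs.
type Modality = "chat" | "embed" | "rerank" | "image" | "speech" | "transcribe";
type Body = Record<string, unknown>;

interface Provider {
  base_url: string;
}

interface Adapter {
  id: string;
  matches: (provider: Provider) => boolean;
  selectUpstreamApi: (modality: Modality) => string;
  transformRequest: (body: Body, modality: Modality) => Body;
  // Assumed shape: a per-modality list of field names the upstream accepts.
  acceptedFields: Partial<Record<Modality, readonly string[]>>;
}

const adapters: Adapter[] = [];

function registerAdapter(adapter: Adapter): void {
  if (adapters.some((a) => a.id === adapter.id)) {
    throw new Error(`adapter "${adapter.id}" is already registered`);
  }
  adapters.push(adapter);
}

// The gateway core asks the registry instead of branching on provider type.
function resolveAdapter(provider: Provider): Adapter {
  const adapter = adapters.find((a) => a.matches(provider));
  if (!adapter) throw new Error(`no adapter matches ${provider.base_url}`);
  return adapter;
}

// Keep only whitelisted fields; everything else is dropped before forwarding.
function pickFields(body: Body, allowed: readonly string[]): Body {
  const out: Body = {};
  for (const key of Object.keys(body)) {
    if (allowed.includes(key)) out[key] = body[key];
  }
  return out;
}

// Hypothetical adapter used only for this example.
const exampleAdapter: Adapter = {
  id: "example-openai",
  matches: (provider) => provider.base_url.includes("example.test"),
  selectUpstreamApi: (modality) => {
    if (modality === "chat") return "/v1/chat/completions";
    if (modality === "embed") return "/v1/embeddings";
    throw new Error(`modality "${modality}" is not supported by this adapter`);
  },
  transformRequest: (body, modality) =>
    pickFields(body, exampleAdapter.acceptedFields[modality] ?? []),
  acceptedFields: {
    chat: ["model", "messages", "stream", "temperature"],
    embed: ["model", "input"],
  },
};

registerAdapter(exampleAdapter);
```

With this shape, the caller only ever does `resolveAdapter(provider)` followed by `selectUpstreamApi` and `transformRequest`, which is why a new protocol variant would not require touching the gateway core.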
Capability layer
Each modality (chat / embed / rerank / image / speech / transcribe) is one file in lib/server/capabilities/. Adding a new modality is:
- One new file with registerCapability(...)
- One Route Handler that calls forwardGeneration(user, "<id>", body)
The gateway's core forwarding logic never changes.
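The two steps above can be sketched as follows. The names `registerCapability` and `forwardGeneration(user, "<id>", body)` come from the text; everything else (the `Capability` interface with a `parse` hook, the `ForwardPlan` return value, and the rerank field names) is a hypothetical illustration. To stay self-contained, this sketch stops before the upstream call and returns the prepared request instead of sending it.

```typescript
type Body = Record<string, unknown>;

interface User {
  id: string;
}

// Assumed capability shape: an id plus a hook that validates and normalizes the body.
interface Capability {
  id: string;
  parse: (body: unknown) => Body;
}

// What this sketch returns in place of performing the real upstream request.
interface ForwardPlan {
  userId: string;
  capability: string;
  payload: Body;
}

const capabilities = new Map<string, Capability>();

function registerCapability(capability: Capability): void {
  if (capabilities.has(capability.id)) {
    throw new Error(`capability "${capability.id}" is already registered`);
  }
  capabilities.set(capability.id, capability);
}

// Generic forwarding: look up the capability, let it shape the payload, hand off.
function forwardGeneration(user: User, capabilityId: string, body: unknown): ForwardPlan {
  const capability = capabilities.get(capabilityId);
  if (!capability) throw new Error(`unknown capability "${capabilityId}"`);
  return { userId: user.id, capability: capability.id, payload: capability.parse(body) };
}

// Step 1: the one new file for a modality (hypothetical rerank fields).
registerCapability({
  id: "rerank",
  parse: (body) => {
    if (typeof body !== "object" || body === null) {
      throw new Error("rerank: body must be an object");
    }
    const { query, documents } = body as { query?: unknown; documents?: unknown };
    if (typeof query !== "string" || !Array.isArray(documents)) {
      throw new Error("rerank: `query` (string) and `documents` (array) are required");
    }
    return { query, documents };
  },
});
```

Step 2 would then be a Route Handler whose body is little more than `forwardGeneration(user, "rerank", await request.json())`, which is why the forwarding function itself does not need to change per modality.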
MCP runtime
lib/server/mcp/runtime.ts owns one process-wide pool of MCP clients. On first use of a server, the runtime spawns / connects the transport, caches the discovered tool / resource / prompt catalog, and registers eviction handlers (transport close → drop pool entry, SIGINT/SIGTERM → graceful shutdown of every stdio child).
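A lazily populated pool with close-driven eviction could look roughly like the sketch below. The `McpClient` interface, `McpPool` class, and `Connect` callback are hypothetical stand-ins rather than Loom's actual types or any SDK's API; the sketch only illustrates connect-on-first-use, catalog caching, and eviction.

```typescript
// Hypothetical minimal client surface; a real MCP client exposes much more.
interface McpClient {
  listTools(): Promise<string[]>;
  onClose(handler: () => void): void;
  close(): Promise<void>;
}

interface PoolEntry {
  client: McpClient;
  tools: string[]; // cached catalog (tools only, for brevity)
}

type Connect = (serverId: string) => Promise<McpClient>;

class McpPool {
  private readonly entries = new Map<string, Promise<PoolEntry>>();
  private readonly connect: Connect;

  constructor(connect: Connect) {
    this.connect = connect;
  }

  // First call connects and caches; concurrent and later calls share one promise.
  get(serverId: string): Promise<PoolEntry> {
    let entry = this.entries.get(serverId);
    if (!entry) {
      entry = this.open(serverId);
      this.entries.set(serverId, entry);
      // A failed connect must not stay cached, so the next call can retry.
      entry.catch(() => this.entries.delete(serverId));
    }
    return entry;
  }

  private async open(serverId: string): Promise<PoolEntry> {
    const client = await this.connect(serverId);
    // Eviction: when the transport closes, drop the pool entry.
    client.onClose(() => this.entries.delete(serverId));
    const tools = await client.listTools();
    return { client, tools };
  }

  // Intended to be called from SIGINT/SIGTERM handlers for a graceful shutdown.
  async shutdown(): Promise<void> {
    const pending = Array.from(this.entries.values());
    this.entries.clear();
    await Promise.allSettled(pending.map(async (p) => (await p).client.close()));
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Caching the promise rather than the resolved entry means two requests that hit a cold server at the same time trigger a single spawn or connect, not two.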
Tool dispatch is a registry lookup: <server-prefix>__<tool> → MCP server → tools/call.
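As an illustration of that lookup, the sketch below splits a qualified name and builds the `tools/call` request. `parseToolName`, `buildToolCall`, and the set of known servers are hypothetical names, and splitting at the first `__` is an assumption (it implies server prefixes never contain `__`, while tool names may). The `params.name` / `params.arguments` shape follows the MCP `tools/call` request.

```typescript
interface ToolTarget {
  server: string;
  tool: string;
}

interface ToolCallRequest {
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

const SEPARATOR = "__";

// Split "<server-prefix>__<tool>" at the first separator (assumed convention).
function parseToolName(qualified: string): ToolTarget {
  const at = qualified.indexOf(SEPARATOR);
  if (at <= 0 || at + SEPARATOR.length >= qualified.length) {
    throw new Error(`expected "<server-prefix>__<tool>", got "${qualified}"`);
  }
  return {
    server: qualified.slice(0, at),
    tool: qualified.slice(at + SEPARATOR.length),
  };
}

// Registry lookup followed by construction of the request to send to that server.
function buildToolCall(
  knownServers: ReadonlySet<string>,
  qualified: string,
  args: Record<string, unknown>,
): { server: string; request: ToolCallRequest } {
  const { server, tool } = parseToolName(qualified);
  if (!knownServers.has(server)) {
    throw new Error(`unknown MCP server prefix "${server}"`);
  }
  return {
    server,
    request: { method: "tools/call", params: { name: tool, arguments: args } },
  };
}
```

For example, `buildToolCall(new Set(["github"]), "github__create_issue", { title: "Bug" })` routes to the `github` server with the unprefixed tool name `create_issue`.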
Streaming pipeline
For streaming generations (/api/v1/chat/completions?stream=true):
upstream SSE bytes ─┬─▶ tee to client (verbatim, byte-for-byte)
                    └─▶ tee to log writer
                          - assemble full response from chunks
                          - measure TTFT (first byte)
                          - measure total latency (last byte)
                          - persist on stream close
The log writer never blocks the client tee — if logging fails, the response still streams correctly.
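One way to get this behavior is the Web Streams `ReadableStream.tee()` method, sketched below. The function name `teeWithLogging`, the `GenerationLog` shape, and the injected `persist` and `now` callbacks are hypothetical; this is a sketch of the pattern, not Loom's log writer. It also stores the raw SSE text rather than parsing individual events.

```typescript
interface GenerationLog {
  body: string;           // raw SSE text, reassembled from all chunks
  ttftMs: number | null;  // request start to first byte
  totalMs: number | null; // request start to last byte
}

function teeWithLogging(
  upstream: ReadableStream<Uint8Array>,
  startedAt: number,
  persist: (log: GenerationLog) => void | Promise<void>,
  now: () => number = () => Date.now(),
): { client: ReadableStream<Uint8Array>; logged: Promise<void> } {
  // Two independent branches: one goes to the client untouched, one to the logger.
  const [client, logBranch] = upstream.tee();

  const logged = (async () => {
    try {
      const decoder = new TextDecoder();
      const reader = logBranch.getReader();
      let body = "";
      let firstByteAt: number | null = null;
      let lastByteAt: number | null = null;
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        lastByteAt = now();
        if (firstByteAt === null) firstByteAt = lastByteAt;
        body += decoder.decode(value, { stream: true });
      }
      body += decoder.decode(); // flush any buffered partial character
      await persist({
        body,
        ttftMs: firstByteAt === null ? null : firstByteAt - startedAt,
        totalMs: lastByteAt === null ? null : lastByteAt - startedAt,
      });
    } catch {
      // Logging failures are swallowed so they can never affect the client branch.
    }
  })();

  return { client, logged };
}
```

The `client` branch can be returned directly as the response body. Because the logging branch is consumed in its own async task and its errors are caught, a slow or failing `persist` delays only the log row, at the cost of buffering unread chunks in memory.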
Single-process trade-off
Loom keeps the playground, the gateway, the MCP runtime, and the request log in one Node process backed by one SQLite file. This is a deliberate trade-off:
- Operational simplicity: no separate observability service, no Redis, no Postgres, no worker fleet. Back up data/loom.db and you back up the entire installation.
- Auth consistency: the same API keys apply across the gateway and the playground; the same log row records both.
- Scale ceiling: single-process means single-host. Loom targets teams and self-hosted deployments rather than multi-region SaaS workloads. When you outgrow it, your gateway-side traffic is already speaking the OpenAI protocol, so swapping in a horizontal proxy in front is a config change, not a rewrite.