figma-connect

A local Figma MCP server, and a pixel-level oracle that tells an AI agent whether the code it wrote actually matches the design

Role: Solo, design & build

Type: Developer tool / MCP server

Stack: TypeScript · MCP · Playwright · SQLite

Shape: 4-package pnpm monorepo


01 Why I built it

I build a lot of UI from Figma files, often with an AI agent doing the first pass. The agent reads the design, writes the markup, and it looks roughly right. "Roughly right" is the problem. The spacing is four pixels off, a font weight is wrong, a gradient stop drifted, and nobody notices until it's in review, because there's nothing checking the output against the source. The existing Figma tooling hands you JSON about a node. None of it tells you whether what you shipped looks like what the designer drew.

So I built two things into one tool. First, full-fidelity read access to a live Figma file from inside an AI coding agent: geometry, auto-layout, fills, strokes, effects, type, tokens, components. Second, and the part that actually matters: a verification oracle. It takes the code the agent generated, renders it in a real browser, and pixel-diffs it against the live Figma node. The agent stops guessing whether it got the design right and gets a pass or fail with the diff attached.

It runs entirely on my machine. It works with Figma in the browser: no Desktop app, no cloud REST API, and no design data leaving the laptop. A plugin runs inside Figma, a local bridge does the indexing and the talking, and the whole thing speaks MCP so any compatible agent can drive it.

35 MCP tools

~15k LOC TypeScript

4 workspace packages

3 diff channels per check

02 How it fits together

Four packages in a pnpm workspace, each with one job. The plugin lives in Figma and is the only thing that can read the document. The bridge is the brain: it runs as a background daemon that holds the cache, runs the search, and exposes the tools. Claude Code talks to it through a small MCP shim over stdio, so a file stays indexed across sessions instead of being re-walked every time the agent restarts. The harness is the renderer and judge. A shared package carries the protocol types so the plugin and bridge can't drift out of sync.

Figma (browser) + plugin

@figma-connect/plugin · reads the live document

WebSocket RPC · 127.0.0.1:9911

Bridge daemon: index, cache, serve

@figma-connect/bridge · runs detached · SQLite + FTS5 search · delta-sync on edits

MCP over stdio · via shim

AI coding agent

queries the design, writes code, then asks the bridge to verify it

code + target node → render & diff

Harness: render & judge

@figma-connect/harness · headless Chromium · pixel + SSIM + a11y diff
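One way a shared package can keep the plugin and bridge from drifting is a single discriminated union of messages that both sides import, plus a runtime guard for whatever arrives over the socket. The sketch below is illustrative only: the message kinds (`result`, `error`, `node_changed`) and field names are hypothetical, not the project's actual protocol.

```typescript
// Hypothetical shared protocol types; the kinds and fields are illustrative.
type PluginMessage =
  | { kind: "result"; id: number; payload: unknown }
  | { kind: "error"; id: number; message: string }
  | { kind: "node_changed"; nodeIds: string[] };

// Runtime guard for untrusted JSON arriving over the WebSocket.
function parsePluginMessage(raw: string): PluginMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const msg = value as Record<string, unknown>;
  const id = msg.id;
  const message = msg.message;
  const nodeIds = msg.nodeIds;
  switch (msg.kind) {
    case "result":
      return typeof id === "number" ? { kind: "result", id, payload: msg.payload } : null;
    case "error":
      return typeof id === "number" && typeof message === "string"
        ? { kind: "error", id, message }
        : null;
    case "node_changed":
      return Array.isArray(nodeIds) && nodeIds.every((n) => typeof n === "string")
        ? { kind: "node_changed", nodeIds: nodeIds as string[] }
        : null;
    default:
      return null; // unknown kind: reject rather than guess
  }
}

// Exhaustive handler: adding a new kind to the union fails to compile here
// until both sides handle it.
function describeMessage(msg: PluginMessage): string {
  switch (msg.kind) {
    case "result":
      return `result for request ${msg.id}`;
    case "error":
      return `error for request ${msg.id}: ${msg.message}`;
    case "node_changed":
      return `${msg.nodeIds.length} node(s) changed`;
    default: {
      const unreachable: never = msg;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is what turns a forgotten message kind into a compile error instead of a silent mismatch at runtime.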

Security is layered, and I'm honest in the code about where the line sits. The bridge binds to 127.0.0.1 only. Nothing it does is reachable from the network, and that's the real gate. On top of that it checks the WebSocket origin against an allowlist: Figma's web origin, plus the empty/null origin that sandboxed plugin iframes send. Allowing null is a deliberate, commented tradeoff: the loopback binding is what actually keeps a stray page out, so the origin check is defence-in-depth, not the sole lock.
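The origin check described above can be sketched as a small pure function. This is a minimal illustration, assuming the allowlist holds `https://www.figma.com` and that a missing header, an empty string, and the literal string `"null"` all count as the sandboxed-iframe case; the constant and function names are hypothetical.

```typescript
// Assumed allowlist: Figma's web origin. The real list may differ.
const ALLOWED_ORIGINS: ReadonlySet<string> = new Set(["https://www.figma.com"]);

// The loopback binding is the real gate; the origin check only adds depth.
const LOOPBACK_HOST = "127.0.0.1";

function isOriginAllowed(origin: string | undefined): boolean {
  // Sandboxed plugin iframes send no Origin header, an empty one, or "null".
  // Accepting these is the deliberate tradeoff: loopback keeps stray pages out.
  if (origin === undefined || origin === "" || origin === "null") return true;
  return ALLOWED_ORIGINS.has(origin);
}
```

A check like this would run in the WebSocket upgrade path, with the server itself listening on `LOOPBACK_HOST` so that the function is never the only thing standing between the bridge and the network.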

03 verify_node: the part I care about

This is the reason the project exists. verify_node takes a node ID and the candidate code, mounts the code in headless Chromium with Playwright, exports the matching node from Figma, and compares them across three channels at once:

Pixel diff

pixelmatch over the aligned bitmaps, catching the four-pixel shifts, the wrong radius, the gradient that drifted. The blunt instrument, and the one that finds the most.

Structural similarity

SSIM, because a raw pixel count over-punishes a one-pixel global offset and under-punishes a small but wrong region. SSIM weights perceived structure, so the score tracks what a human would call "close."

Text & a11y oracle

axe-core plus a text check, so a render that looks fine but dropped a label, mangled casing, or lost a heading still fails. Looking right isn't the same as being right.

A diff you can read

Every run returns a labeled EXPECTED | ACTUAL | DIFF image. The agent doesn't get a bare number it can rationalize away. It gets the picture of exactly what's off.
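To show how three channels can fold into one pass or fail, here is a minimal sketch of a verdict function. The result shape, field names, and threshold values are all assumptions made for illustration, not the tool's actual defaults.

```typescript
// Hypothetical per-run scores, one group per channel.
interface ChannelScores {
  mismatchedPixels: number; // pixel diff: count of differing pixels
  totalPixels: number;
  ssim: number; // structural similarity, 0..1, higher is closer
  missingText: string[]; // strings in the design that the render lacks
  a11yViolations: number; // count reported by the accessibility pass
}

interface Thresholds {
  maxPixelRatio: number;
  minSsim: number;
}

interface Verdict {
  pass: boolean;
  failures: string[]; // one human-readable line per failing channel
}

// Illustrative thresholds only.
const DEFAULT_THRESHOLDS: Thresholds = { maxPixelRatio: 0.01, minSsim: 0.98 };

function judge(s: ChannelScores, t: Thresholds = DEFAULT_THRESHOLDS): Verdict {
  const failures: string[] = [];
  // An empty bitmap is treated as a total mismatch rather than a free pass.
  const ratio = s.totalPixels > 0 ? s.mismatchedPixels / s.totalPixels : 1;
  if (ratio > t.maxPixelRatio) failures.push(`pixel: ${(ratio * 100).toFixed(2)}% differ`);
  if (s.ssim < t.minSsim) failures.push(`ssim: ${s.ssim.toFixed(3)} below ${t.minSsim}`);
  if (s.missingText.length > 0) failures.push(`text: missing ${s.missingText.join(", ")}`);
  if (s.a11yViolations > 0) failures.push(`a11y: ${s.a11yViolations} violation(s)`);
  return { pass: failures.length === 0, failures };
}
```

Every channel has to clear its own bar, so a render that is pixel-perfect but dropped a label still fails, and the failure list says which channel caught it.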

The point is to turn "looks right" into a gate. Before this, an agent reproducing a design would declare victory and move on. Now there's a number and an image standing between "done" and done, and the agent has to actually pass it. re_verify re-runs a check after the cache has caught a design edit, and reports whether it had to resync, so a stale pass can't masquerade as a fresh one.
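The resync-aware re-run can be pictured as a version comparison in front of the check. This is a sketch under stated assumptions: the per-node `version` counter, the `resync` callback, and the result fields are hypothetical stand-ins for however the bridge actually tracks edits.

```typescript
// Hypothetical cache entry: a node plus the design version it was indexed at.
interface CachedNode {
  nodeId: string;
  version: number;
}

interface ReVerifyResult {
  resynced: boolean; // true when the cached node was stale and had to be refreshed
  pass: boolean;
  checkedVersion: number; // the version the verdict actually applies to
}

function reVerify(
  cached: CachedNode,
  liveVersion: number,
  resync: (nodeId: string) => CachedNode,
  check: (node: CachedNode) => boolean,
): ReVerifyResult {
  const stale = cached.version !== liveVersion;
  // Refresh first, so the check never runs against an outdated design.
  const node = stale ? resync(cached.nodeId) : cached;
  return { resynced: stale, pass: check(node), checkedVersion: node.version };
}
```

Reporting `resynced` and `checkedVersion` alongside the verdict is what lets a caller tell a fresh pass from one that predates the latest design edit.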

04 Indexing a Figma file that's too big to read

A real Figma file is enormous. You can't shove it into a model's context and you can't re-walk it on every question. So the bridge indexes the file once into a local SQLite database (nodes, images, styles, semantics) and answers from there. Search is backed by an FTS5 virtual table, which makes lexical lookups land in well under 50ms even on a large file.

Two honest constraints fall out of that design, and I document both rather than pretend they don't exist. The search is lexical, not semantic: it matches tokens that literally appear in a node's name, text, or styles, so a concept query won't surface a generically-named layer. And the digest the agent reads is a budgeted view of the node: it can't carry every property at full precision without blowing the context window, so I made deliberate calls about what fidelity to keep and what to flag as lossy. When a design changes, a delta-sync pass invalidates only the nodes that moved instead of rebuilding the whole cache.
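The lexical constraint is easy to see with a toy matcher. The sketch below is a pure TypeScript stand-in, not FTS5 itself: it requires every query token to appear literally in a node's name or text. The node shape and sample data are made up for illustration.

```typescript
interface IndexedNode {
  id: string;
  name: string;
  text: string;
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(s: string): string[] {
  return s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Returns the nodes that literally contain every query token.
function lexicalSearch(nodes: IndexedNode[], query: string): IndexedNode[] {
  const wanted = tokenize(query);
  if (wanted.length === 0) return [];
  return nodes.filter((n) => {
    const have = new Set(tokenize(`${n.name} ${n.text}`));
    return wanted.every((t) => have.has(t));
  });
}

const sampleNodes: IndexedNode[] = [
  { id: "1:2", name: "Frame 42", text: "Get started" },
  { id: "1:3", name: "Primary Button", text: "Sign up" },
];

const byName = lexicalSearch(sampleNodes, "button"); // finds 1:3 via its layer name
const byConcept = lexicalSearch(sampleNodes, "call to action"); // finds nothing
```

"Frame 42" is, in effect, the call to action here, but no token in the query appears in its name or text, so a concept query walks straight past it.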

-- the bridge's cache, in one glance
files            -- one row per indexed Figma file
nodes            -- the node tree + serialized IR
images           -- exported fills / screenshots, content-addressed
node_semantics   -- derived role/relationship hints
change_log       -- what moved, for delta-sync
fts_nodes        -- FTS5 virtual table → sub-50ms lexical search

SQLite schema (better-sqlite3). One index pass, then everything answers locally.
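Delta-sync against a change log can be sketched with an in-memory cache standing in for the database. Everything here is an assumption for illustration: the `seq` counter, the entry shape, and the `Map` in place of the `nodes` table.

```typescript
// Hypothetical change_log row: which node moved, and in what order.
interface ChangeLogEntry {
  nodeId: string;
  seq: number; // monotonically increasing sequence number
}

interface DeltaResult {
  invalidated: string[]; // cached nodes that were dropped
  lastSeq: number; // new high-water mark to sync from next time
}

// Drops only the cached nodes that changed after `sinceSeq`.
function applyDelta<T>(cache: Map<string, T>, log: ChangeLogEntry[], sinceSeq: number): DeltaResult {
  const invalidated: string[] = [];
  let lastSeq = sinceSeq;
  for (const entry of log) {
    if (entry.seq <= sinceSeq) continue; // already applied in an earlier pass
    if (cache.delete(entry.nodeId)) invalidated.push(entry.nodeId);
    if (entry.seq > lastSeq) lastSeq = entry.seq;
  }
  return { invalidated, lastSeq };
}
```

Untouched nodes stay cached, so the cost of an edit scales with what changed rather than with the size of the file.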

05 The problems that ate the most time

The headline features were the easy part. The work that actually mattered was making the render-and-diff loop trustworthy, because a verification tool that fails for the wrong reasons is worse than none. It teaches you to ignore it.

Fonts not ready at capture

Screenshotting before web fonts finished loading produced false diffs against the design. The fix was gating capture on font-ready, so text is measured as rendered, not as it flashed.

networkidle that never idles

Waiting on network-idle hung on pages with long-poll or streaming connections. I replaced the blanket wait with explicit readiness signals so verification can't stall forever.
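Font gating and bounded waits can be combined into one helper: a list of explicit readiness expressions, each raced against a timeout. This is a sketch, not the harness's code. `PageLike` is a hypothetical minimal interface (a Playwright page's `evaluate` accepts a string expression and awaits a returned promise, so it should fit), and the default timeout is arbitrary.

```typescript
// Hypothetical minimal page interface for this sketch.
interface PageLike {
  evaluate(expression: string): Promise<unknown>;
}

// Rejects if `work` has not settled within `ms`, so a wait can never hang forever.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// document.fonts.ready resolves once the page's fonts have finished loading.
const FONT_READY = "document.fonts.ready.then(() => true)";

// Waits for each explicit readiness signal in turn instead of for network idle.
async function waitUntilReady(
  page: PageLike,
  timeoutMs = 5000,
  signals: string[] = [FONT_READY],
): Promise<void> {
  for (const signal of signals) {
    await withTimeout(page.evaluate(signal), timeoutMs, signal);
  }
}
```

Because each signal carries its own deadline, a page holding a long-poll or streaming connection open fails fast with a named timeout instead of stalling the run.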

Browser-pool concurrency

Running many verifications in parallel needs pooled Chromium contexts that don't step on each other or leak. Getting the pooling right is what makes batch verification usable instead of flaky.
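The pooling idea can be shown with a small generic pool that hands each task an exclusive item and always takes it back. This is a sketch of the technique, not the harness's implementation; in the harness the pooled item would presumably be a Chromium browser context, which is an assumption here.

```typescript
// Generic bounded pool: at most `items.length` tasks run at once.
class Pool<T> {
  private idle: T[];
  private waiters: Array<(item: T) => void> = [];

  constructor(items: T[]) {
    this.idle = items.slice();
  }

  private acquire(): Promise<T> {
    const item = this.idle.pop();
    if (item !== undefined) return Promise.resolve(item);
    // Nothing free: queue up until a running task releases its item.
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }

  private release(item: T): void {
    const next = this.waiters.shift();
    if (next) next(item); // hand the item straight to the longest waiter
    else this.idle.push(item);
  }

  // Runs `task` with an exclusive item and returns it afterwards, even on failure.
  async use<R>(task: (item: T) => Promise<R>): Promise<R> {
    const item = await this.acquire();
    try {
      return await task(item);
    } finally {
      this.release(item);
    }
  }
}
```

The `finally` block is the part that prevents leaks: a verification that throws still gives its item back. A real browser pool would also need to reset state between uses, such as cookies, storage, and open pages, which this sketch leaves out.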

Fidelity vs. token budget

Every property the digest carries costs context. I added explicit fidelity flags for gradients, shadows, strokes, opacity and masks so the agent knows when a value is exact and when it's approximate.
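A fidelity flag can be as simple as a field that records whether the digest dropped anything. The sketch below applies the idea to gradients only. The stop budget, the even-sampling strategy, and all type names are assumptions for illustration, not the project's actual rules.

```typescript
type Fidelity = "exact" | "approximate";

interface GradientStop {
  position: number; // 0..1 along the gradient
  color: string;
}

interface GradientDigest {
  stops: GradientStop[];
  fidelity: Fidelity;
  droppedStops: number; // how many stops the budget cost
}

// Keeps at most `maxStops` stops, always including the first and last,
// and flags the digest as approximate whenever something was dropped.
function digestGradient(stops: GradientStop[], maxStops = 4): GradientDigest {
  if (maxStops < 2) throw new Error("maxStops must be at least 2");
  if (stops.length <= maxStops) {
    return { stops: stops.slice(), fidelity: "exact", droppedStops: 0 };
  }
  const kept: GradientStop[] = [];
  for (let i = 0; i < maxStops; i++) {
    // Sample evenly across the original stops, endpoints included.
    const index = Math.round((i * (stops.length - 1)) / (maxStops - 1));
    kept.push(stops[index]);
  }
  return { stops: kept, fidelity: "approximate", droppedStops: stops.length - kept.length };
}
```

An agent reading `fidelity: "approximate"` knows the digest is a summary and can fetch the full node before hardcoding a value it only half saw.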

Most of the recent history on the project is exactly this kind of work: reliability fixes, additive database migrations, and a limitations document I keep current on purpose. I'd rather ship a tool that says "I can't represent this accurately yet" than one that quietly lies.

06 The tool surface

35 MCP tools, grouped by what they're for. Most are read-only by design: the server inspects a design, it doesn't mutate it.

Read & navigate

Document overview, node IR at any depth, outline, fetch-by-URL, current selection, and two ways to search: exact name lookup and FTS5-backed relevance.

Styles & tokens

Resolved styles for a node, the document's design tokens, and a token map so generated code can reference variables instead of hardcoded values.

Components

Component map, instance-to-main resolution, variant sets, and usage lookups, so the agent reuses the system instead of rebuilding primitives.

Assets & screenshots

Image fills and node screenshots, in-memory or written straight to disk for the render step.

Workspace & coverage

prepare_workspace to index a file, page_manifest for a deterministic build work-list, check_coverage, and list_changes_since for iterative sessions.

Verify

verify_node and re_verify: the render-and-diff oracle and its resync-aware re-run.
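For orientation, this is roughly what a call to one of these tools looks like on the wire. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`; the argument names below (`nodeId`, `code`) and the sample values are hypothetical, since the real input schema is whatever the tool declares.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

function toolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical arguments; the tool's declared schema is the source of truth.
const verifyRequest = toolCall("verify_node", {
  nodeId: "12:345",
  code: "<button>Sign up</button>",
});
```

In practice the agent's MCP client builds this envelope itself; the sketch only shows what crosses the stdio pipe between the shim and the bridge.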

07 Tech

TypeScript: Whole stack, Node ESM

Model Context Protocol: Agent interface (stdio)

pnpm workspaces: 4-package monorepo

better-sqlite3: Local cache

SQLite FTS5: Lexical search

ws: Plugin ↔ bridge transport

Playwright: Headless render

pixelmatch: Pixel diff

image-ssim: Structural diff

axe-core: Text / a11y oracle

Vite: Mounts React/Vue/Svelte candidates

zod: Tool schemas

esbuild: Plugin bundle

vitest: Tests across packages
