Reference · page 2 / 6

2. Architecture


The flow

User types in ChatGPT
        │
        ▼
ChatGPT model decides to call a tool ── (MCP tools/list, tools/call over HTTP)
        │
        ▼
Your MCP server (HTTPS, Streamable HTTP or SSE transport)
        │  returns:
        │   - structuredContent (JSON the model sees)
        │   - content (optional markdown narration)
        │   - _meta.ui.resourceUri (pointer to a UI template you registered)
        │   - _meta (extra payload that ONLY the widget sees, never the model)
        ▼
ChatGPT loads the HTML/JS bundle in a sandboxed iframe
        │  rendered under <yourdomain>.web-sandbox.oaiusercontent.com
        ▼
Widget talks back via JSON-RPC over postMessage ("MCP Apps bridge")
   - reads toolInput / toolOutput / widgetState
   - can call more tools (tools/call), post follow-up messages (ui/message),
     update model context (ui/update-model-context), request fullscreen, etc.
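
The bridge messages in the last step are ordinary JSON-RPC 2.0 envelopes. A minimal sketch of what a widget-initiated tools/call request might look like on the wire, assuming a hypothetical tool named get_order with a hypothetical order_id argument (the postMessage delivery itself happens in the browser and is not shown):

```python
import json
from itertools import count

# Monotonic request IDs so responses can be matched to requests.
_ids = count(1)


def jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# "tools/call" is the method named in the flow above.
# "get_order" and its arguments are hypothetical, for illustration only.
request = jsonrpc_request(
    "tools/call",
    {"name": "get_order", "arguments": {"order_id": "A-1001"}},
)

# The serialized form is what would cross the iframe boundary.
wire = json.dumps(request)
```

The same envelope shape applies to the other bridge methods listed above; only method and params change.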

Three things to internalise

1. MCP is the wire format; Apps SDK is the extension

The protocol is open (modelcontextprotocol.io). Apps SDK layers on:

_meta.ui.* fields (widget resource URI, CSP, domain)
A text/html;profile=mcp-app MIME type
The window.openai bridge (JSON-RPC over postMessage between the iframe and ChatGPT)

Everything else — tool definitions, resource registration, transport — is vanilla MCP.
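
To make the extension points concrete, here is a rough sketch of the data involved, written as plain dictionaries rather than any particular SDK's API. The ui:// URI, the HTML body, and the variable names are assumptions for illustration; only the MIME type and the _meta.ui.resourceUri key come from the list above.

```python
# Hypothetical URI for the widget template; pick your own scheme and path.
WIDGET_URI = "ui://widget/order-card.html"

# MIME type that marks an HTML resource as an MCP app widget.
MCP_APP_MIME = "text/html;profile=mcp-app"

# A vanilla MCP resource entry (uri, mimeType, text) carrying the widget HTML.
widget_resource = {
    "uri": WIDGET_URI,
    "mimeType": MCP_APP_MIME,
    "text": '<div id="root"></div>',  # placeholder markup, not a real bundle
}

# The Apps SDK-specific part: a _meta fragment pointing at the registered template.
tool_meta = {"ui": {"resourceUri": WIDGET_URI}}
```

Where exactly the _meta fragment is attached depends on the server SDK you use, so check its documentation before copying this shape.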

2. Two payloads from every tool call

Each tool response carries three fields that split into two payloads for two different readers: structuredContent and content go to the model, and _meta goes only to the widget.

Field	Who sees it	What to put there
`structuredContent`	The model (next turn, counts against context)	Task-relevant JSON only: IDs, titles, statuses
`content`	The model (as narration)	Short markdown the model can echo or reason over
`_meta`	The widget (via `window.openai`)	Large UI-only data: image URLs, full transcripts, rich render data

Leaking diagnostics or full payloads into structuredContent is one of the most common causes of review rejection and unnecessary token burn.
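
A minimal sketch of that split, assuming a hypothetical order record with illustrative field names. The point is only that bulky, UI-only values land in _meta and never in the model-visible fields:

```python
import json


def build_tool_result(order: dict) -> dict:
    """Split one record into model-visible and widget-only payloads."""
    return {
        # Small, task-relevant JSON for the model.
        "structuredContent": {
            "id": order["id"],
            "title": order["title"],
            "status": order["status"],
        },
        # Short narration the model can echo or reason over.
        "content": [
            {"type": "text", "text": f"Order **{order['title']}** is {order['status']}."}
        ],
        # Large, UI-only data for the widget.
        "_meta": {
            "imageUrls": order["image_urls"],
            "transcript": order["transcript"],
        },
    }


# Hypothetical record, for illustration only.
order = {
    "id": "A-1001",
    "title": "Desk lamp",
    "status": "shipped",
    "image_urls": ["https://cdn.example.com/lamp.png"],
    "transcript": "long support transcript " * 50,
}

result = build_tool_result(order)

# Everything the model would see, serialized for a quick size and leak check.
model_visible = json.dumps(
    {"structuredContent": result["structuredContent"], "content": result["content"]}
)
```

Serializing the model-visible part on its own, as in the last lines, is a cheap way to spot leaks before submitting for review.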

3. Transport

Streamable HTTP is the recommended transport today.
SSE still works.
Whichever you pick must be reachable on HTTPS by ChatGPT.

The MCP Inspector's --transport http / --transport sse flags must match your server's choice.
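
These constraints are easy to check before connecting anything. A small, hypothetical preflight helper (the function name and the example URLs are assumptions; the transport values mirror the Inspector flags above):

```python
from urllib.parse import urlparse

# Transport names as they appear in the Inspector flags above.
SUPPORTED_TRANSPORTS = {"http", "sse"}


def preflight(server_url: str, server_transport: str, inspector_transport: str) -> list[str]:
    """Return a list of configuration problems; an empty list means none were found."""
    problems = []
    if urlparse(server_url).scheme != "https":
        problems.append("server URL must use HTTPS")
    if server_transport not in SUPPORTED_TRANSPORTS:
        problems.append(f"unknown transport: {server_transport}")
    if inspector_transport != server_transport:
        problems.append("Inspector transport does not match the server transport")
    return problems


ok = preflight("https://mcp.example.com/mcp", "http", "http")
bad = preflight("http://localhost:8000/mcp", "sse", "http")
```

Here ok comes back empty, while bad reports both the plain-HTTP URL and the transport mismatch.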

Runtime surfaces

Tool handler (server-side, Node/Python). Privileged work. Holds auth tokens. Validates arguments. Returns both payloads.
Widget (iframe, browser). Renders UI, reads window.openai.toolOutput and window.openai.toolInput, and can call more tools via window.openai.callTool(...).
ChatGPT host. Routes tool calls, passes bearer tokens, enforces CSP, and mediates between the widget and the server.

Never blur these:

Widget ≠ privileged. It runs in the user's browser. Don't put secrets there.
Server ≠ UI code. Don't try to stream DOM to ChatGPT; return resource URIs.
Model-visible data ≠ UI data. Keep them separate or the model's context gets polluted.
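
A sketch of a tool handler that respects all three boundaries, under stated assumptions: the handler name, the UPSTREAM_API_TOKEN variable, and the stubbed fetch_order are hypothetical, and a real server would wire this into its MCP SDK rather than call it directly.

```python
import json
import os


def fetch_order(order_id: str, token: str) -> dict:
    """Stub for a privileged upstream call; a real one would send the token."""
    return {"id": order_id, "title": "Desk lamp", "status": "shipped",
            "image_url": "https://cdn.example.com/lamp.png"}


def handle_get_order(arguments: dict) -> dict:
    # Validate arguments on the server; never trust the caller.
    order_id = arguments.get("order_id")
    if not isinstance(order_id, str) or not order_id.strip():
        raise ValueError("order_id must be a non-empty string")

    # The secret stays server-side and is never copied into the result.
    token = os.environ.get("UPSTREAM_API_TOKEN", "")
    order = fetch_order(order_id, token)

    return {
        # Model-visible data: small and task-relevant.
        "structuredContent": {"id": order["id"], "status": order["status"]},
        "content": [{"type": "text", "text": f"Order {order['id']} is {order['status']}."}],
        # Widget-only data: render details, still no secrets.
        "_meta": {"title": order["title"], "imageUrl": order["image_url"]},
    }
```

The handler returns data only; the widget template it pairs with is registered separately as a resource.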