Add terminal emulation design doc

Jared Miller 2026-01-31 09:40:31 -05:00
parent a14decf2bc
commit 5016cd9960
Signed by: shmup
GPG key ID: 22B5C6D66A38B06C

docs/terminal-emulation.md Normal file

@ -0,0 +1,836 @@
# Terminal Emulation Design
## Problem Statement
### User-Visible Symptoms
When a dashboard client reconnects to an active Claude Code session, or when the server restarts, the terminal output is garbled. The browser shows fragments such as repeated "T" characters or incomplete output. Resizing the terminal temporarily fixes the problem because it triggers Claude Code to redraw the entire screen.
### Technical Root Cause
clarc currently processes ANSI escape sequences in a stateless manner:
1. **No terminal state tracking** - Each chunk of ANSI output is processed independently without maintaining cursor position, screen buffer, or terminal attributes across chunks
2. **CSI final bytes leaking** - The ansiCarryover system detects incomplete sequences like `ESC [ params` but doesn't handle the final command byte (T, H, J, etc.), allowing these to leak through as literal characters
3. **Incorrect CR handling** - Carriage returns (`\r`) truncate to the last newline instead of properly moving the cursor to column 0, losing overwritten content
4. **No reconnect state** - When a client reconnects, they receive nothing or stale data because there's no terminal state to serialize and send
The core issue: we're treating a stateful protocol (terminal emulation) as stateless string processing.
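To make the failure mode concrete, the sketch below contrasts per-chunk stripping with a parser that remembers it is inside a CSI sequence. It is illustrative only: `stripStateless` and `CsiStripper` are hypothetical names rather than clarc's actual `ansi.ts` code, and the stateful version models CSI sequences only.

```typescript
// Illustrative only: hypothetical helpers, not clarc's real implementation.
const CSI_COMPLETE = /\x1b\[[0-9;?]*[@-~]/g; // ESC [ params final-byte
const CSI_PARTIAL_TAIL = /\x1b(\[[0-9;?]*)?$/; // unfinished sequence at chunk end

// Stateless: each chunk is cleaned in isolation, so a sequence split
// across chunks loses its head but keeps its final byte as text.
function stripStateless(chunk: string): string {
  return chunk.replace(CSI_COMPLETE, "").replace(CSI_PARTIAL_TAIL, "");
}

// Stateful: remembers whether the previous chunk ended mid-sequence.
class CsiStripper {
  private state: "text" | "esc" | "csi" = "text";

  push(chunk: string): string {
    let out = "";
    for (const ch of chunk) {
      if (this.state === "text") {
        if (ch === "\x1b") this.state = "esc";
        else out += ch;
      } else if (this.state === "esc") {
        // Only CSI (ESC [) is modelled; any other escape is dropped.
        this.state = ch === "[" ? "csi" : "text";
      } else if (ch >= "@" && ch <= "~") {
        this.state = "text"; // final byte consumed, sequence complete
      }
      // Parameter bytes inside a CSI sequence are swallowed.
    }
    return out;
  }
}

const chunks = ["abc\x1b[2", "Tdef"]; // CSI 2 T split across two chunks
const stateless = chunks.map(stripStateless).join("");
const stripper = new CsiStripper();
const stateful = chunks.map((c) => stripper.push(c)).join("");
console.log(JSON.stringify(stateless)); // "abcTdef"
console.log(JSON.stringify(stateful)); // "abcdef"
```

The stateless path emits the stray `T` because the second chunk no longer looks like part of an escape sequence; the stateful path consumes it.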
### Why This Matters
Terminal applications like Claude Code use cursor positioning, line clearing, and character overwrites extensively. Without proper state tracking:
- Progress indicators get garbled
- Interactive prompts break
- Reconnecting clients see corrupted output
- Screen redraws don't work correctly
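As a rough illustration of why overwrites matter, the sketch below models a single terminal line in which `\r` moves the cursor to column 0 without erasing anything. `renderLine` is a hypothetical helper for this document; a real emulator tracks far more state.

```typescript
// Minimal single-line model: handles printable characters and "\r" only.
function renderLine(input: string): string {
  const cells: string[] = [];
  let col = 0;
  for (const ch of input) {
    if (ch === "\r") {
      col = 0; // carriage return: move the cursor, keep existing cells
    } else {
      cells[col] = ch; // overwrite whatever was in this column
      col += 1;
    }
  }
  return cells.join("");
}

console.log(renderLine("Progress: 10%\rProgress: 95%")); // "Progress: 95%"
console.log(renderLine("Old text\rNew")); // "New text"
```

The second call shows the case the current code gets wrong: the tail of the old line survives because only the first three columns were overwritten.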
## Solution Overview
### High-Level Architecture Change
Replace the current stateless ANSI processing pipeline with a proper terminal emulator using `@xterm/headless` + `@xterm/addon-serialize`:
**Current (broken):**
```
PTY output → splitAnsiCarryover() → ansiToHtml() → Browser
appendOutput(DB)
```
**New (correct):**
```
PTY output → Terminal.write() → Terminal state (buffer, cursor, attrs)
                    ↓                              ↓
            appendOutput(DB)       SerializeAddon.serializeAsHTML()
                                                Browser
```
### Why @xterm/headless
`@xterm/headless` is the official xterm.js headless terminal emulator designed for exactly this use case:
- **Proper VT emulation** - Handles all ANSI/VT sequences correctly (CSI, OSC, cursor movement, scrollback)
- **Maintains state** - Tracks cursor position, screen buffer, attributes, scrollback history
- **Serialization support** - `@xterm/addon-serialize` can export terminal state as HTML or ANSI
- **Battle-tested** - xterm.js powers the terminals in VS Code, Hyper, and other major projects
- **Server-friendly** - No DOM dependencies, runs in Node.js/Bun
This is exactly what crabigator does with the `vt100` crate in Rust.
### Comparison: Current vs New
| Aspect | Current | New |
|--------|---------|-----|
| ANSI processing | Stateless SGR-only parser | Full VT emulator |
| Cursor tracking | None | Full cursor position state |
| Screen buffer | Raw ANSI chunks | Complete screen buffer |
| Reconnect | Sends nothing or stale data | Serializes current screen state |
| CR/LF handling | Buggy truncation | Proper cursor movement |
| CSI sequences | Strips params, leaks final bytes | Full CSI command handling |
| Dependencies | Custom ansi.ts | @xterm/headless + addon-serialize |
## Architecture Changes
### New Data Flow
```
                ┌─────────────┐
                │  CLI (PTY)  │
                └──────┬──────┘
                       │ WebSocket: {type:"output", data}
┌─────────────────────────────────────────────┐
│  Server (server.ts)                         │
│                                             │
│  ┌────────────────────────────────────┐     │
│  │ Terminal Emulator (per session)    │     │
│  │ - Terminal (headless)              │     │
│  │ - SerializeAddon                   │     │
│  │ - Screen buffer + cursor state     │     │
│  └────────┬──────────────────┬────────┘     │
│           │                  │              │
│           ▼                  ▼              │
│   appendOutput(DB)   serializeAsHTML()      │
│      (raw ANSI)      (rendered state)       │
│           │                  │              │
└───────────┼──────────────────┼──────────────┘
            │                  │
            ▼                  ▼
        Database     SSE: {type:"output"}
                       ┌───────────────┐
                       │   Dashboard   │
                       │  (innerHTML)  │
                       └───────────────┘
```
### What Gets Added
1. **Terminal emulator per session** (new `src/terminal.ts`)
- Create `Terminal` instance when session starts
- Create `SerializeAddon` instance
- Store in `sessionTerminals` Map
2. **Dependencies** (package.json)
- `@xterm/headless` - Terminal emulator
- `@xterm/addon-serialize` - HTML/ANSI serialization
### What Gets Changed
1. **server.ts**
- Add `sessionTerminals` Map alongside `sessionWebSockets`
- In WebSocket `message` handler for `type:"output"`:
- Remove `splitAnsiCarryover()` call
- Remove `ansiCarryovers` Map usage
- Call `terminal.write(msg.data)` instead
- Call `serializeAddon.serializeAsHTML()` for SSE broadcast
- In WebSocket `message` handler for `type:"auth"`:
- Create new Terminal + SerializeAddon
- Send initial serialized state to client
- In WebSocket `close` handler:
- Dispose terminal instance
- Remove from `sessionTerminals` Map
2. **ansi.ts**
- Keep `ansiToHtml()` for backward compatibility during migration
- Mark as deprecated
- Eventually remove after Phase 4
3. **db.ts**
- Keep storing raw ANSI in `output_log` for now
- Later: consider storing screen snapshots instead
### What Gets Removed
1. **ansi-carryover.ts** - No longer needed with proper emulator
2. **ansiCarryovers Map** - Terminal emulator handles buffering
3. **splitAnsiCarryover() calls** - Terminal emulator handles partial sequences
### Storage Changes
For now: keep storing raw ANSI chunks in `output_log` table. This preserves backward compatibility and allows us to rebuild terminal state if needed.
Future consideration: switch to storing periodic screen snapshots instead of raw ANSI. This would improve reconnect performance for long-running sessions.
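Rebuilding terminal state from stored chunks could look roughly like the sketch below. It assumes the stored chunks can be read back in order and relies on the `write(data, callback)` form of xterm's `Terminal.write`, whose callback fires once the data has been parsed. `WritableTerminal`, `replayChunks`, and the fake terminal are hypothetical names used only for illustration.

```typescript
// Structural stand-in for the part of xterm's Terminal used here.
interface WritableTerminal {
  write(data: string, callback?: () => void): void;
}

// Feed stored chunks in order, waiting for each write to be parsed
// before sending the next one. Resolves with the number of chunks replayed.
async function replayChunks(
  term: WritableTerminal,
  chunks: Iterable<string>,
): Promise<number> {
  let count = 0;
  for (const chunk of chunks) {
    await new Promise<void>((resolve) => term.write(chunk, resolve));
    count += 1;
  }
  return count;
}

// Usage with a fake terminal that completes writes asynchronously.
const seen: string[] = [];
const fakeTerminal: WritableTerminal = {
  write(data, callback) {
    setTimeout(() => {
      seen.push(data);
      callback?.();
    }, 0);
  },
};

const replayed = replayChunks(fakeTerminal, ["Hello\r\n", "\x1b[31mRed\x1b[0m"]);
replayed.then((n) => console.log(`replayed ${n} chunks`));
```

After a server restart, the same function could be pointed at a fresh headless terminal and the rows read from `output_log`, at the cost of replay time that grows with session length.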
## Implementation Plan
### Phase 1: Add @xterm/headless and Basic Integration
**Goal:** Feed PTY output into terminal emulator, no visible changes yet.
**Files to change:**
- `package.json` - Add dependencies
- `src/terminal.ts` (new) - Terminal manager module
- `src/server.ts` - Create terminals on session start
**Code patterns:**
```typescript
// src/terminal.ts
import { Terminal } from "@xterm/headless";
import { SerializeAddon } from "@xterm/addon-serialize";
export interface TerminalSession {
  terminal: Terminal;
  serialize: SerializeAddon;
}
export function createTerminal(cols: number, rows: number): TerminalSession {
  const terminal = new Terminal({
    cols,
    rows,
    allowProposedApi: true, // Required for some addons
  });
  const serialize = new SerializeAddon();
  terminal.loadAddon(serialize);
  return { terminal, serialize };
}
export function disposeTerminal(session: TerminalSession): void {
  session.serialize.dispose();
  session.terminal.dispose();
}
```
```typescript
// src/server.ts
import { createTerminal, disposeTerminal, type TerminalSession } from "./terminal";
const sessionTerminals = new Map<number, TerminalSession>();
// In WebSocket message handler for "auth":
const { cols, rows } = getTerminalSize(); // Get from initial resize or default
const termSession = createTerminal(cols, rows);
sessionTerminals.set(session.id, termSession);
// In WebSocket message handler for "output":
const termSession = sessionTerminals.get(sessionId);
if (termSession) {
termSession.terminal.write(msg.data);
}
// In WebSocket close handler:
const termSession = sessionTerminals.get(ws.data.sessionId);
if (termSession) {
disposeTerminal(termSession);
sessionTerminals.delete(ws.data.sessionId);
}
```
**Tests:**
- Unit test: `terminal.test.ts` - Create terminal, write data, verify buffer exists
- Integration test: Start session, send output, verify terminal contains data
### Phase 2: Replace ansiToHtml with Terminal Serialization
**Goal:** Use terminal serialization for all output rendering.
**Files to change:**
- `src/server.ts` - Replace `ansiToHtml()` with `serializeAsHTML()`
- `src/terminal.ts` - Add serialization helpers
**Code patterns:**
```typescript
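// src/terminal.ts
export function serializeAsHTML(session: TerminalSession): string {
  // excludeAltBuffer / excludeModes are options of serialize(), not of
  // serializeAsHTML(), so only the HTML-specific options are passed here
  return session.serialize.serializeAsHTML({
    onlySelection: false,
    includeGlobalBackground: false,
  });
}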
```
```typescript
// src/server.ts - WebSocket message handler for "output"
if (msg.type === "output") {
  const sessionId = ws.data.sessionId;
  const termSession = sessionTerminals.get(sessionId);
  if (!termSession) {
    console.error(`No terminal for session ${sessionId}`);
    return;
  }
  // Store raw ANSI in DB
  appendOutput(sessionId, msg.data);
  // Write to terminal emulator. xterm parses writes asynchronously, so
  // serialize inside the write callback to be sure this chunk is included.
  termSession.terminal.write(msg.data, () => {
    // Broadcast serialized HTML to dashboards
    broadcastSSE({
      type: "output",
      session_id: sessionId,
      data: serializeAsHTML(termSession), // Changed from ansiToHtml(body)
    });
  });
  return;
}
```
**Tests:**
- Unit test: Write ANSI sequences, verify HTML output is correct
- Integration test: Send cursor movement sequences, verify final HTML shows correct result
- Regression test: Verify CR handling works (overwrites instead of truncates)
### Phase 3: Add Reconnect State Sync
**Goal:** Send terminal state to clients on reconnect.
**Files to change:**
- `src/server.ts` - Send initial state on SSE connect
- `src/types.ts` - Add new SSE event type
- `public/index.html` or dashboard - Handle initial state event
**Code patterns:**
```typescript
// src/types.ts - Add new SSE event type
export type SSEEvent =
| { type: "initial_state"; session_id: number; html: string }
| ... // existing events
// src/server.ts - SSE endpoint
if (url.pathname === "/events") {
let ctrl: ReadableStreamDefaultController<string>;
const stream = new ReadableStream<string>({
start(controller) {
ctrl = controller;
sseClients.add(controller);
// Send initial headers
controller.enqueue(": connected\n\n");
// Send current state for all active sessions
for (const [sessionId, termSession] of sessionTerminals.entries()) {
const html = serializeAsHTML(termSession);
const event: SSEEvent = {
type: "initial_state",
session_id: sessionId,
html,
};
const eventStr = `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
controller.enqueue(eventStr);
}
},
cancel() {
sseClients.delete(ctrl);
},
});
return new Response(stream, { headers: { ... } });
}
```
**Dashboard handling:**
```typescript
// public/index.html (or frontend.tsx if using React)
eventSource.addEventListener("initial_state", (e) => {
  const data = JSON.parse(e.data);
  const sessionEl = document.querySelector(`[data-session="${data.session_id}"]`);
  if (sessionEl) {
    sessionEl.innerHTML = data.html; // Replace entire content
  }
});
eventSource.addEventListener("output", (e) => {
  const data = JSON.parse(e.data);
  const sessionEl = document.querySelector(`[data-session="${data.session_id}"]`);
  if (sessionEl) {
    // The server sends the full serialized screen in the `data` field,
    // so replace the content; appending would duplicate the screen
    sessionEl.innerHTML = data.data;
  }
});
```
**Tests:**
- Integration test: Start session with output, reconnect dashboard, verify state is sent
- Integration test: Verify "T" artifact bug is fixed (output doesn't corrupt on reconnect)
### Phase 4: Clean Up Old Code
**Goal:** Remove deprecated ANSI processing code.
**Files to change:**
- `src/ansi-carryover.ts` - Delete file
- `src/ansi.ts` - Delete file (or keep minimal version for other uses)
- `src/server.ts` - Remove ansiCarryovers Map and related code
- `src/server.ts` - Remove imports of deleted modules
**Tests:**
- Run full test suite to ensure nothing broke
- Verify bundle size reduction
## API Changes
### New SSE Event Type
```typescript
// types.ts
export type SSEEvent =
| { type: "initial_state"; session_id: number; html: string }
| ... // existing events
```
### No WebSocket Message Changes
The WebSocket protocol between CLI and server remains unchanged - CLI still sends `{type:"output", data}` chunks.
### Dashboard Receives Terminal State
Instead of receiving incremental ANSI-to-HTML chunks, dashboards now receive:
1. **On connect:** Full terminal state as HTML via `initial_state` event
2. **On updates:** The re-serialized terminal state as HTML via `output` event (in its `data` field), which replaces the previous render rather than being appended
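The wire format for these events can be sketched as follows. The `SketchEvent` type covers only the two events discussed here and is an assumption for illustration; the real `SSEEvent` union has more members. `formatSSE` and `parseSSE` are hypothetical helper names.

```typescript
// Assumed shape of the two events discussed above (not the full SSEEvent union).
type SketchEvent =
  | { type: "initial_state"; session_id: number; html: string }
  | { type: "output"; session_id: number; data: string };

// Build one SSE frame. JSON.stringify escapes newlines inside the HTML,
// so the payload always fits on a single "data:" line.
function formatSSE(event: SketchEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Inverse of formatSSE, shown only to check the round trip.
function parseSSE(frame: string): SketchEvent {
  const dataLine = frame.split("\n").find((line) => line.startsWith("data: "));
  if (dataLine === undefined) throw new Error("frame has no data line");
  return JSON.parse(dataLine.slice("data: ".length)) as SketchEvent;
}

const frame = formatSSE({
  type: "initial_state",
  session_id: 7,
  html: "<pre>line one\nline two</pre>",
});
console.log(frame.split("\n").length); // 4: event line, data line, two empty parts
console.log(parseSSE(frame).type); // "initial_state"
```

Because the event name is carried in the `event:` field, the dashboard can register separate listeners for `initial_state` and `output` instead of switching on a field inside the payload.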
## Testing Strategy
### Unit Tests
```typescript
// src/terminal.test.ts
import { test, expect } from "bun:test";
import { createTerminal, serializeAsHTML, type TerminalSession } from "./terminal";

// xterm.js parses writes asynchronously, so wait for the write callback
// before serializing; otherwise the buffer may not contain the data yet.
function write(term: TerminalSession, data: string): Promise<void> {
  return new Promise((resolve) => term.terminal.write(data, resolve));
}

test("creates terminal with correct dimensions", () => {
  const term = createTerminal(80, 24);
  expect(term.terminal.cols).toBe(80);
  expect(term.terminal.rows).toBe(24);
});

test("writes data to terminal buffer", async () => {
  const term = createTerminal(80, 24);
  await write(term, "Hello, world!");
  const html = serializeAsHTML(term);
  expect(html).toContain("Hello, world!");
});

test("handles ANSI cursor movement", async () => {
  const term = createTerminal(80, 24);
  await write(term, "AAA\x1b[3D"); // Write AAA, move cursor back 3
  await write(term, "BBB"); // Overwrite with BBB
  const html = serializeAsHTML(term);
  expect(html).toContain("BBB");
  expect(html).not.toContain("AAA");
});

test("handles carriage return correctly", async () => {
  const term = createTerminal(80, 24);
  await write(term, "Old text\rNew"); // CR should move to col 0
  const html = serializeAsHTML(term);
  expect(html).toContain("New");
  expect(html).not.toContain("Old");
});

test("handles incomplete ANSI sequences across writes", async () => {
  const term = createTerminal(80, 24);
  await write(term, "Hello\x1b["); // Incomplete CSI
  await write(term, "31mRed\x1b[0m"); // Complete it
  const html = serializeAsHTML(term);
  expect(html).toContain("Red");
  expect(html).toMatch(/color.*red/i); // Loose styling check; exact colors depend on the theme
});
```
### Integration Tests
```typescript
// src/integration.test.ts
import { test, expect } from "bun:test";
import { WebSocket } from "ws";
test("reconnect sends initial terminal state", async () => {
// Start server, create session, send output
const ws1 = new WebSocket("ws://localhost:7200/ws");
await new Promise(resolve => ws1.once("open", resolve));
ws1.send(JSON.stringify({ type: "auth", secret: "test" }));
await new Promise(resolve => ws1.once("message", resolve)); // authenticated
ws1.send(JSON.stringify({ type: "output", data: "Test output\n" }));
await Bun.sleep(100);
// Reconnect with SSE, expect initial_state event
const sse = new EventSource("http://localhost:7200/events");
const events: any[] = [];
sse.addEventListener("initial_state", (e) => {
events.push(JSON.parse(e.data));
});
await Bun.sleep(100);
expect(events.length).toBeGreaterThan(0);
expect(events[0].html).toContain("Test output");
});
test("T artifact bug is fixed", async () => {
// Reproduce original bug: send CSI sequence, reconnect, verify no "T" leak
const ws = new WebSocket("ws://localhost:7200/ws");
await new Promise(resolve => ws.once("open", resolve));
ws.send(JSON.stringify({ type: "auth", secret: "test" }));
await new Promise(resolve => ws.once("message", resolve));
// Send a CSI sequence whose final byte is T (SD - scroll down)
ws.send(JSON.stringify({ type: "output", data: "\x1b[2T" }));
await Bun.sleep(100);
// Reconnect and check state
const sse = new EventSource("http://localhost:7200/events");
let html = "";
sse.addEventListener("initial_state", (e) => {
html = JSON.parse(e.data).html;
});
await Bun.sleep(100);
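expect(html).not.toBe(""); // Make sure the initial_state event actually arrived
const text = html.replace(/<[^>]*>/g, ""); // Check visible text only, not markup
expect(text).not.toContain("T"); // Final byte should not leak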
});
```
### How to Test "T" Artifact Fix
The original bug shows "T" characters because CSI final bytes leak through. To verify it's fixed:
1. Start a session
2. Send ANSI sequences with various final bytes (H, J, K, T, etc.)
3. Reconnect a dashboard client
4. Verify the HTML contains proper output, not the final bytes as literal characters
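One way to automate step 4 is to reduce the serialized HTML to its visible text and check that nothing remains besides a known marker string. The sketch below assumes the marker is written after the escape sequences; `visibleText`, `unexpectedText`, and the `cases` table are hypothetical helpers, and the tag stripping is deliberately simplistic.

```typescript
// Reduce serialized HTML to visible text: drop comments and tags,
// then decode a few common entities (decode &amp; last).
function visibleText(html: string): string {
  return html
    .replace(/<!--[\s\S]*?-->/g, "")
    .replace(/<[^>]*>/g, "")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&");
}

// Anything left after removing whitespace and the marker is a leak.
function unexpectedText(html: string, marker: string): string {
  return visibleText(html).replace(/\s+/g, "").split(marker).join("");
}

// CSI sequences with the final bytes named in step 2.
const cases = [
  { name: "CUP", sequence: "\x1b[5;10H" }, // cursor position
  { name: "ED", sequence: "\x1b[2J" }, // erase in display
  { name: "EL", sequence: "\x1b[K" }, // erase in line
  { name: "SD", sequence: "\x1b[2T" }, // scroll down
];
const marker = "ok";
const payload = cases.map((c) => c.sequence).join("") + marker;

// Hand-written HTML standing in for the serializer output.
const cleanHtml = "<pre><div><span>          ok</span></div></pre>";
const leakyHtml = "<pre><div><span>HJKTok</span></div></pre>";
console.log(JSON.stringify(unexpectedText(cleanHtml, marker))); // ""
console.log(JSON.stringify(unexpectedText(leakyHtml, marker))); // "HJKT"
```

In a real test, `payload` would be sent as an `output` message and the HTML from the `initial_state` event passed to `unexpectedText`, which should return an empty string.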
## Migration Path
### Incremental Rollout
The rollout can be incremental, one phase at a time:
1. **Phase 1:** Add terminal emulator alongside existing code (no behavior change)
2. **Phase 2:** Switch to terminal serialization (behavior change, but backward compatible)
3. **Phase 3:** Add reconnect state (new feature, backward compatible)
4. **Phase 4:** Remove old code (cleanup, no API changes)
### Backwards Compatibility
**Database:** No schema changes required. We continue storing raw ANSI in `output_log`.
**WebSocket protocol:** No changes to CLI ↔ Server messages.
**SSE protocol:** Additive only - new `initial_state` event, existing `output` event structure unchanged.
**Dashboard:** Needs update to handle `initial_state` event, but can ignore it initially (degrades gracefully).
### Rollback Plan
If we need to rollback during migration:
**After Phase 1:** Just remove terminal creation, keep using ansiToHtml
**After Phase 2:** Revert serializeAsHTML calls back to ansiToHtml
**After Phase 3:** Remove initial_state event handling
**After Phase 4:** Cannot easily rollback (code deleted), but could revert entire commit
Safety: Keep git tags at each phase boundary for easy rollback.
## Code Examples
### Creating Headless Terminal Per Session
```typescript
// src/terminal.ts
import { Terminal } from "@xterm/headless";
import { SerializeAddon } from "@xterm/addon-serialize";
export interface TerminalSession {
terminal: Terminal;
serialize: SerializeAddon;
}
/**
* Create a new headless terminal emulator instance
* @param cols - Terminal width in columns
* @param rows - Terminal height in rows
* @returns Terminal session with emulator and serialization addon
*/
export function createTerminal(cols: number, rows: number): TerminalSession {
const terminal = new Terminal({
cols,
rows,
scrollback: 1000, // Keep 1000 lines of scrollback
allowProposedApi: true,
});
const serialize = new SerializeAddon();
terminal.loadAddon(serialize);
return { terminal, serialize };
}
/**
* Serialize terminal screen buffer as HTML
*/
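export function serializeAsHTML(session: TerminalSession): string {
  // excludeAltBuffer / excludeModes are options of serialize(), not of
  // serializeAsHTML(), so only the HTML-specific options are passed here
  return session.serialize.serializeAsHTML({
    onlySelection: false,
    includeGlobalBackground: false,
  });
}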
/**
* Clean up terminal resources
*/
export function disposeTerminal(session: TerminalSession): void {
session.serialize.dispose();
session.terminal.dispose();
}
```
### Feeding PTY Data Into Terminal
```typescript
// src/server.ts (WebSocket message handler)
// Map to store terminal emulators per session
const sessionTerminals = new Map<number, TerminalSession>();

// On session creation (auth message):
if (msg.type === "auth") {
  // ... existing auth logic ...
  // Create terminal emulator for this session
  const termSession = createTerminal(80, 24); // Use actual terminal size
  sessionTerminals.set(session.id, termSession);
  // ... rest of auth handler ...
}

// On output message:
if (msg.type === "output") {
  const sessionId = ws.data.sessionId;
  const termSession = sessionTerminals.get(sessionId);
  if (!termSession) {
    console.error(`No terminal for session ${sessionId}`);
    return;
  }
  // Store raw ANSI in database (unchanged)
  appendOutput(sessionId, msg.data);
  // Write PTY output to terminal emulator
  // Terminal handles all ANSI sequences, cursor movement, buffering, etc.
  // Writes are parsed asynchronously, so serialize in the write callback
  termSession.terminal.write(msg.data, () => {
    // Serialize current terminal state as HTML and broadcast to dashboards
    broadcastSSE({
      type: "output",
      session_id: sessionId,
      data: serializeAsHTML(termSession),
    });
  });
  return;
}

// On WebSocket close:
close(ws) {
  if (ws.data.sessionId) {
    // ... existing cleanup ...
    // Dispose terminal emulator
    const termSession = sessionTerminals.get(ws.data.sessionId);
    if (termSession) {
      disposeTerminal(termSession);
      sessionTerminals.delete(ws.data.sessionId);
    }
  }
}
```
### Serializing State for Reconnect
```typescript
// src/server.ts (SSE endpoint)
if (url.pathname === "/events") {
let ctrl: ReadableStreamDefaultController<string>;
const stream = new ReadableStream<string>({
start(controller) {
ctrl = controller;
sseClients.add(controller);
// Send initial connection acknowledgment
controller.enqueue(": connected\n\n");
// Send current terminal state for all active sessions
for (const [sessionId, termSession] of sessionTerminals.entries()) {
const session = getSession(sessionId);
if (!session || session.ended_at) continue;
// Serialize full terminal state as HTML
const html = serializeAsHTML(termSession);
// Send as initial_state event
const event: SSEEvent = {
type: "initial_state",
session_id: sessionId,
html,
};
const eventStr = `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
controller.enqueue(eventStr);
}
},
cancel() {
sseClients.delete(ctrl);
},
});
return new Response(stream, {
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
"Connection": "keep-alive",
},
});
}
```
### Dashboard Receiving and Rendering State
```typescript
// public/index.html or frontend.tsx
// Connect to SSE endpoint
const eventSource = new EventSource("/events");

// Handle initial state (sent on connect)
eventSource.addEventListener("initial_state", (event) => {
  const data = JSON.parse(event.data);
  const sessionId = data.session_id;
  const html = data.html;
  // Find or create terminal display element
  let terminalEl = document.querySelector(`[data-session="${sessionId}"]`);
  if (!terminalEl) {
    terminalEl = document.createElement("pre");
    terminalEl.setAttribute("data-session", String(sessionId));
    terminalEl.className = "terminal-output";
    document.getElementById("terminals")?.appendChild(terminalEl);
  }
  // Replace entire content with current terminal state
  terminalEl.innerHTML = html;
});

// Handle output updates
eventSource.addEventListener("output", (event) => {
  const data = JSON.parse(event.data);
  const sessionId = data.session_id;
  const html = data.data; // The server puts the serialized HTML in `data`
  const terminalEl = document.querySelector(`[data-session="${sessionId}"]`);
  if (!terminalEl) {
    console.warn(`No terminal element for session ${sessionId}`);
    return;
  }
  // For now, replace content (later: optimize to append)
  // Note: Full replace is safest because terminal emulator handles all state
  terminalEl.innerHTML = html;
});

// Styling for terminal output
const style = `
.terminal-output {
  background: #0d1117;
  color: #c9d1d9;
  font-family: 'Courier New', monospace;
  font-size: 14px;
  padding: 10px;
  overflow-x: auto;
  white-space: pre;
}
`;
```
## Open Questions / Future Work
### Should We Keep Raw ANSI in DB or Switch to Screen Snapshots?
**Current:** Store raw ANSI chunks in `output_log` table.
**Pros:**
- Preserves full command history
- Can rebuild terminal state from any point
- Debugging-friendly (can see exact sequences)
**Cons:**
- Large storage for long sessions
- Reconnect requires replaying all chunks
- Not efficient for random access
**Alternative:** Store periodic screen snapshots (every N seconds or N bytes).
**Pros:**
- Fast reconnect (just load latest snapshot)
- Constant-size storage per session
- Efficient random access to session state
**Cons:**
- Loses command history between snapshots
- More complex migration path
- Need snapshot management (cleanup old ones)
**Recommendation:** Keep raw ANSI for now (Phase 1-4), evaluate snapshots later based on performance data.
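If snapshots are evaluated later, the "every N seconds or N bytes" trigger could be as small as the sketch below. `SnapshotPolicy` and its thresholds are hypothetical, and it counts UTF-16 code units as a rough proxy for bytes.

```typescript
// Hypothetical policy: ask for a snapshot once enough output has
// accumulated or enough time has passed since the last one.
class SnapshotPolicy {
  private charsSinceSnapshot = 0;
  private lastSnapshotMs: number;
  private readonly maxChars: number;
  private readonly maxAgeMs: number;

  constructor(maxChars: number, maxAgeMs: number, startMs: number) {
    this.maxChars = maxChars;
    this.maxAgeMs = maxAgeMs;
    this.lastSnapshotMs = startMs;
  }

  // Record a chunk; returns true when a snapshot should be taken now.
  record(chunk: string, nowMs: number): boolean {
    this.charsSinceSnapshot += chunk.length; // code units, not exact bytes
    const due =
      this.charsSinceSnapshot >= this.maxChars ||
      nowMs - this.lastSnapshotMs >= this.maxAgeMs;
    if (due) {
      this.charsSinceSnapshot = 0;
      this.lastSnapshotMs = nowMs;
    }
    return due;
  }
}

const policy = new SnapshotPolicy(10, 5000, 0);
const decisions = [
  policy.record("12345", 100), // false: 5 units, 100 ms elapsed
  policy.record("67890", 200), // true: 10 units reached
  policy.record("a", 300), // false: counters were reset at 200 ms
  policy.record("b", 5200), // true: 5000 ms since the last snapshot
];
console.log(decisions.join(",")); // false,true,false,true
```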
### Scrollback Handling - How Much History?
Terminal emulator supports configurable scrollback. Questions:
1. **How many lines?** Currently set to 1000 in example. Is this enough for Claude Code sessions?
2. **Dashboard scrolling:** Do dashboards need to display scrollback or just current screen?
3. **Memory concerns:** Each terminal instance keeps scrollback in memory. For many concurrent sessions, this could add up.
**Recommendation:** Start with 1000 lines, monitor memory usage, make configurable via environment variable.
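Reading the limit from the environment might look like the sketch below. The variable name `CLARC_SCROLLBACK`, the helper `parseScrollback`, and the upper cap are assumptions for illustration, not existing configuration.

```typescript
const DEFAULT_SCROLLBACK = 1000;
const MAX_SCROLLBACK = 100_000; // hypothetical safety cap

// Parse a raw environment value, falling back to the default when the
// value is missing, not an integer, or negative.
function parseScrollback(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_SCROLLBACK;
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 0) return DEFAULT_SCROLLBACK;
  return Math.min(n, MAX_SCROLLBACK);
}

// In the server this would be called as:
//   parseScrollback(process.env.CLARC_SCROLLBACK)
console.log(parseScrollback(undefined)); // 1000
console.log(parseScrollback("5000")); // 5000
console.log(parseScrollback("lots")); // 1000
```

The result would then be passed as the `scrollback` option when constructing the `Terminal`.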
### Performance Considerations
**Serialization cost:** Calling `serializeAsHTML()` on every PTY output chunk could be expensive. Considerations:
1. **Throttling:** Only serialize every N milliseconds or N bytes
2. **Incremental updates:** Send diffs instead of full HTML (requires custom serialization)
3. **Caching:** Cache last serialization, only re-serialize if terminal changed
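Throttling and the "only if changed" check can be combined in a small wrapper. This is a sketch under the assumption that the caller marks the terminal dirty after each write and polls on a timer; `ThrottledSerializer` is a hypothetical name, and the clock is passed in explicitly to keep the example deterministic.

```typescript
// Serializes at most once per interval, and only if something changed.
class ThrottledSerializer {
  private dirty = false;
  private lastEmitMs = Number.NEGATIVE_INFINITY;
  private readonly serialize: () => string;
  private readonly minIntervalMs: number;

  constructor(serialize: () => string, minIntervalMs: number) {
    this.serialize = serialize;
    this.minIntervalMs = minIntervalMs;
  }

  // Call after every terminal write.
  markDirty(): void {
    this.dirty = true;
  }

  // Returns fresh HTML when an emit is due, otherwise null.
  poll(nowMs: number): string | null {
    if (!this.dirty) return null; // nothing changed since last emit
    if (nowMs - this.lastEmitMs < this.minIntervalMs) return null; // too soon
    this.dirty = false;
    this.lastEmitMs = nowMs;
    return this.serialize();
  }
}

let calls = 0;
const throttled = new ThrottledSerializer(() => `frame ${++calls}`, 50);
throttled.markDirty();
const first = throttled.poll(0); // "frame 1"
throttled.markDirty();
const tooSoon = throttled.poll(10); // null, change is kept for later
const second = throttled.poll(60); // "frame 2"
const idle = throttled.poll(200); // null, nothing changed
console.log(first, tooSoon, second, idle, calls);
```

A change that arrives inside the interval is not lost: the dirty flag stays set, so the next poll after the interval emits it.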
**Memory usage:** One Terminal instance per session. For 100 concurrent sessions:
- ~100 terminal buffers in memory
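- Each buffer: 80 cols × (24 rows + 1000 scrollback lines) ≈ 82K cells; assuming roughly 12 bytes per cell (xterm.js keeps cells in typed arrays), that is about 1MB per session once scrollback fills
- Total: ~100MB for 100 sessions (acceptable)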
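The per-session figure can be sanity-checked by counting cells as cols × (rows + scrollback), since scrollback adds lines rather than multiplying the viewport. The 12 bytes per cell used below is an assumption about xterm.js's internal buffer layout, and `estimateBufferBytes` is a hypothetical helper; real usage also includes per-line and per-terminal overhead.

```typescript
const BYTES_PER_CELL = 12; // assumption: three 32-bit slots per cell

// Upper-bound estimate for a buffer whose scrollback is completely full.
function estimateBufferBytes(
  cols: number,
  rows: number,
  scrollback: number,
  bytesPerCell: number = BYTES_PER_CELL,
): number {
  return cols * (rows + scrollback) * bytesPerCell;
}

const perSession = estimateBufferBytes(80, 24, 1000);
const hundredSessions = perSession * 100;
console.log(perSession); // 983040
console.log((perSession / 2 ** 20).toFixed(2)); // "0.94" (MiB)
console.log(hundredSessions); // 98304000
```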
**Recommendation:** Start simple (serialize on every output), optimize later if needed. Add metrics to track serialization time.
### Terminal Size Tracking
Current code doesn't track initial terminal size properly. Need to:
1. Get terminal size from CLI on auth (add to auth message?)
2. Track resize events properly (call `terminal.resize(cols, rows)` on the session's terminal)
3. Handle missing/invalid sizes gracefully
**Recommendation:** Add `cols` and `rows` to auth message, default to 80×24 if missing.
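Validating the size from the auth message might look like the sketch below. The `cols`/`rows` fields follow the recommendation above; `resolveTerminalSize` and the upper bound of 1000 are hypothetical choices for illustration.

```typescript
interface TerminalSize {
  cols: number;
  rows: number;
}

const DEFAULT_SIZE: TerminalSize = { cols: 80, rows: 24 };
const MAX_DIMENSION = 1000; // hypothetical sanity limit

// A dimension is usable if it is a positive integer within the limit.
function validDimension(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value >= 1 &&
    value <= MAX_DIMENSION
  );
}

// Use the client's size only when both dimensions are valid;
// otherwise fall back to 80×24 as a pair.
function resolveTerminalSize(msg: { cols?: unknown; rows?: unknown }): TerminalSize {
  if (validDimension(msg.cols) && validDimension(msg.rows)) {
    return { cols: msg.cols, rows: msg.rows };
  }
  return { ...DEFAULT_SIZE };
}

console.log(resolveTerminalSize({ cols: 120, rows: 40 })); // { cols: 120, rows: 40 }
console.log(resolveTerminalSize({})); // { cols: 80, rows: 24 }
console.log(resolveTerminalSize({ cols: 0, rows: 40 })); // { cols: 80, rows: 24 }
```

The same check could guard resize messages before calling `terminal.resize()`.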
### Alternative: Use @xterm/addon-fit for Auto-sizing?
The `@xterm/addon-fit` addon can auto-calculate terminal size based on container dimensions. But this requires a DOM, which we don't have server-side.
**Recommendation:** Not applicable to a headless terminal; keep manual size tracking.
### Can We Use This for Replay/Playback?
Having full terminal state opens up interesting possibilities:
1. **Session replay:** Store terminal state snapshots, replay session history
2. **Time travel debugging:** Jump to any point in session timeline
3. **Export to video:** Render terminal state as frames, create video
**Recommendation:** Out of scope for initial implementation, but good future feature.