Research Synthesis - Editor-Agnostic CLI Collaboration

THE CORE PROBLEM:

Zed and VSCode have beautiful real-time collaboration. But they lock you into their editors. If you're a vim/helix/kakoune user and want to pair program with a friend, you shouldn't have to make them switch editors.

The goal: divorce collaborative editing from any specific editor.

EXISTING APPROACHES ANALYZED:

1. Terminal Multiplexing (upterm, tmate, tmux sharing)

How it works: Share a PTY over the network. Everyone sees the same terminal output, keystrokes forwarded to the shell.

Upterm specifically: Reverse SSH tunnel to a central server, clients connect through it. MultiWriter pattern broadcasts output to all connected clients.

Pros: Works TODAY with any CLI editor. Zero editor integration needed. Good for "let me show you something" pair programming.

Cons: No concurrent editing (everyone's typing goes to same shell). No offline. No semantic awareness. Last keystroke wins. Not true collaborative editing.

Verdict: Great for terminal screenshare, not for document collaboration.

2. File-Level Sync (VSCode LiveShare style)

How it works: Host owns the workspace. Guests get proxied file access. SSH protocol with relay fallback. Not actually CRDT-based - more like remote desktop for code.

Sessions expire after 24 hours. P2P when possible, Microsoft relay otherwise.

Verdict: Doesn't solve editor-agnostic problem. Guests are still locked to host's environment.

3. CRDT-Based Document Sync (Zed, instant.nvim)

How it works: Each character gets a unique ID. Operations are "insert after ID xyz" not "insert at position 5". Concurrent edits automatically merge correctly.

Zed's architecture: Anchors (logical positions), tombstone deletions, Lamport timestamps, version vectors, per-user undo maps. Server for auth/discovery, CRDT for document state.

instant.nvim: Pure Lua implementation for Neovim. WebSocket server routes messages. Position IDs (tombstone vector clocks) for conflict-free ordering.
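To make the "insert after ID xyz" idea concrete, here is a minimal sketch of a sequence CRDT in the RGA style. It is a teaching toy, not how Zed, instant.nvim, or cola are implemented. The names (Replica, local_insert, apply, ROOT) are hypothetical, and it assumes operations arrive in causal order (an insert is never delivered before the character it refers to).

```python
from dataclasses import dataclass, field

ROOT = (0, "")  # sentinel ID meaning "start of document"


@dataclass
class Replica:
    name: str
    clock: int = 0  # Lamport clock: bumped on local inserts, merged on remote ones
    # Each element is [id, char, deleted]. Deleted chars stay as tombstones
    # so that later "insert after ID" operations can still find them.
    elems: list = field(default_factory=list)

    def text(self):
        return "".join(ch for _, ch, dead in self.elems if not dead)

    def _index_of(self, ident):
        if ident == ROOT:
            return -1
        return next(i for i, e in enumerate(self.elems) if e[0] == ident)

    def _id_left_of(self, pos):
        """ID of the visible character just before visible offset pos."""
        if pos == 0:
            return ROOT
        visible = [e[0] for e in self.elems if not e[2]]
        return visible[pos - 1]

    def local_insert(self, pos, ch):
        """Insert ch at visible offset pos; return the op to broadcast."""
        parent = self._id_left_of(pos)
        self.clock += 1
        op = ("ins", (self.clock, self.name), parent, ch)
        self.apply(op)
        return op

    def local_delete(self, pos):
        """Tombstone the character at visible offset pos; return the op."""
        op = ("del", self._id_left_of(pos + 1))
        self.apply(op)
        return op

    def apply(self, op):
        """Integrate a local or remote op (assumes causal delivery)."""
        if op[0] == "ins":
            _, ident, parent, ch = op
            self.clock = max(self.clock, ident[0])
            i = self._index_of(parent) + 1
            # Concurrent inserts after the same parent: skip elements with a
            # larger ID so every replica picks the same order.
            while i < len(self.elems) and self.elems[i][0] > ident:
                i += 1
            self.elems.insert(i, [ident, ch, False])
        else:
            self.elems[self._index_of(op[1])][2] = True


# Two replicas start from the same text "ac".
a, b = Replica("a"), Replica("b")
for op in [a.local_insert(0, "a"), a.local_insert(1, "c")]:
    b.apply(op)

# Both insert at offset 1 at the same time, then exchange operations.
op_a = a.local_insert(1, "b")
op_b = b.local_insert(1, "X")
a.apply(op_b)
b.apply(op_a)
assert a.text() == b.text()  # same result regardless of arrival order
```

Because each insert names its left neighbour by ID rather than by offset, neither replica needs to transform the other's position. The tie between the two concurrent inserts is broken by comparing IDs, which is the same on every machine.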
Key insight from instant.nvim: 70% of the code is editor-agnostic (transport + CRDT algorithm). Only 30% is Neovim-specific (buffer events, manipulation, cursor display).

THE PROPOSED ARCHITECTURE:

CRDT Daemon + Thin Editor Adapters

The daemon handles all the hard parts:
- CRDT text buffer (using cola or diamond-types)
- Network sync (WebSocket for remote, Unix socket for local)
- Session management
- Peer discovery/auth

Each editor gets a minimal adapter that:
1. Hooks into buffer change events
2. Serializes changes as (offset, length, text)
3. Sends to daemon
4. Receives remote operations from daemon
5. Applies changes to local buffer
6. Optionally: displays peer cursors

Why this split works:
- Solving CRDT correctly is hard. Do it once in the daemon.
- Each editor's adapter is simple. Just event hooks and buffer manipulation.
- Adding new editors is cheap. Write a small plugin, done.
- Multiple different editors can collaborate simultaneously.

THE EDITOR ADAPTER REQUIREMENTS:

For any CLI editor to participate, the adapter needs:

1. Change event hook - Know when user edits the buffer
   - Neovim: nvim_buf_attach with on_lines callback
   - Helix: LSP-based or custom events
   - Kakoune: FIFO-based extension system
   - Vim: +clientserver or plugin

2. Buffer manipulation - Apply remote changes
   - Neovim: nvim_buf_set_lines
   - Others: Similar APIs exist

3. Cursor visualization (optional but nice) - Show where peers are editing
   - Neovim: nvim_buf_set_extmark with virtual text
   - Others: Editor-specific

THE LSP ANGLE:

Many CLI editors already speak LSP (Language Server Protocol). This is interesting because:
- textDocument/didChange already notifies of edits
- textDocument/didOpen and didClose handle lifecycle
- workspace/executeCommand can carry custom operations

A "collaboration language server" could:
1. Receive didChange notifications
2. Run them through CRDT
3. Push remote changes back via workspace edits

This would reduce per-editor work to almost zero - editors already have LSP clients. Worth exploring.

CRDT LIBRARY CHOICE:

Cola (https://github.com/nomad/cola):
- Operation-based CRDT for text
- Buffer-agnostic: doesn't store text, just manages coordinates
- Clean API: Replica, Insertion, Deletion
- Real-time P2P focus
- Serialization via serde or custom encode
- Handles out-of-order delivery via backlog
- Benchmarks show 1.4-2x faster than diamond-types in some cases

Diamond-types (https://github.com/josephg/diamond-types):
- "World's fastest CRDT" - 5000x-80000x speedup through aggressive RLE
- Stores full history (temporal DAG + spatial state)
- More complex (OpLog, Branch, CausalGraph concepts)
- Great for: large documents, offline-first, audit trails
- WASM support for browser

For our use case: Cola wins.
- Simpler API, easier to integrate
- Real-time focus matches our needs
- We don't need full history storage
- Less cognitive overhead to work with

Diamond-types is overkill for initial prototyping. Could revisit for optimization later.

COMMUNICATION PROTOCOL OPTIONS:

1. Unix socket - Simple, local only. Good for same-machine testing.
2. WebSocket - Works remote. Browser-friendly if we ever want web UI. Good default.
3. stdio pipe - Simplest for CLI tools. Editor spawns daemon, communicates via stdin/stdout.
4. LSP protocol - Leverage existing infrastructure. Interesting but might be awkward fit.

Recommendation: WebSocket as primary (works local and remote), Unix socket as fast local alternative.
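Whichever transport is chosen, the adapter-to-daemon message can stay small. The sketch below shows one possible JSON framing for the (offset, length, text) change triple that adapters send, plus how a receiver would apply it to a plain string buffer. The field names, the "type": "change" tag, and the doc identifier are assumptions made for illustration, not a defined protocol. Offsets are counted in Python string indices here, while a real adapter would need to agree on bytes vs. code points with the daemon.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Change:
    doc: str     # hypothetical document identifier, e.g. a project-relative path
    offset: int  # where the edit starts
    length: int  # how many characters are replaced (0 means pure insert)
    text: str    # replacement text ("" means pure delete)


def encode(change: Change) -> str:
    """Serialize a change as one JSON text frame."""
    return json.dumps({"type": "change", **asdict(change)})


def decode(frame: str) -> Change:
    """Parse a frame back into a Change, rejecting other message types."""
    msg = json.loads(frame)
    if msg.get("type") != "change":
        raise ValueError(f"unexpected message type: {msg.get('type')!r}")
    return Change(msg["doc"], msg["offset"], msg["length"], msg["text"])


def apply_change(buffer: str, change: Change) -> str:
    """Replace buffer[offset : offset + length] with change.text."""
    end = change.offset + change.length
    if not 0 <= change.offset <= end <= len(buffer):
        raise ValueError("change is out of range for this buffer")
    return buffer[:change.offset] + change.text + buffer[end:]


# Round trip: the adapter encodes, the receiving side decodes and applies.
frame = encode(Change(doc="src/main.rs", offset=6, length=5, text="there"))
result = apply_change("hello world", decode(frame))
```

In the proposed split, a frame like this only describes what the local editor did. The daemon would still translate it into CRDT operations before relaying it to peers, so raw offsets never cross the network between replicas.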
REFERENCE IMPLEMENTATIONS:

repos/cola/
- src/replica.rs: Main API, 1200+ lines of docs
- src/insertion.rs, deletion.rs: Operation types
- examples/basic.rs: Simple Document wrapper pattern
- Key pattern: editor maintains buffer + Replica, calls inserted/deleted for local ops, integrate_* for remote ops

repos/instant.nvim/
- lua/instant.lua: Main logic, mixed nvim + algorithm
- lua/instant/websocket_*.lua: Transport layer (portable)
- Position ID generation (genPID): Tombstone vector clocks
- Shows exactly what adapters need to do

repos/upterm/
- host/host.go: Session lifecycle
- io/writer.go: MultiWriter for output broadcast
- Different paradigm but useful for understanding terminal collaboration UX

repos/diamond-types/
- Complex internals, good for understanding CRDT optimization
- INTERNALS.md, BINARY.md explain the RLE approach

NEXT STEPS TO PROTOTYPE:

Phase 1: Minimal daemon
- Rust binary using cola
- Single document support
- WebSocket server
- Two clients can connect, edits sync

Phase 2: Neovim adapter
- Lua plugin
- Connects to daemon via WebSocket
- Hooks nvim_buf_attach for changes
- Applies remote changes via nvim_buf_set_lines
- Test: two Neovim instances editing same file

Phase 3: Multi-document
- Session management
- File path mapping
- Join/leave notifications

Phase 4: Second editor
- Helix adapter (or Kakoune, or Vim)
- Prove the architecture works across editors

Phase 5: Polish
- Peer cursors
- User presence indicators
- Better auth (SSH keys, GitHub)
- Discovery service

OPEN QUESTIONS:

1. Where does the daemon run?
   - Local daemon per machine? Central server? Hybrid?
   - For local-first: daemon on each machine, P2P sync
   - For easy setup: central server handles routing

2. How to handle file paths?
   - Relative to project root? Absolute? UUID-based?
   - Need consistent naming across different machines

3. Undo/redo coordination?
   - Per-user undo (like Zed) or global?
   - Cola doesn't handle this - need to build on top

4. Cursor/selection sync?
   - Nice to have, not essential for MVP
   - Adds complexity (need to track peer positions)

5. Permissions?
   - Can anyone edit anything? Read-only viewers?
   - Future concern, not MVP

THE DREAM:

You're in Helix. Friend is in Neovim. Another friend is in Kakoune. You all open the same project, connect to a session, and just... edit together. Changes flow seamlessly. Each person uses their preferred editor with their preferred config. No one had to install anything they don't normally use.

That's the goal.