Research Synthesis - Editor-Agnostic CLI Collaboration

THE CORE PROBLEM:

Zed and VS Code both offer polished real-time collaboration, but only inside their own editors. If you're a Vim/Helix/Kakoune user and want to pair program with a friend, neither of you should have to switch editors. The goal: decouple collaborative editing from any specific editor.

EXISTING APPROACHES ANALYZED:

1. Terminal Multiplexing (upterm, tmate, tmux sharing)

How it works: Share a PTY over the network. Everyone sees the same terminal output, and everyone's keystrokes are forwarded to the same shell.

Upterm specifically: The host opens a reverse SSH tunnel to a central server, and clients connect through it. A MultiWriter pattern broadcasts output to all connected clients.
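
Sketch of the MultiWriter idea, in illustrative Python rather than upterm's actual Go code. The class and method names here are made up for the example; the only point is that one write from the PTY fans out to every attached client.

```python
import io


class MultiWriter:
    """Fan one stream of terminal output out to every attached client."""

    def __init__(self):
        self._clients = []

    def attach(self, client):
        self._clients.append(client)

    def write(self, data: bytes) -> int:
        # Iterate over a copy so a failing client can be dropped mid-loop.
        for client in list(self._clients):
            try:
                client.write(data)
            except OSError:
                self._clients.remove(client)
        return len(data)


# BytesIO objects stand in for the host's terminal and a guest's connection.
host_view, guest_view = io.BytesIO(), io.BytesIO()
pty_output = MultiWriter()
pty_output.attach(host_view)
pty_output.attach(guest_view)
pty_output.write(b"$ ls\n")
```

Note the direction: output is broadcast, but input from all clients still funnels into one shell, which is why this model cannot support concurrent editing.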

Pros: Works TODAY with any CLI editor. Zero editor integration needed. Good for "let me show you something" pair programming.

Cons: No concurrent editing (everyone's typing goes to the same shell). No offline support. No semantic awareness of the document. Last keystroke wins. Not true collaborative editing.

Verdict: Great for terminal screenshare, not for document collaboration.

2. File-Level Sync (VS Code Live Share style)

How it works: Host owns the workspace. Guests get proxied file access. SSH protocol with relay fallback.

Not actually CRDT-based - more like remote desktop for code.

Sessions expire after 24 hours. P2P when possible, Microsoft relay otherwise.

Verdict: Doesn't solve the editor-agnostic problem. Guests are still locked to the host's environment.

3. CRDT-Based Document Sync (Zed, instant.nvim)

How it works: Each character gets a unique ID. Operations are "insert after ID xyz", not "insert at position 5". Concurrent edits merge automatically, and every replica converges to the same text.
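
A minimal sketch of the "insert after ID" idea, written as a simplified RGA-style sequence in Python. This is not Zed's or instant.nvim's algorithm; the names (Replica, Char, ROOT) are hypothetical, and it assumes operations arrive after the character they anchor to (causal delivery).

```python
from dataclasses import dataclass

ROOT = (0, "")  # sentinel ID meaning "insert at the very start"


@dataclass
class Char:
    id: tuple              # (lamport_counter, site_name), globally unique
    after: tuple           # ID of the character this one was inserted after
    value: str
    deleted: bool = False  # tombstone: deleted chars stay in place as anchors


class Replica:
    def __init__(self, site):
        self.site = site
        self.clock = 0
        self.chars = []  # document order, tombstones included

    def _index_of(self, char_id):
        return next(i for i, c in enumerate(self.chars) if c.id == char_id)

    def integrate(self, op):
        """Apply an insert (local or remote). Assumes causal delivery."""
        self.clock = max(self.clock, op.id[0])
        pos = 0 if op.after == ROOT else self._index_of(op.after) + 1
        # Concurrent inserts after the same anchor: the larger ID goes first.
        while pos < len(self.chars) and self.chars[pos].id > op.id:
            pos += 1
        self.chars.insert(pos, Char(op.id, op.after, op.value))

    def insert_after(self, after_id, value):
        self.clock += 1
        op = Char((self.clock, self.site), after_id, value)
        self.integrate(op)
        return op  # this is what would be broadcast to peers

    def delete(self, char_id):
        self.chars[self._index_of(char_id)].deleted = True

    def text(self):
        return "".join(c.value for c in self.chars if not c.deleted)


alice, bob = Replica("alice"), Replica("bob")
h = alice.insert_after(ROOT, "H")
bob.integrate(h)
# Concurrent edits: both insert right after "H" before hearing from each other.
a = alice.insert_after(h.id, "a")
b = bob.insert_after(h.id, "b")
alice.integrate(b)
bob.integrate(a)
```

Both replicas end up with the same text even though they applied the two inserts in opposite orders, because the tie is broken by comparing IDs rather than by arrival order.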

Zed's architecture: Anchors (logical positions), tombstone deletions, Lamport timestamps, version vectors, per-user undo maps. Server for auth/discovery, CRDT for document state.
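
Two of those building blocks are small enough to sketch. The Python below shows the general textbook form of a Lamport clock and a version vector; it is not taken from Zed's source, and the class and method names are assumptions made for illustration.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """Stamp a local operation."""
        self.time += 1
        return self.time

    def observe(self, remote_time):
        """Advance past a timestamp seen on a remote operation."""
        self.time = max(self.time, remote_time)


class VersionVector:
    """Highest operation sequence number seen from each replica."""

    def __init__(self):
        self.seen = {}

    def record(self, replica_id, seq):
        self.seen[replica_id] = max(self.seen.get(replica_id, 0), seq)

    def includes(self, replica_id, seq):
        return self.seen.get(replica_id, 0) >= seq

    def missing_from(self, other):
        """Replicas for which `other` has operations we have not seen yet."""
        return {r: s for r, s in other.seen.items() if s > self.seen.get(r, 0)}


clock = LamportClock()
first_stamp = clock.tick()
clock.observe(7)            # a remote op arrived carrying timestamp 7
next_stamp = clock.tick()   # local ops now sort after it

mine, theirs = VersionVector(), VersionVector()
mine.record("alice", 3)
theirs.record("alice", 5)
theirs.record("bob", 2)
to_fetch = mine.missing_from(theirs)
```

The clock gives every operation an ordering consistent with causality; the version vector is a compact way to say "which operations have I already seen", which is what a reconnecting peer needs in order to ask for the rest.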

instant.nvim: Pure Lua implementation for Neovim. WebSocket server routes messages. Position IDs (tombstone vector clocks) for conflict-free ordering.

Key insight from instant.nvim: Roughly 70% of the code is editor-agnostic (transport + CRDT algorithm). Only about 30% is Neovim-specific (buffer events, manipulation, cursor display).

THE PROPOSED ARCHITECTURE:

CRDT Daemon + Thin Editor Adapters

The daemon handles all the hard parts:
- CRDT text buffer (using cola or diamond-types)
- Network sync (WebSocket for remote, Unix socket for local)
- Session management
- Peer discovery/auth

Each editor gets a minimal adapter that:
1. Hooks into buffer change events
2. Serializes changes as (offset, length, text)
3. Sends to daemon
4. Receives remote operations from daemon
5. Applies changes to local buffer
6. Optionally: displays peer cursors
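
A rough sketch of steps 2 and 5, assuming the (offset, length, text) triple means "replace `length` characters starting at `offset` with `text`" and that messages are JSON. Both the wire format and the function names are assumptions, not a settled protocol.

```python
import json


def encode_change(offset, length, text):
    """Adapter -> daemon: replace `length` chars at `offset` with `text`."""
    return json.dumps({"offset": offset, "length": length, "text": text})


def apply_change(buffer, message):
    """Daemon -> adapter: apply a decoded change to a local buffer string."""
    change = json.loads(message)
    start = change["offset"]
    end = start + change["length"]
    return buffer[:start] + change["text"] + buffer[end:]


buffer = "hello world"
msg = encode_change(offset=6, length=5, text="there")
updated = apply_change(buffer, msg)

# Pure insert: length 0. Pure delete: empty text.
inserted = apply_change("ac", encode_change(1, 0, "b"))
deleted = apply_change("abc", encode_change(1, 1, ""))
```

One thing the real protocol would have to pin down is the unit of `offset` (bytes, code points, or UTF-16 units), since editors disagree on this. The sketch uses Python string indices.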

Why this split works:
- Solving CRDT correctly is hard. Do it once in the daemon.
- Each editor's adapter is simple. Just event hooks and buffer manipulation.
- Adding new editors is cheap. Write a small plugin, done.
- Multiple different editors can collaborate simultaneously.

THE EDITOR ADAPTER REQUIREMENTS:

For any CLI editor to participate, the adapter needs:

1. Change event hook - Know when the user edits the buffer
- Neovim: nvim_buf_attach with on_lines callback
- Helix: LSP-based or custom events
- Kakoune: FIFO-based extension system
- Vim: +clientserver or plugin

2. Buffer manipulation - Apply remote changes
- Neovim: nvim_buf_set_lines (or nvim_buf_set_text for sub-line edits)
- Others: Similar APIs exist

3. Cursor visualization (optional but nice) - Show where peers are editing
- Neovim: nvim_buf_set_extmark with virtual text
- Others: Editor-specific

THE LSP ANGLE:

Many CLI editors already speak LSP (Language Server Protocol). This is interesting because:

- textDocument/didChange already notifies of edits
- textDocument/didOpen and didClose handle lifecycle
- workspace/executeCommand can carry custom operations

A "collaboration language server" could:
1. Receive didChange notifications
2. Run them through CRDT
3. Push remote changes back via workspace edits (workspace/applyEdit)

This would reduce per-editor work to almost zero - editors already have LSP clients. Worth exploring.
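
To make step 1 concrete, here is a sketch of turning one incremental didChange entry (a range plus replacement text) into the (offset, length, text) form the daemon would consume. The helper names are hypothetical. It assumes "\n" line endings and text where one LSP character equals one Python character, which holds for the default UTF-16 position encoding only while the text stays inside the Basic Multilingual Plane.

```python
def position_to_offset(text, line, character):
    """Convert an LSP (line, character) position to a flat offset."""
    lines = text.split("\n")
    # Each preceding line contributes its length plus one for the newline.
    return sum(len(l) + 1 for l in lines[:line]) + character


def lsp_change_to_edit(text, change):
    """Turn one incremental contentChanges entry into (offset, length, text)."""
    start = change["range"]["start"]
    end = change["range"]["end"]
    start_off = position_to_offset(text, start["line"], start["character"])
    end_off = position_to_offset(text, end["line"], end["character"])
    return (start_off, end_off - start_off, change["text"])


doc = "fn main() {\n    todo!()\n}\n"
change = {
    "range": {"start": {"line": 1, "character": 4},
              "end": {"line": 1, "character": 11}},
    "text": "run()",
}
offset, length, new_text = lsp_change_to_edit(doc, change)
new_doc = doc[:offset] + new_text + doc[offset + length:]
```

The conversion needs the document text as it was before the change, so a collaboration server built this way would have to keep its own copy of each open buffer.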

CRDT LIBRARY CHOICE:

Cola (https://github.com/nomad/cola):
- Operation-based CRDT for text
- Buffer-agnostic: doesn't store text, just manages coordinates
- Clean API: Replica, Insertion, Deletion
- Real-time P2P focus
- Serialization via serde or custom encode
- Handles out-of-order delivery via backlog
- Its benchmarks report 1.4-2x faster than diamond-types in some cases
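
The backlog idea in general form: hold operations that arrive before their predecessors, then release them in order. This Python sketch is not cola's implementation, and it only tracks per-sender sequence numbers (the `Backlog` class and its fields are hypothetical); a real text CRDT also has to wait for the anchors an operation refers to.

```python
class Backlog:
    """Hold operations that arrive before their predecessors."""

    def __init__(self):
        self.next_seq = {}   # replica_id -> next sequence number expected
        self.pending = {}    # (replica_id, seq) -> operation payload

    def receive(self, replica_id, seq, op):
        """Return the operations that are now safe to apply, in order."""
        expected = self.next_seq.get(replica_id, 1)
        if seq < expected:
            return []  # duplicate of something already applied
        self.pending[(replica_id, seq)] = op
        ready = []
        while (replica_id, expected) in self.pending:
            ready.append(self.pending.pop((replica_id, expected)))
            expected += 1
        self.next_seq[replica_id] = expected
        return ready


backlog = Backlog()
early = backlog.receive("alice", 2, "insert 'b'")    # arrives first, must wait
caught_up = backlog.receive("alice", 1, "insert 'a'")
duplicate = backlog.receive("alice", 1, "insert 'a'")
```

With this in place the transport does not need to guarantee ordering, which matters for P2P setups where messages can take different routes.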

Diamond-types (https://github.com/josephg/diamond-types):
- Self-described as the "world's fastest CRDT"
- Claims a 5000x-80000x speedup, attributed to aggressive run-length encoding (RLE)
- Stores full history (temporal DAG + spatial state)
- More complex (OpLog, Branch, CausalGraph concepts)
- Great for: large documents, offline-first, audit trails
- WASM support for browser
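
The RLE point is easier to see with a toy example. Typing produces long runs of single-character inserts by the same user at consecutive positions, and those collapse into one record. The sketch below only illustrates that idea in Python; the tuple layout is an assumption and has nothing to do with diamond-types' actual encoding.

```python
def compress_inserts(ops):
    """Merge consecutive single-character inserts into runs.

    Each op is (agent, seq, pos, char). A run is (agent, first_seq, pos, text).
    An op extends the previous run when the same agent produced it next
    (its seq follows on) and it lands directly after the run's text.
    """
    runs = []
    for agent, seq, pos, char in ops:
        if runs:
            r_agent, r_seq, r_pos, r_text = runs[-1]
            if (agent == r_agent
                    and seq == r_seq + len(r_text)
                    and pos == r_pos + len(r_text)):
                runs[-1][3] = r_text + char
                continue
        runs.append([agent, seq, pos, char])
    return [tuple(r) for r in runs]


ops = [
    ("alice", 1, 0, "h"),
    ("alice", 2, 1, "e"),
    ("alice", 3, 2, "y"),
    ("bob", 1, 0, ">"),
    ("alice", 4, 4, "!"),
]
runs = compress_inserts(ops)
```

Five operations become three records here; for real editing traces, where people type whole words and lines at a time, the ratio is far better.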

For our use case: Cola wins.
- Simpler API, easier to integrate
- Real-time focus matches our needs
- We don't need full history storage
- Less cognitive overhead to work with

Diamond-types is overkill for initial prototyping. Could revisit for optimization later.

COMMUNICATION PROTOCOL OPTIONS:

1. Unix socket - Simple, local only. Good for same-machine testing.

2. WebSocket - Works remote. Browser-friendly if we ever want web UI. Good default.

3. stdio pipe - Simplest for CLI tools. Editor spawns daemon, communicates via stdin/stdout.

4. LSP protocol - Leverage existing infrastructure. Interesting but might be an awkward fit.

Recommendation: WebSocket as primary (works local and remote), Unix socket as fast local alternative.
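
One practical difference between the options: WebSocket delivers whole messages, while a Unix socket or stdio pipe is a raw byte stream, so the adapter and daemon need their own message framing there. A minimal sketch using a 4-byte length prefix and JSON payloads follows; the framing scheme and function names are assumptions, and `socket.socketpair()` stands in for a real adapter-to-daemon connection.

```python
import json
import socket
import struct


def send_frame(sock, message):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    payload = json.dumps(message).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock):
    """Read one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Two messages sent back to back still come out as two separate frames.
adapter, daemon = socket.socketpair()
send_frame(adapter, {"offset": 0, "length": 0, "text": "hi"})
send_frame(adapter, {"offset": 2, "length": 0, "text": "!"})
first = recv_frame(daemon)
second = recv_frame(daemon)
adapter.close()
daemon.close()
```

Keeping the payload format identical across transports would let the same adapter code talk over WebSocket or a Unix socket with only the framing layer swapped.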

REFERENCE IMPLEMENTATIONS:

repos/cola/
- src/replica.rs: Main API, 1200+ lines of docs
- src/insertion.rs, src/deletion.rs: Operation types
- examples/basic.rs: Simple Document wrapper pattern
- Key pattern: editor maintains buffer + Replica, calls inserted/deleted for local ops, integrate_* for remote ops

repos/instant.nvim/
- lua/instant.lua: Main logic, mixed nvim + algorithm
- lua/instant/websocket_*.lua: Transport layer (portable)
- Position ID generation (genPID): Tombstone vector clocks
- Shows exactly what adapters need to do

repos/upterm/
- host/host.go: Session lifecycle
- io/writer.go: MultiWriter for output broadcast
- Different paradigm but useful for understanding terminal collaboration UX

repos/diamond-types/
- Complex internals, good for understanding CRDT optimization
- INTERNALS.md, BINARY.md explain the RLE approach

NEXT STEPS TO PROTOTYPE:

Phase 1: Minimal daemon
- Rust binary using cola
- Single document support
- WebSocket server
- Two clients can connect, edits sync

Phase 2: Neovim adapter
- Lua plugin
- Connects to daemon via WebSocket
- Hooks nvim_buf_attach for changes
- Applies remote changes via nvim_buf_set_lines
- Test: two Neovim instances editing same file

Phase 3: Multi-document
- Session management
- File path mapping
- Join/leave notifications

Phase 4: Second editor
- Helix adapter (or Kakoune, or Vim)
- Prove the architecture works across editors

Phase 5: Polish
- Peer cursors
- User presence indicators
- Better auth (SSH keys, GitHub)
- Discovery service

OPEN QUESTIONS:

1. Where does the daemon run?
- Local daemon per machine? Central server? Hybrid?
- For local-first: daemon on each machine, P2P sync
- For easy setup: central server handles routing

2. How to handle file paths?
- Relative to project root? Absolute? UUID-based?
- Need consistent naming across different machines
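
A sketch of the "relative to project root" option: each machine strips its own root and the session uses the remaining path, with forward slashes, as the document key. The function name and the `windows` flag are made up for the example; only the standard-library pathlib classes are real.

```python
from pathlib import PurePosixPath, PureWindowsPath


def session_path(project_root, absolute_path, windows=False):
    """Map a machine-local absolute path to a root-relative, '/'-separated key."""
    flavour = PureWindowsPath if windows else PurePosixPath
    # relative_to raises ValueError if the file is outside the project root.
    relative = flavour(absolute_path).relative_to(flavour(project_root))
    return relative.as_posix()


a = session_path("/home/ana/proj", "/home/ana/proj/src/main.rs")
b = session_path(r"C:\Users\bo\proj", r"C:\Users\bo\proj\src\main.rs", windows=True)
```

Both peers arrive at the same key even though their checkouts live in different places. This does assume every participant has the same project layout below the root; UUID-based naming would relax that at the cost of a mapping table.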

3. Undo/redo coordination?
- Per-user undo (like Zed) or global?
- Cola doesn't handle this - need to build on top

4. Cursor/selection sync?
- Nice to have, not essential for MVP
- Adds complexity (need to track peer positions)
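
The tracking part is mostly bookkeeping: whenever an edit lands, every stored peer cursor has to be shifted. A minimal offset-based sketch, assuming edits use the same (offset, length, text) shape as the adapter protocol; the function name is hypothetical.

```python
def shift_cursor(cursor, offset, length, text):
    """Move a peer's cursor offset to account for an edit applied elsewhere.

    The edit replaces `length` characters starting at `offset` with `text`.
    """
    if cursor <= offset:
        return cursor                       # edit starts at or after the cursor
    if cursor >= offset + length:
        return cursor - length + len(text)  # edit is entirely before the cursor
    return offset + len(text)               # cursor was inside the replaced span


after_insert = shift_cursor(10, 2, 0, "abc")   # 3 chars inserted before cursor
after_delete = shift_cursor(10, 2, 3, "")      # 3 chars deleted before cursor
untouched = shift_cursor(10, 20, 0, "x")       # edit is further down the file
swallowed = shift_cursor(10, 8, 5, "Z")        # cursor sat inside replaced text
```

An alternative that avoids this arithmetic is to anchor cursors to CRDT character IDs rather than offsets, which is the role anchors play in Zed's design noted earlier.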

5. Permissions?
- Can anyone edit anything? Read-only viewers?
- Future concern, not MVP

THE DREAM:

You're in Helix. A friend is in Neovim. Another friend is in Kakoune. You all open the same project, connect to a session, and just... edit together. Changes flow seamlessly. Each person uses their preferred editor with their preferred config. No one has to install an editor they don't normally use.

That's the goal.