======================================================
zvm embedding audit: Z-machine interpreter feasibility
======================================================

zvm is a Python Z-machine interpreter (by Ben Collins-Sussman) being evaluated
for embedding in mudlib to run interactive fiction games over telnet. this audit
covers architecture, isolation requirements, and modification paths. findings
are compared against the viola audit for apples-to-apples decision-making.


1. global state: mostly clean, two leaks
==========================================

state is instance-based. ``ZMachine.__init__`` wires everything together::

    ZMemory(story)                  -- memory
    ZStringFactory(mem)             -- string decoding
    ZObjectParser(mem)              -- object tree
    ZStackManager(mem)              -- call/data stacks
    ZOpDecoder(mem, stack)          -- instruction decoder
    ZStreamManager(mem, ui)         -- I/O streams
    ZCpu(mem, opdecoder, stack, objects, string, streams, ui)  -- CPU

dependency graph flows one way: ZCpu depends on everything else, everything
else depends on ZMemory. no circular dependencies. clean layering.

two global leaks:

- zlogging.py: executes at import time. opens debug.log and disasm.log in cwd,
  sets root logger to DEBUG. all instances share the same loggers
- zcpu.py uses ``random.seed()`` / ``random.randint()``, i.e. global PRNG state.
  multiple instances would interfere with each other's randomness

both are fixable in ~20 lines: make the logging instance-scoped and replace the
module-level PRNG calls with a per-instance ``random.Random()``.
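
a minimal sketch of both fixes, assuming nothing about zvm's real constructor:
``InstanceScopedCpu``, its ``name``/``seed`` parameters, and the ``zvm.<name>``
logger naming are hypothetical stand-ins, not zvm code. each instance owns a
``random.Random`` and a named child logger, so nothing touches module-level
state::

    import logging
    import random


    class InstanceScopedCpu:
        """hypothetical stand-in for ZCpu: owns its PRNG and its logger."""

        def __init__(self, name, seed=None):
            # private generator: seeding it never touches the global PRNG
            self._rng = random.Random(seed)
            # named child logger; handlers and levels are left to the host
            self.log = logging.getLogger("zvm.%s" % name)

        def op_random(self, n):
            """rough shape of the random opcode for a positive range n."""
            return self._rng.randint(1, n)


    a = InstanceScopedCpu("game-a", seed=1)
    b = InstanceScopedCpu("game-b", seed=1)
    rolls_a = [a.op_random(6) for _ in range(5)]
    random.seed(999)  # unrelated host code reseeding the global PRNG
    rolls_b = [b.op_random(6) for _ in range(5)]

two instances seeded alike produce the same rolls even though the global PRNG
was reseeded in between, which is the isolation a multi-game process needs.
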

compared to viola's 13/18 modules with mutable globals, this is dramatically
better. running multiple ZMachine instances in one process is structurally
possible.


2. IO boundary: excellent, purpose-built for embedding
========================================================

this is zvm's strongest feature. the README explicitly states the design goal:
"no user interface. meant to be used as the backend in other programs."

IO is fully abstracted via a four-component ``ZUI`` object (zui.py)::

    ZUI(audio, screen, keyboard_input, filesystem)

each component is an abstract base class with NotImplementedError stubs:

- ZScreen (zscreen.py): write(), split_window(), select_window(),
  set_cursor_position(), erase_window(), set_text_style(), set_text_color()
- ZInputStream (zstream.py): read_line(), read_char() with full signatures
  for timed input, max length, terminating characters
- ZFilesystem (zfilesystem.py): save_game(), restore_game(),
  open_transcript_file_for_writing/reading()
- ZAudio (zaudio.py): play_bleep(), play_sound_effect()

trivialzui.py is the reference stdio implementation showing how to subclass.

for MUD embedding: implement ZScreen.write() to push to the telnet session,
ZInputStream.read_line() to receive from the telnet reader, and ZFilesystem to
store saves in SQLite. natural fit.
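
a rough sketch of that adapter. the ``ZScreen`` and ``ZInputStream`` classes
here are cut-down stand-ins so the example runs on its own (the real bases
declare more methods and longer signatures). ``SessionScreen``,
``SessionInput``, the ``send`` callable, and the queue hand-off are assumptions
about how a mudlib session might be wired, not existing code::

    import queue


    class ZScreen:  # stand-in for zvm's abstract base
        def write(self, string):
            raise NotImplementedError


    class ZInputStream:  # stand-in; the real read_line() takes more arguments
        def read_line(self, original_text=None, max_length=0):
            raise NotImplementedError


    class SessionScreen(ZScreen):
        """pushes interpreter output to a per-session send callable."""

        def __init__(self, send):
            self._send = send

        def write(self, string):
            # telnet expects CRLF line endings
            self._send(string.replace("\n", "\r\n"))


    class SessionInput(ZInputStream):
        """hands lines from the telnet reader to the interpreter."""

        def __init__(self):
            self.lines = queue.Queue()

        def read_line(self, original_text=None, max_length=0):
            line = self.lines.get()  # blocks until the session delivers a line
            return line[:max_length] if max_length else line


    sent = []
    screen = SessionScreen(sent.append)
    keyboard = SessionInput()
    screen.write("West of House\n>")
    keyboard.lines.put("open mailbox")
    command = keyboard.read_line(max_length=8)

the blocking ``get()`` suits a thread-per-game setup. an async host would need
a different hand-off at the same point.
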

compared to viola, where pygame is hardwired and you'd need to create a
TelnetInput class from scratch, zvm hands you the interface contract.


3. error paths: server-friendly
=================================

zero sys.exit() calls in the zvm/ package. the only sys.exit() is in the CLI
runner run_story.py, which is appropriate.

clean exception hierarchy:

- ZMachineError (zmachine.py)
- ZCpuError -> ZCpuIllegalInstruction, ZCpuDivideByZero, ZCpuNotImplemented
- ZMemoryError -> ZMemoryIllegalWrite, ZMemoryOutOfBounds, ZMemoryBadMemoryLayout
- ZObjectError -> ZObjectIllegalObjectNumber, ZObjectIllegalAttributeNumber
- ZStackError -> ZStackNoRoutine, ZStackNoSuchVariable, ZStackPopError
- QuetzalError -> QuetzalMalformedChunk, QuetzalMismatchedFile

the CPU run loop does not catch exceptions, so errors propagate to the caller.
correct behavior for embedding.

one weakness: some opcodes rely on bare ``assert`` statements, which are
stripped under ``python -O``, so those checks vanish in optimized mode.
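
one way to harden such an opcode, sketched with stand-in exception classes
named after the zvm ones. ``op_div`` and ``run_guarded`` are hypothetical and
not the zvm implementations. the point is only that an explicit ``raise``
survives ``python -O`` and gives the host a typed exception to catch::

    class ZCpuError(Exception):
        """stand-in for the zvm base class of the same name."""


    class ZCpuDivideByZero(ZCpuError):
        """stand-in for the zvm exception of the same name."""


    def op_div(a, b):
        # explicit check: unlike `assert b != 0`, this is kept under -O
        if b == 0:
            raise ZCpuDivideByZero("division by zero")
        # Z-machine division truncates toward zero; Python's // floors
        q = abs(a) // abs(b)
        return q if (a < 0) == (b < 0) else -q


    def run_guarded(func, *args):
        """host-side wrapper: one game's error must not kill the server."""
        try:
            return ("ok", func(*args))
        except ZCpuError as err:
            return ("error", type(err).__name__)

the wrapper shows the embedding side of the contract: the server catches the
base class, reports to the player, and tears down only that game.
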

compared to viola's 8+ sys.exit() calls via error.fatal(), this is exactly
what you want for a server.


4. version coverage: V1-V5 declared, only V5 tested
========================================================

zmemory.py handles v1-v5 with version-switched code throughout. v6-v8 are
not supported (raises ZMemoryUnsupportedVersion).

the opcode table (zcpu.py) has version annotations like ``(op_call_2s, 4)``
meaning "available from v4 onward". systematic and well-done.
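
a sketch of how such a table can be consulted. the opcode numbers follow the
Z-machine standard (je is 2OP:1, call_2s is 2OP:25), but the ``lookup`` helper
and the convention that a bare handler means "all versions" are assumptions
for illustration, not zvm's actual dispatch code::

    def op_je():
        return "je"


    def op_call_2s():
        return "call_2s"


    # bare handler: every version. (handler, n): available from version n
    OPCODES = {0x01: op_je, 0x19: (op_call_2s, 4)}


    def lookup(table, number, version):
        """return the handler for this story version, or None if absent."""
        entry = table.get(number)
        if entry is None:
            return None
        if isinstance(entry, tuple):
            handler, min_version = entry
            return handler if version >= min_version else None
        return entry
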

the v1-v3 object model uses 1-byte object numbers (max 255 objects), v4-v5 uses
2-byte (max 65535). properly handled in zobjectparser.py.

the test suite only uses curses.z5, so the v1-v3 paths (and anything
v4-specific) exist structurally but are untested.

compared to viola's solid V1-V5 plus partial V6 and theoretical V7-V8,
zvm is narrower but the architecture is cleaner where it does exist.


5. input model: abstracted but unimplemented
===============================================

line input (read_line):
  interface complete: full signature with original_text, max_length,
  terminating_characters, timed_input_routine, timed_input_interval.
  the trivial implementation handles basic line editing, ignores timed input.

single char (read_char):
  interface complete: signature includes timed_input_routine and
  timed_input_interval. the CPU explicitly raises ZCpuNotImplemented if
  time/routine are nonzero.

timed input:
  a feature flag system exists (``features["has_timed_input"] = False``).
  completely unimplemented at CPU level.

**critical problem**: op_sread (v1-v3) has an empty body and silently does
nothing. op_sread_v4 and op_aread (v5) both raise ZCpuNotImplemented.
the interpreter cannot accept text input. it cannot reach the first prompt
of any game.

compared to viola, where input works and needs an adapter, zvm has the
better interface design but no working implementation behind it.


6. step execution mode: not available, easy refactor
======================================================

run loop in zcpu.py::

    def run(self):
        while True:
            (opcode_class, opcode_number, operands) = self._opdecoder.get_next_instruction()
            implemented, func = self._get_handler(opcode_class, opcode_number)
            if not implemented:
                break
            func(self, *operands)

tight while-True with no yielding. no step() method, no async support, no
callbacks between instructions, no instruction count limit.

the loop body is clean and self-contained though. extracting step() is a ~5
line change: pull the body into a method, call it from run().
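
a sketch of that refactor on a toy class. ``SteppableCpu``, its fake
instruction source, and the ``max_steps`` budget are hypothetical; in zvm the
body of ``step()`` would be the decoder call and handler dispatch quoted
above::

    class SteppableCpu:
        """toy model of the run loop; the instruction source is a fake."""

        def __init__(self, program):
            self._program = iter(program)
            self.executed = 0

        def _next_instruction(self):
            # stands in for get_next_instruction() plus handler lookup
            return next(self._program, None)

        def step(self):
            """execute one instruction; return False when the machine stops."""
            func = self._next_instruction()
            if func is None:
                return False
            func()
            self.executed += 1
            return True

        def run(self, max_steps=None):
            # unbounded by default, like the original loop; optional budget
            while max_steps is None or max_steps > 0:
                if not self.step():
                    break
                if max_steps is not None:
                    max_steps -= 1


    trace = []
    cpu = SteppableCpu([lambda: trace.append("a"),
                        lambda: trace.append("b"),
                        lambda: trace.append("c")])
    cpu.run(max_steps=2)  # a budgeted slice, e.g. one scheduler tick

a later ``cpu.run()`` with no budget picks up where the slice stopped, so the
instruction count limit comes almost for free once ``step()`` exists.
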

for async MUD embedding, the options are:

1. extract step() and call it from an async loop with awaits at IO points
2. run each ZMachine in a thread (viable since state is instance-based, once
   the two leaks from section 1 are fixed)
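
option 1 might look roughly like this. ``drive`` and its ``budget`` parameter
are hypothetical, and ``step`` is any callable that returns False when the
machine stops. this sketch only yields between instruction batches; real IO
points would additionally await the session's reader::

    import asyncio


    async def drive(step, budget=1000):
        """run step() until it returns False, yielding every `budget` calls."""
        executed = 0
        while step():
            executed += 1
            if executed % budget == 0:
                await asyncio.sleep(0)  # let other sessions run
        return executed


    remaining = iter(range(2500))  # fake machine: 2500 instructions, then stop
    total = asyncio.run(drive(lambda: next(remaining, None) is not None))
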

compared to viola's similar tight loop, the refactor here is actually easier
because the body is simpler (no interrupt handling, no recursive execloop).


7. memory/cleanup: clean, one minor handle leak
================================================

memory is a bytearray, fixed size, bounded by the story file.
ZMachine.__init__ creates two copies (pristine + working).

ZStackManager uses a Python list as call stack. frames are popped in
finish_routine(). bounded by Z-machine stack design.

ZStringFactory, ZCharTranslator, ZLexer all pre-load from the story file at
init. bounded by file content.

ZObjectParser does not cache anything: every access reads directly from
ZMemory.

one minor leak: quetzal.py QuetzalParser stores self._file and closes it
manually but does not use a context manager. an exception during parsing would
leak the file handle.
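
the usual fix is a ``with`` block around the parse. a small sketch, where
``parse_save``, ``opener``, and ``broken_parse`` are hypothetical names rather
than anything in quetzal.py::

    import io


    def parse_save(opener, parse):
        """open a save file, parse it, and always close the handle."""
        # `with` closes fh on normal return and when parse() raises
        with opener() as fh:
            return parse(fh)


    def broken_parse(fh):
        fh.read(4)
        raise ValueError("malformed chunk")


    handle = io.BytesIO(b"FORM....")
    try:
        parse_save(lambda: handle, broken_parse)
    except ValueError:
        outcome = "error propagated"

the exception still reaches the caller, but the handle is closed by the time
it does.
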

compared to viola's 5+ unbounded growth patterns (undo stack, command history,
routine cache, word cache, object caches), zvm is dramatically cleaner.


8. object tree API: complete read, mostly complete write
==========================================================

zobjectparser.py public API:

read:

- get_attribute(objectnum, attrnum): single attribute (0/1)
- get_all_attributes(objectnum): list of all set attribute numbers
- get_parent(objectnum): parent object number
- get_child(objectnum): first child
- get_sibling(objectnum): next sibling
- get_shortname(objectnum): object's short name as string
- get_prop(objectnum, propnum): property value
- get_prop_addr_len(objectnum, propnum): property address and length
- get_all_properties(objectnum): dict of all properties
- describe_object(objectnum): debug pretty-printer

write:

- set_parent(objectnum, new_parent_num)
- set_child(objectnum, new_child_num)
- set_sibling(objectnum, new_sibling_num)
- insert_object(parent_object, new_child): handles unlinking
- set_property(objectnum, propnum, value)

missing: set_attribute() and clear_attribute(). the parser can read attributes
but cannot set or clear them. both would need to be added.
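
a sketch of the two missing writers over a plain bytearray. the bit layout
(attribute 0 is the top bit of the first attribute byte; 4 bytes in v1-v3,
6 in v4+) follows the Z-machine standard. the function signatures,
``attr_base``, and the use of ValueError are assumptions: the real methods
would take an object number, locate its attribute bytes via the parser, and
raise ZObjectIllegalAttributeNumber::

    def _attr_location(attr_base, attrnum, attr_bytes):
        if not 0 <= attrnum < attr_bytes * 8:
            raise ValueError("illegal attribute number %d" % attrnum)
        # attribute 0 is the most significant bit of the first byte
        return attr_base + attrnum // 8, 0x80 >> (attrnum % 8)


    def set_attribute(memory, attr_base, attrnum, attr_bytes=4):
        addr, mask = _attr_location(attr_base, attrnum, attr_bytes)
        memory[addr] |= mask


    def clear_attribute(memory, attr_base, attrnum, attr_bytes=4):
        addr, mask = _attr_location(attr_base, attrnum, attr_bytes)
        memory[addr] &= ~mask & 0xFF


    mem = bytearray(4)  # one v3 object's attribute bytes, all clear
    set_attribute(mem, 0, 0)
    set_attribute(mem, 0, 9)
    clear_attribute(mem, 0, 0)
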

bug in insert_object(): the sibling walk loop at line 273 never advances current
or prev, so it would loop forever. needs a fix.
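
what a terminating walk looks like, sketched over a dict-based tree instead of
ZMemory. the ``tree`` structure and both function bodies are illustrative
assumptions, not the zvm code; 0 means "no object", as in the Z-machine::

    def unlink(tree, obj):
        """detach obj from its parent's child chain."""
        parent = tree[obj]["parent"]
        if parent == 0:
            return
        if tree[parent]["child"] == obj:
            tree[parent]["child"] = tree[obj]["sibling"]
        else:
            prev = tree[parent]["child"]
            while prev != 0 and tree[prev]["sibling"] != obj:
                prev = tree[prev]["sibling"]  # the advance the loop needs
            if prev == 0:
                raise ValueError("object tree is inconsistent")
            tree[prev]["sibling"] = tree[obj]["sibling"]
        tree[obj]["parent"] = 0
        tree[obj]["sibling"] = 0


    def insert_object(tree, parent, obj):
        """move obj so that it becomes the first child of parent."""
        unlink(tree, obj)
        tree[obj]["sibling"] = tree[parent]["child"]
        tree[obj]["parent"] = parent
        tree[parent]["child"] = obj


    def node(parent=0, child=0, sibling=0):
        return {"parent": parent, "child": child, "sibling": sibling}


    # object 1 holds 2 -> 3 -> 4; object 5 is empty
    tree = {1: node(child=2), 2: node(1, sibling=3), 3: node(1, sibling=4),
            4: node(1), 5: node()}
    insert_object(tree, 5, 3)
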

**critical for MUD embedding**: many CPU opcodes that USE the parser are
unimplemented at the CPU level:

- op_test_attr, op_set_attr, op_clear_attr: not wired up
- op_get_sibling: not wired up (parser method exists)
- op_jin (test parent): not wired up
- op_remove_obj, op_print_obj: not wired up
- op_get_prop_addr, op_get_next_prop, op_get_prop_len: not wired up

the parser infrastructure is there and, aside from the insert_object() bug and
the missing attribute setters noted above, correct. the CPU just doesn't call
it.

compared to viola's zcode/objects.py, which has working accessors wired through
all opcodes, zvm has a better parser design but the plumbing is incomplete.


9. save/restore: parse works, write is stubbed
=================================================

quetzal.py QuetzalParser can parse Quetzal save files:

- IFhd chunks (metadata): working
- CMem chunks (compressed memory): working
- UMem chunks (uncompressed memory): has a bug (wrong attribute name)
- Stks chunks (stack frames): working

QuetzalWriter is almost entirely stubbed. all three data methods return "0".
the file writing logic exists but writes nonsense.

all save/restore CPU opcodes (op_save, op_restore, op_save_v4, op_restore_v4,
op_save_v5, op_restore_v5, op_save_undo, op_restore_undo) raise
ZCpuNotImplemented.

alternative for MUD: since ZMemory is a bytearray, we could snapshot/restore
raw memory + stack state directly without the Quetzal format.
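
a sketch of that alternative. ``MachineState``, ``snapshot``, and ``restore``
are hypothetical, and the stack is modelled as a list of dicts. a real
snapshot would also have to capture the program counter and whatever else the
decoder holds, which this example leaves out::

    import copy


    class MachineState:
        """hypothetical holder for the mutable state named above."""

        def __init__(self, memory, stack):
            self.memory = bytearray(memory)
            self.stack = list(stack)


    def snapshot(state):
        # bytes() is an immutable copy; deepcopy also copies nested frames
        return bytes(state.memory), copy.deepcopy(state.stack)


    def restore(state, snap):
        memory, stack = snap
        # slice-assign so components holding these objects see the change
        state.memory[:] = memory
        state.stack[:] = copy.deepcopy(stack)


    state = MachineState(b"\x00\x01\x02", [{"locals": [1, 2]}])
    saved = snapshot(state)
    state.memory[0] = 0xFF
    state.stack[0]["locals"].append(3)
    state.stack.append({"locals": []})
    restore(state, saved)

the snapshot tuple can be pickled or stored as a blob, which sidesteps the
stubbed QuetzalWriter at the cost of saves that only this host can read.
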

compared to viola's working quetzal implementation, zvm's save system needs
significant work.


10. test suite: minimal
=========================

5 registered test modules:

- bitfield_tests.py: 6 tests, BitField bit manipulation
- zscii_tests.py: 4 tests, string encoding/decoding
- lexer_tests.py: 3 tests, dictionary parsing
- quetzal_tests.py: 2 tests, save file parsing
- glk_tests.py: 7 tests, requires compiled CheapGlk .so

not tested at all: ZCpu (no opcode tests), ZMemory, ZObjectParser,
ZStackManager, ZOpDecoder, ZStreamManager.

all tests that need a story file use stories/curses.z5 with hardcoded paths.

compared to viola, which has no tests at all, zvm has some but they don't
cover the critical subsystems.


11. dependencies: zero
========================

pure stdlib. setup.py declares no dependencies. python >= 3.6.

uses: logging, random, time, itertools, re, chunk, os, sys.
ctypes only for optional Glk native integration.

one caveat: ``chunk`` was deprecated in python 3.11 and removed in 3.13
(PEP 594), so whatever imports it needs a small replacement on current
interpreters.

both viola and zvm are pure stdlib. no advantage either way.


12. completeness: the dealbreaker
====================================

of ~108 Z-machine opcodes, zvm implements approximately 46. the remaining ~62
are stubbed with ZCpuNotImplemented or have empty bodies.

unimplemented opcodes include fundamentals:

- ALL input opcodes (op_sread, op_aread): cannot accept player input
- ALL save/restore opcodes: cannot save or load games
- critical object opcodes (test_attr, set_attr, clear_attr, get_sibling,
  remove_obj, print_obj at CPU level)
- many branch/comparison ops
- string printing variants

the interpreter cannot execute any real interactive fiction game to
completion. it cannot reach the first prompt of Zork.


verdict: comparison with viola
================================

+---------------------+--------------+----------------+
| criterion           | zvm          | viola          |
+=====================+==============+================+
| IO abstraction      | excellent    | needs adapter  |
+---------------------+--------------+----------------+
| global state        | mostly clean | deeply tangled |
+---------------------+--------------+----------------+
| multi-instance      | structurally | process-only   |
+---------------------+--------------+----------------+
| error handling      | exceptions   | sys.exit()     |
+---------------------+--------------+----------------+
| memory leaks        | none         | 5+ patterns    |
+---------------------+--------------+----------------+
| object tree parser  | minor gaps   | complete       |
+---------------------+--------------+----------------+
| object tree opcodes | ~half wired  | all wired      |
+---------------------+--------------+----------------+
| opcode coverage     | ~46/108      | all V1-V5      |
+---------------------+--------------+----------------+
| can run a game      | NO           | YES            |
+---------------------+--------------+----------------+
| input handling      | abstracted   | working        |
+---------------------+--------------+----------------+
| save/restore        | parse only   | working        |
+---------------------+--------------+----------------+
| dependencies        | zero         | zero           |
+---------------------+--------------+----------------+
| tests               | minimal      | none           |
+---------------------+--------------+----------------+
| maintenance         | abandoned    | abandoned      |
+---------------------+--------------+----------------+

zvm has the architecture you'd want. viola has the implementation you'd need.

zvm is a well-designed skeleton: clean IO abstraction, instance-based state,
proper exceptions, no memory leaks. but it's roughly half-built. finishing the
~62 missing opcodes is weeks of work, comparable to writing a new interpreter,
except you're also debugging someone else's partial implementation.

viola is a working interpreter with terrible architecture for embedding:
global state everywhere, sys.exit() in error paths, pygame hardwired. but it
can run Zork right now. the refactoring targets are known and bounded.


pragmatic path
===============

the "moldable world" vision (levels 3-5 from the design discussion) requires
being inside the interpreter with access to the object tree. both interpreters
have the parser infrastructure for this.

option A: fix viola's embedding problems

1. patch error.fatal() to raise (small)
2. swap pygame IO for telnet adapter (medium)
3. add cleanup hooks for caches (small)
4. subprocess isolation handles global state (free)

total: working IF in a MUD, with known limitations on multi-instance

option B: finish zvm's implementation

1. implement ~62 missing opcodes (large)
2. fix insert_object bug (small)
3. add set_attribute/clear_attribute to parser (small)
4. complete save writer (medium)

total: clean embeddable interpreter, but weeks of opcode work first

option C: write our own interpreter

designed for embedding from day one. state is a first-class object. object
tree is an API. multiple games in one process. but it's the longest path and
testing against real games is the hard part.

option D: hybrid

use zvm's architecture (ZUI interface, exception model, instance-based state)
as the skeleton. port viola's working opcode implementations into it. gets
the clean design with the working code. medium effort, high reward.

the hybrid path is probably the most interesting. zvm got the hard design
decisions right. viola got the hard implementation work done. merging the two
is probably less work than either finishing zvm or refactoring viola.