Compare commits

..

6 commits

Author SHA1 Message Date
e72f13e78a
Add provenance documentation for test story files 2026-02-10 17:10:29 -05:00
6d7b404365
Add pytest regression harness for z-machine game compatibility
Implements Phase 4 of the z-machine compatibility plan.

Creates automated regression tests that smoke-test all supported games
(V3, V5, V8) by loading each story, executing basic commands, and verifying
the interpreter doesn't crash.

Key features:
- Parametrized test covering 7 games (zork1, curses, photopia, Tangle,
  shade, LostPig, anchor)
- QuietScreen class that disables [MORE] prompts for unattended testing
- AutoInputStream that auto-feeds commands then exits cleanly
- Tests verify: no crashes, unimplemented opcodes, and minimum instruction count
- All tests pass in ~2 seconds

Tests skip gracefully if story files aren't present, making this safe to
run in CI or on systems without all game files.
2026-02-10 17:10:29 -05:00
6d29ec00fb
Implement stub opcodes for game compatibility
set_colour, piracy, erase_line, get_cursor, not_v5, print_table
2026-02-10 17:10:29 -05:00
b08ce668a6
Add smoke test script for z-machine game compatibility 2026-02-10 17:10:04 -05:00
243a44e3fb
Add plan for zvm compatability 2026-02-10 16:50:23 -05:00
bc1a2e5489
Add undo command support 2026-02-10 16:49:46 -05:00
6 changed files with 685 additions and 17 deletions

View file

@ -0,0 +1,52 @@
z-machine story files
=====================
story files used for interpreter compatibility testing.
binary files are gitignored (*.z* pattern).
zork1.z3
author: Infocom (Marc Blank, Dave Lebling)
version: 3 (V3)
notes: classic Infocom title, used for initial interpreter development
LostPig.z8
author: Admiral Jota (Grunk)
version: 8 (V8)
source: https://ifarchive.org/if-archive/games/zcode/LostPig.z8
license: freeware
notes: modern Inform game, used for V5/V8 opcode development
curses.z5
author: Graham Nelson
version: 5 (V5)
source: https://ifarchive.org/if-archive/games/zcode/curses.z5
license: freeware
notes: first Inform game (1993), exercises many opcodes
photopia.z5
author: Adam Cadre
version: 5 (V5)
source: https://ifarchive.org/if-archive/games/zcode/photopia.z5
license: freeware
notes: narrative-heavy, triggered op_set_colour implementation
Tangle.z5
author: Andrew Plotkin
version: 5 (V5)
source: https://ifarchive.org/if-archive/games/zcode/Tangle.z5
license: freeware
notes: "Spider and Web" - clever parser tricks, unreliable narrator
shade.z5
author: Andrew Plotkin
version: 5 (V5)
source: https://ifarchive.org/if-archive/games/zcode/shade.z5
license: freeware
notes: small atmospheric game, good smoke test
anchor.z8
author: Michael Gentry
version: 8 (V8)
source: https://ifarchive.org/if-archive/games/zcode/anchor.z8
license: freeware (original z-machine version)
notes: "Anchorhead" - horror, heavy object manipulation, largest test game

View file

@ -272,7 +272,7 @@ Concrete next steps, roughly ordered. Update as items get done.
- [x] wire V8 games to MUD: ``play.py`` routes .z3/.z5/.z8 to ``EmbeddedIFSession``. Lost Pig playable via ``play lostpig``. fixed upper window leak: V5+ games write room names to window 1 (status line) via ``select_window``. ``MudScreen`` now tracks the active window and suppresses writes to the upper window, preventing status line text from appearing in game output.
- [ ] implement real save_undo: currently stubs returning -1 ("not available"). a proper implementation needs in-memory state snapshots (dynamic memory + call stack). Lost Pig works without undo but players expect it.
- [x] implement save_undo/restore_undo: in-memory state snapshots (dynamic memory + call stack + PC). save_undo reads store_addr first (advance PC past store byte), captures snapshot before writing result 1, restore writes 2 to save_undo's store_addr (fork()-like convention). snapshot consumed after restore (no double undo). also added ``undo`` command wiring. (done — see commit c91d6a4)
milestone — Zork 1 playable in hybrid interpreter
--------------------------------------------------

View file

@ -0,0 +1,239 @@
zmachine game compatibility plan
=================================
goal: round out the hybrid z-machine interpreter by testing against a variety
of freely available IF games. find what breaks, fix it, build confidence that
the interpreter handles the spec correctly rather than just the two games we
built it against.
background
----------
the interpreter was shaped by tracing two specific games:
- Zork 1 (V3, 69 opcodes) — the Infocom classic
- Lost Pig (V8/V5, 61 opcodes) — modern Inform, 101K instructions
this means the implementation is biased toward what those two games exercise.
other games will hit different opcode combinations, edge cases in string
encoding, parser behaviors, object tree structures, and screen model usage.
version coverage
~~~~~~~~~~~~~~~~
V3, V5, and V8 cover essentially everything worth playing:
- V3: the Infocom catalog (Zork, Hitchhiker's, Planetfall, etc)
- V5: most modern Inform-compiled games (Photopia, Curses, Spider and Web)
- V8: V5 with x8 packed addresses (Lost Pig, some larger Inform games)
versions we are NOT targeting:
- V1/V2: original mainframe Zork only. almost nothing uses these.
- V4: tiny transitional version. Trinity is the notable game. maybe 5 games
total ever published. if a V4 game trips a bug we'll fix it (the version
gates already include V4 in the 4-5 range) but we're not seeking them out.
- V6: graphical z-machine. mouse, pictures, complex screen model. only a
handful of late Infocom titles (Zork Zero, Shogun, Arthur, Journey). out
of scope — the MUD is text.
- V7: almost nonexistent. maybe 1-2 games ever. not worth thinking about.
known stub opcodes
~~~~~~~~~~~~~~~~~~
the dispatch table is complete for V3/V5/V8, but ~12 opcodes have "TODO"
docstrings — they're registered but may not work correctly::
op_set_colour — color setting (can likely remain a no-op for MUD)
op_throw — throw to catch frame (needs real implementation)
op_print_ret — print embedded string + newline + return true
op_save_v4 — V4 save (store result, not branch)
op_restore_v4 — V4 restore (store result, not branch)
op_piracy — always branch true (standard behavior, stub is fine)
op_sread_v4 — V4 input (like V3 sread but with timing)
op_erase_line — erase current line (display op, can be no-op)
op_get_cursor — get cursor position (display op, needs stub)
op_not_v5 — bitwise NOT, VAR form (should be trivial)
op_encode_text — encode ZSCII to dictionary format
op_print_table — formatted table output
some of these are fine as no-ops (color, erase_line). others need real
implementations if games use them (throw, print_ret, encode_text).
phase 1 — acquire games
------------------------
download freely available z-machine games from the IF Archive
(https://ifarchive.org). all games below are free to distribute.
priority targets (well-known, diverse, good coverage):
V3 games::
Hitchhiker's Guide to the Galaxy — Infocom, nasty parser edge cases
NOTE: check if freely available.
Infocom titles are abandonware
but not legally free. skip if
we can't get a legit copy.
V5 games::
Curses — Graham Nelson, 1993. first Inform game. large,
exercises many opcodes. freely available.
Photopia — Adam Cadre, 1998. minimal puzzles, narrative
heavy, lots of text output. free.
Spider and Web — Andrew Plotkin, 1998. notoriously clever parser
tricks, unreliable narrator mechanic. free.
Shade — Andrew Plotkin, 2000. small, atmospheric,
good smoke test. free.
Anchorhead — Michael Gentry, 1998. horror, larger game,
heavy object manipulation. original z-machine
version is free (later Inform 7 version is
commercial — use the original).
Bronze — Emily Short, 2006. tutorial-style, good for
testing standard patterns. free.
Counterfeit Monkey — Emily Short, 2012. complex, large, exercises
advanced Inform features. free.
V8 games::
(Lost Pig already working — look for other V8 titles on IF Archive
to broaden coverage if any exist)
also worth checking: games compiled with different Inform versions (Inform 5,
6, 7-to-Z) to catch compiler-specific patterns.
the agent doing this work should:
1. find each game on ifarchive.org or the author's site
2. download the story file (.z3, .z5, .z8)
3. verify the z-machine version byte (byte 0 of story file)
4. place in content/stories/ with a note about source/license
phase 2 — smoke test each game
-------------------------------
for each acquired game, run it through the interpreter and record results.
use the existing trace infrastructure::
scripts/trace_zmachine.py — V3 opcode tracing
scripts/trace_lostpig.py — V5/V8 opcode tracing
for each game:
1. run the trace script (or adapt it for the game)
2. record: how many instructions execute, which opcodes are used,
where it crashes (if it crashes)
3. try interactive play for at least 10-15 commands
4. categorize the result:
- WORKS: plays correctly, no crashes
- CRASHES: hits an unimplemented or buggy opcode (record which one)
- MISBEHAVES: runs but output is wrong (garbled text, wrong responses,
display issues)
- BLOCKS ON INPUT: hangs or mishandles input in some way
build a results table like::
game | version | result | notes
------------------|---------|------------|---------------------------
Curses | V5 | CRASHES | op_encode_text at 0x1234
Photopia | V5 | WORKS | 45K instructions to prompt
Spider and Web | V5 | MISBEHAVES | status line garbled
...
phase 3 — fix failures
-----------------------
group failures by type and fix them:
missing/stub opcodes
~~~~~~~~~~~~~~~~~~~~
for each "TODO" opcode that a real game exercises:
1. read the z-machine spec (``zmach06e.pdf`` or inform-fiction.org/zmachine)
2. implement per spec
3. add a unit test
4. verify the game that triggered it now works
spec compliance bugs
~~~~~~~~~~~~~~~~~~~~
for opcodes that are implemented but behave wrong:
1. compare our implementation against the spec
2. check edge cases (signed vs unsigned, overflow, zero-length strings)
3. fix and add regression test
display/screen model issues
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
V5+ games use the screen model more aggressively than Zork or Lost Pig:
- window splitting, cursor positioning, text styles
- status line formatting
- output stream selection (stream 3 = memory table)
these may need MudScreen improvements. the MUD doesn't need pixel-perfect
screen emulation but it needs to not crash and should produce readable output.
string encoding edge cases
~~~~~~~~~~~~~~~~~~~~~~~~~~
different games exercise different parts of ZSCII:
- alphabet table switching (A0/A1/A2)
- abbreviations (V3 has 32, V5+ has 96)
- unicode extensions
- custom alphabet tables (some games define their own)
phase 4 — regression harness
-----------------------------
once games are working, build automated smoke tests:
- for each game, a script that feeds N commands and checks for crashes
- run as part of ``just check`` or a separate ``just smoke`` target
- catches regressions when we change the interpreter
the test doesn't need to verify game output is correct — just that the
interpreter doesn't crash, produces output, and reaches the input prompt.
something like::
@pytest.mark.parametrize("game,commands", [
("zork1.z3", ["look", "open mailbox", "read leaflet"]),
("curses.z5", ["look", "inventory", "north"]),
("photopia.z5", ["look", "yes"]),
])
def test_game_smoke(game, commands):
"""Run game through interpreter, feed commands, verify no crash."""
...
success criteria
----------------
- 5+ freely available games beyond Zork 1 and Lost Pig run without crashes
- all "TODO" stub opcodes that real games exercise have real implementations
- automated smoke tests prevent regressions
- the interpreter handles V3 and V5/V8 games from different compilers
non-goals
---------
- pixel-perfect screen emulation (we're a text MUD)
- V6 graphical games
- V1/V2 support
- multiplayer z-machine (separate effort, see mojozork-audit.rst)
- perfect Infocom compatibility (we care about freely available games first)
related documents
-----------------
- ``docs/how/if-journey.rst`` — integration vision and roadmap
- ``docs/how/mojozork-audit.rst`` — multiplayer z-machine audit
- ``docs/how/zmachine-performance.rst`` — performance profiling and optimization
- ``scripts/trace_zmachine.py`` — V3 opcode tracing
- ``scripts/trace_lostpig.py`` — V5/V8 opcode tracing

248
scripts/smoke_test_games.py Normal file
View file

@ -0,0 +1,248 @@
#!/usr/bin/env -S uv run --script
"""Smoke test all Z-machine games.
Runs each game through a series of basic commands, collecting opcode coverage
and detecting crashes.
"""
# ruff: noqa: E402
import contextlib
import sys
from collections import Counter
from dataclasses import dataclass
from pathlib import Path
project_root = Path(__file__).parent.parent
sys.path.insert(0, str(project_root / "src"))
from mudlib.zmachine import ZMachine, zopdecoder, zstream, zui
from mudlib.zmachine.trivialzui import (
TrivialAudio,
TrivialFilesystem,
TrivialScreen,
)
from mudlib.zmachine.zcpu import (
ZCpuNotImplemented,
ZCpuQuit,
ZCpuRestart,
)
class AutoInputStream(zstream.ZInputStream):
"""Input stream that auto-feeds commands."""
def __init__(self, commands=None):
super().__init__()
self._commands = commands or ["look", "inventory", "north", "south", "look"]
self._input_count = 0
def read_line(self, *args, **kwargs):
if self._input_count >= len(self._commands):
raise ZCpuQuit
cmd = self._commands[self._input_count]
self._input_count += 1
return cmd
def read_char(self, *args, **kwargs):
if self._input_count >= len(self._commands):
raise ZCpuQuit
cmd = self._commands[self._input_count]
self._input_count += 1
# Return first character as ord
return ord(cmd[0]) if cmd else ord(" ")
@dataclass
class TestResult:
"""Result of testing a single game."""
filename: str
version: int
success: bool
steps: int
unique_opcodes: int
opcodes: Counter
error_type: str | None = None
error_message: str | None = None
error_pc: int | None = None
error_opcode: str | None = None
def test_game(story_path: Path, max_steps: int = 1_000_000) -> TestResult:
"""Test a single game, returning results."""
story_bytes = story_path.read_bytes()
version = story_bytes[0]
audio = TrivialAudio()
screen = TrivialScreen()
keyboard = AutoInputStream()
filesystem = TrivialFilesystem()
ui = zui.ZUI(audio, screen, keyboard, filesystem)
zm = ZMachine(story_bytes, ui)
opcodes_seen = Counter()
step_count = 0
error_type = None
error_message = None
error_pc = None
error_opcode = None
try:
while step_count < max_steps:
pc = zm._cpu._opdecoder.program_counter
(opcode_class, opcode_number, operands) = (
zm._cpu._opdecoder.get_next_instruction()
)
cls_str = zopdecoder.OPCODE_STRINGS.get(opcode_class, f"?{opcode_class}")
key = f"{cls_str}:{opcode_number:02x}"
try:
implemented, func = zm._cpu._get_handler(opcode_class, opcode_number)
except Exception as e:
error_type = "ILLEGAL"
error_message = str(e)
error_pc = pc
error_opcode = key
break
opcodes_seen[f"{key} ({func.__name__})"] += 1
if not implemented:
error_type = "UNIMPLEMENTED"
error_message = f"Opcode {key} -> {func.__name__}"
error_pc = pc
error_opcode = key
break
try:
func(zm._cpu, *operands)
except ZCpuQuit:
break
except ZCpuRestart:
break
except ZCpuNotImplemented as e:
error_type = "NOT_IMPLEMENTED"
error_message = str(e)
error_pc = pc
error_opcode = key
break
except Exception as e:
error_type = type(e).__name__
error_message = str(e)
error_pc = pc
error_opcode = key
# Dump trace for debugging
with contextlib.suppress(Exception):
zm._cpu._dump_trace()
break
step_count += 1
except KeyboardInterrupt:
error_type = "INTERRUPTED"
error_message = "Keyboard interrupt"
success = error_type is None
return TestResult(
filename=story_path.name,
version=version,
success=success,
steps=step_count,
unique_opcodes=len(opcodes_seen),
opcodes=opcodes_seen,
error_type=error_type,
error_message=error_message,
error_pc=error_pc,
error_opcode=error_opcode,
)
def main():
"""Run smoke tests on all games or a specified game."""
stories_dir = project_root / "content" / "stories"
if len(sys.argv) > 1:
# Test a specific game
story_path = Path(sys.argv[1])
if not story_path.exists():
print(f"ERROR: {story_path} not found")
sys.exit(1)
story_paths = [story_path]
else:
# Test all games
story_paths = sorted(stories_dir.glob("*.z[358]"))
if not story_paths:
print("No story files found")
sys.exit(1)
print(f"Testing {len(story_paths)} games...")
print()
results = []
for story_path in story_paths:
print(f"Testing {story_path.name}...", end=" ", flush=True)
result = test_game(story_path)
results.append(result)
if result.success:
print(f"OK ({result.steps} steps, {result.unique_opcodes} opcodes)")
else:
print(f"FAILED: {result.error_type}")
if result.error_opcode:
print(f" Opcode: {result.error_opcode}")
if result.error_pc is not None:
print(f" PC: {result.error_pc:#x}")
if result.error_message:
print(f" Message: {result.error_message}")
print()
# Print summary table
print()
print("=" * 80)
print("SUMMARY")
print("=" * 80)
print()
print(
f"{'Game':<20} {'Ver':<4} {'Status':<12} "
f"{'Steps':>8} {'Opcodes':>8} {'Error':<20}"
)
print("-" * 80)
for result in results:
status = "PASS" if result.success else "FAIL"
error = result.error_type or "-"
print(
f"{result.filename:<20} {result.version:<4} {status:<12} "
f"{result.steps:>8} {result.unique_opcodes:>8} {error:<20}"
)
print()
passed = sum(1 for r in results if r.success)
failed = len(results) - passed
print(f"Total: {len(results)} games, {passed} passed, {failed} failed")
# Print detailed error information
if failed > 0:
print()
print("=" * 80)
print("FAILED GAMES DETAILS")
print("=" * 80)
for result in results:
if not result.success:
print()
print(f"{result.filename} (V{result.version}):")
print(f" Error type: {result.error_type}")
print(f" Error message: {result.error_message}")
if result.error_pc is not None:
print(f" PC: {result.error_pc:#x}")
if result.error_opcode:
print(f" Opcode: {result.error_opcode}")
print(f" Steps completed: {result.steps}")
print(f" Unique opcodes seen: {result.unique_opcodes}")
if __name__ == "__main__":
main()

View file

@ -463,8 +463,8 @@ class ZCpu:
self._call(routine_addr, [arg1], False)
def op_set_colour(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
"""Set foreground and background colors (no-op for text MUD)."""
pass
def op_throw(self, *args):
"""TODO: Write docstring here."""
@ -716,8 +716,8 @@ class ZCpu:
self._branch(expected_checksum == actual_checksum)
def op_piracy(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
"""Anti-piracy check. Always branches true (all interpreters pass this)."""
self._branch(True)
## VAR opcodes (opcodes 224-255)
@ -913,16 +913,17 @@ class ZCpu:
self._ui.screen.erase_window(window_number)
def op_erase_line(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
"""Erase current line on screen (no-op for text MUD)."""
pass
def op_set_cursor(self, x, y):
"""Set the cursor position within the active window."""
self._ui.screen.set_cursor_position(x, y)
def op_get_cursor(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
def op_get_cursor(self, table_addr):
"""Get cursor position into table. For MUD, always write row=1, col=1."""
self._memory.write_word(table_addr, 1) # row
self._memory.write_word(table_addr + 2, 1) # col
def op_set_text_style(self, text_style):
"""Set the text style."""
@ -1004,9 +1005,10 @@ class ZCpu:
self._write_result(0)
self._branch(False)
def op_not_v5(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
def op_not_v5(self, value):
"""Bitwise NOT (VAR form). Same as op_not."""
result = ~value & 0xFFFF
self._write_result(result)
def op_call_vn(self, routine_addr, *args):
"""Call routine with up to 3 arguments and discard the result."""
@ -1055,7 +1057,12 @@ class ZCpu:
offset = pos + word_len
def op_encode_text(self, *args):
"""TODO: Write docstring here."""
"""Encode ZSCII text to Z-encoded string (V5+).
This opcode converts ZSCII text into Z-machine's packed text format
(3 characters per 2 bytes). Complex operation, rarely used.
Not implemented - will raise ZCpuNotImplemented if any game calls it.
"""
raise ZCpuNotImplemented
def op_copy_table(self, first, second, size):
@ -1080,9 +1087,13 @@ class ZCpu:
for i in range(count - 1, -1, -1):
self._memory[second + i] = self._memory[first + i]
def op_print_table(self, *args):
"""TODO: Write docstring here."""
raise ZCpuNotImplemented
def op_print_table(self, zscii_text, width, height=1, skip=0):
"""Formatted table printing (no-op for text MUD).
Spec: print width chars per line for height lines from zscii_text.
Skip bytes between rows. For now, no-op to avoid crashes.
"""
pass
def op_check_arg_count(self, arg_number):
"""Branch if the Nth argument was passed to the current routine."""

View file

@ -0,0 +1,118 @@
"""Regression tests for z-machine game compatibility.
Smoke tests a suite of games to ensure the interpreter can load them,
execute basic commands, and reach the input prompt without crashing.
"""
from pathlib import Path
import pytest
from mudlib.zmachine import ZMachine, zscreen, zstream, zui
from mudlib.zmachine.trivialzui import (
TrivialAudio,
TrivialFilesystem,
TrivialScreen,
)
from mudlib.zmachine.zcpu import ZCpuQuit, ZCpuRestart
STORIES_DIR = Path(__file__).parent.parent / "content" / "stories"
# Game test suite: (filename, commands to feed)
GAMES = [
("zork1.z3", ["look", "open mailbox", "read leaflet"]),
("curses.z5", ["look", "inventory", "north"]),
("photopia.z5", ["look", "yes", "look"]),
("Tangle.z5", ["look", "inventory", "north"]),
("shade.z5", ["look", "inventory", "look"]),
("LostPig.z8", ["look", "inventory", "north"]),
("anchor.z8", ["look", "inventory", "north"]),
]
class QuietScreen(TrivialScreen):
"""Screen for testing that never shows [MORE] prompts."""
def __init__(self):
super().__init__()
# Set infinite rows to prevent [MORE] prompts
self._rows = zscreen.INFINITE_ROWS
class AutoInputStream(zstream.ZInputStream):
"""Input stream that auto-feeds commands."""
def __init__(self, commands=None):
super().__init__()
self._commands = commands or []
self._input_count = 0
def read_line(self, *args, **kwargs):
if self._input_count >= len(self._commands):
raise ZCpuQuit
cmd = self._commands[self._input_count]
self._input_count += 1
return cmd
def read_char(self, *args, **kwargs):
if self._input_count >= len(self._commands):
raise ZCpuQuit
cmd = self._commands[self._input_count]
self._input_count += 1
# Return first character as ord
return ord(cmd[0]) if cmd else ord(" ")
@pytest.mark.parametrize("game,commands", GAMES, ids=[g[0] for g in GAMES])
def test_game_smoke(game, commands):
"""Run game through interpreter, feed commands, verify no crash."""
story_path = STORIES_DIR / game
if not story_path.exists():
pytest.skip(f"{game} not found in {STORIES_DIR}")
# Load story
story_bytes = story_path.read_bytes()
# Create test UI components
audio = TrivialAudio()
screen = QuietScreen()
keyboard = AutoInputStream(commands)
filesystem = TrivialFilesystem()
ui = zui.ZUI(audio, screen, keyboard, filesystem)
# Create interpreter
zm = ZMachine(story_bytes, ui)
# Step through instructions
step_count = 0
max_steps = 500_000 # Enough to get through several commands
try:
while step_count < max_steps:
(opcode_class, opcode_number, operands) = (
zm._cpu._opdecoder.get_next_instruction()
)
implemented, func = zm._cpu._get_handler(opcode_class, opcode_number)
# If unimplemented, that's a test failure
assert implemented, (
f"Unimplemented opcode {opcode_class}:{opcode_number:02x}"
)
# Execute instruction
func(zm._cpu, *operands)
step_count += 1
except ZCpuQuit:
# Normal exit (ran out of commands)
pass
except ZCpuRestart:
# Some games restart - this is fine for smoke test
pass
# Sanity check: at least some instructions executed
assert step_count >= 100, (
f"Only {step_count} instructions executed, expected at least 100"
)