Add z-machine garbled output investigation doc and debug script

This commit is contained in:
Jared Miller 2026-02-09 22:51:52 -05:00
parent 127ca5f56e
commit f8f5ac7ad0
Signed by: shmup
GPG key ID: 22B5C6D66A38B06C
2 changed files with 427 additions and 0 deletions

View file

@ -0,0 +1,238 @@
Z-Machine Garbled Output Investigation
========================================
Context
-------
Branch: ``zmachine-zork1-integration``
The hybrid z-machine interpreter runs Zork 1 but produces garbled output:
- Room name "West of House" prints correctly
- Room description shows "er " instead of full text
- After "look" command, output starts correctly then becomes garbage with ? characters
- "There is a small mailbox here." prints correctly
Investigation
-------------
The z-string decoder (ZStringTranslator, ZCharTranslator, ZsciiTranslator in
zstring.py) is correct. The abbreviation table loads correctly. All 96
abbreviations decode to the right text.
The diagnostic script (``scripts/debug_zstrings.py``) revealed the problem:
- ``op_print_paddr paddr=0x0051 (byte=0x00a2)`` decodes to "er "
- Address 0x00a2 is in the abbreviation table area, not where the room
description should be
- The z-string at 0x00a2 really is "er " - the decoder is right, the address
is wrong
Root cause: bugs in ``zobjectparser.py`` property parsing.
Bugs Found
----------
All bugs are in ``zobjectparser.py``.
Bug 1: Off-by-one in shortname skip (ROOT CAUSE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``get_prop_addr_len()`` line 392::
addr += 2 * self._memory[addr]
This skips the shortname data but not the length byte. Compare with
``get_all_properties()`` which does it correctly::
shortname_length = self._memory[addr]
addr += 1 # skip length byte
addr += 2 * shortname_length # skip shortname data
The off-by-one means the property table scan starts 1 byte too early, reading
the last byte of the shortname z-string as if it were a property header. This
corrupts every subsequent property lookup.
Bug 2: BitField slice wrong width for V3 property number
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``get_prop_addr_len()`` line 399 and ``get_all_properties()`` line 446::
pnum = bf[4:0]
BitField slice ``[4:0]`` has start > stop, so it swaps to ``[0:4]`` = 4 bits.
But V3 property numbers are 5 bits (bits 0-4). Should be ``bf[0:5]``.
This means property numbers 16-31 are read as 0-15 (bit 4 is lost).
Bug 3: BitField slice wrong width for V3 property size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``get_prop_addr_len()`` line 400, ``get_all_properties()`` line 447, and
``get_property_length()`` line 555::
size = bf[7:5] + 1
BitField slice ``[7:5]`` swaps to ``[5:7]`` = 2 bits. But V3 property sizes are
3 bits (bits 5-7). Should be ``bf[5:8]``.
This means properties with size > 4 bytes are read with the wrong size (e.g.,
size 5 reads as size 1), causing the property iterator to skip too few bytes.
Bug 4: Same BitField slice issue in V5 path
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``get_prop_addr_len()`` lines 409, 413::
pnum = bf[5:0] # swaps to [0:5] = 5 bits, should be [0:6] = 6 bits
size = bf2[5:0] # swaps to [0:5] = 5 bits, should be [0:6] = 6 bits
Compare with ``get_all_properties()`` V5 path which correctly uses ``bf[0:6]``.
Also in ``get_property_length()`` line 563: same ``bf2[5:0]`` issue.
How to Fix
----------
All fixes are in ``zobjectparser.py``:
1. Line 392: change ``addr += 2 * self._memory[addr]`` to::
addr += 1 + 2 * self._memory[addr]
2. Lines 399, 446: change ``bf[4:0]`` to ``bf[0:5]``
3. Lines 400, 447, 555: change ``bf[7:5]`` to ``bf[5:8]``
4. Lines 409, 413, 563: change ``bf[5:0]`` to ``bf[0:6]``
The pattern: BitField uses Python slice semantics ``[start:stop)`` where the
number of bits extracted is ``stop - start``. The buggy code used ``[high:low]``
notation thinking it would extract bits high through low inclusive, but the swap
logic makes it extract fewer bits than intended.
This is the same class of bug as the write_global fix in commit e5329d6
(BitField slice extracting 7 bits not 8).
Diagnostic Tools
----------------
``scripts/debug_zstrings.py`` traces all z-string decoding with caller info.
Run with::
echo -e "look\nquit\ny" | python3 scripts/debug_zstrings.py 2>/tmp/trace.log
Trace goes to stderr, game output to stdout.
Session 2: Additional Bugs Found and Fixed
-------------------------------------------
Bug 5: Same off-by-one in get_next_property
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``get_next_property()`` had the same shortname skip bug as Bug 1::
addr += 2 * self._memory[addr]
Same fix: ``addr += 1 + 2 * self._memory[addr]``
Also had the same BitField slice bugs (Bugs 2, 4) for the V3/V5
property number return values.
Bug 6: get_property_length reads wrong byte for V5 two-byte headers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the V5 two-byte property header path, the code read the size from
the first header byte (which contains the property number) instead of
the second header byte (which contains the size)::
bf2 = BitField(self._memory[data_address - 2]) # WRONG: first byte
size = bf2[0:6]
The second byte was already read into ``bf`` at ``data_address - 1``.
Fix: use ``bf[0:6]`` directly instead of creating ``bf2``.
Bug 7: op_dec_chk and op_inc_chk use unsigned comparison
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In ``zcpu.py``, both ``op_dec_chk`` and ``op_inc_chk`` compared 16-bit
values as unsigned integers instead of signed. The Z-machine spec
requires signed comparison.
Effect: when a counter decremented from 0 to 65535 (wrapping), the check
``65535 < 0`` evaluated false (unsigned), causing infinite loops in Z-code
that counted down past zero. This made Zork 1's word-printing loop run
forever, dumping raw memory bytes to the screen as characters.
Fix: wrap both operands with ``_make_signed()`` before comparing.
What's Fixed Now
-----------------
After all fixes:
- Room descriptions display correctly on startup
- The infinite garbage output loop is fixed (dec_chk now works with signed)
- ``just check`` passes (516 tests)
What's Still Broken
--------------------
Commands like "look", "open mailbox", "quit" all produce::
You used the word "look" in a way that I don't understand.
The parser finds words in the dictionary (dict_addr is correct, e.g.
"look" at 0x44f2) and stores them in the parse buffer correctly. But the
game's Z-code parser can't match them to valid verb syntax.
Instruction tracing after ``op_sread`` shows the game's parser:
1. Reads parse buffer correctly (loadw(0x2551, 1) -> dict_addr 17650)
2. Checks dict_addr against specific word addresses (je comparisons) -
these are special-word checks, none match
3. Calls a function with the dict_addr that reads ``loadb(17650, 4)`` to
get the word's flags byte (0x41) from the dictionary entry
4. Tests individual flag bits (16, 64, 8, 32, 128, 4) against 0x41
5. Bit 6 (64) matches - the function extracts data from byte 5 (0xc2)
6. Despite finding the word type, the parser still rejects the command
The bug is likely in how the parser uses the extracted word-type data to
look up verb syntax. Possible areas:
- The ``and`` / ``je`` sequence that extracts a sub-field from the flags
byte may produce the wrong value
- The verb number extracted from byte 5 (0xc2 = 194) may need signed
interpretation or bit masking
- Another opcode (``op_jl``, ``op_jg``, ``op_sub``) may have a similar
unsigned-vs-signed bug
- The ``call`` opcode may have incorrect return value handling
Debugging approach for next session:
1. Extend the instruction trace past 100 steps to see what happens after
the word-type function returns
2. Compare the execution trace against dfrotz instruction trace (add
``-t`` flag to dfrotz for tracing)
3. Check ``op_jl`` and ``op_jg`` for the same unsigned comparison bug
4. Check ``op_sub`` and ``op_mul`` for signed arithmetic issues
5. Look at how the ``and`` result (step 21 in trace) flows through to
the verb syntax lookup
Diagnostic Tools
----------------
``scripts/debug_zstrings.py`` traces all z-string decoding with caller info.
Run with::
echo -e "look\nquit\ny" | python3 scripts/debug_zstrings.py 2>/tmp/trace.log
Trace goes to stderr, game output to stdout.
To add instruction tracing after sread, add to ``zcpu.py``::
# In step(), after decoding:
print(f"PC={pc:#06x} {handler_name}({', '.join(str(a) for a in args)})",
file=sys.stderr)

189
scripts/debug_zstrings.py Executable file
View file

@ -0,0 +1,189 @@
#!/usr/bin/env python3
"""Diagnostic script to trace z-string decoding in Zork 1.
This script loads Zork 1 and traces the z-string decoding pipeline:
1. Dumps all abbreviations from the abbreviation table
2. Monkey-patches ZStringFactory.get() to log every string decode
3. Runs the game with piped input to capture string decoding during gameplay
"""
import inspect
import sys
from pathlib import Path
def main():
# Add src to path so we can import mudlib
project_root = Path(__file__).parent.parent
sys.path.insert(0, str(project_root / "src"))
from mudlib.zmachine import ZMachine
from mudlib.zmachine.trivialzui import create_zui
from mudlib.zmachine.zstring import ZStringFactory
print("=" * 80)
print("Z-STRING DECODING DIAGNOSTIC FOR ZORK 1")
print("=" * 80)
print()
story_path = project_root / "content" / "stories" / "zork1.z3"
if not story_path.exists():
print(f"ERROR: Story file not found at {story_path}")
sys.exit(1)
with open(story_path, "rb") as f:
story_bytes = f.read()
print(f"Loaded {len(story_bytes)} bytes from {story_path.name}")
print()
# Create ZUI and ZMachine
ui = create_zui()
zmachine = ZMachine(story_bytes, ui, debugmode=False)
# Access the string factory
string_factory = zmachine._stringfactory
# PART 1: Dump all abbreviations
print("=" * 80)
print("ABBREVIATION TABLE DUMP")
print("=" * 80)
print()
abbrevs = string_factory.zchr._abbrevs
if not abbrevs:
print("No abbreviations found (version 1 game or failed to load)")
else:
print(f"Total abbreviations: {len(abbrevs)}")
print()
for table in range(3):
print(f"--- Table {table} ---")
for index in range(32):
key = (table, index)
if key in abbrevs:
zscii_codes = abbrevs[key]
text = string_factory.zscii.get(zscii_codes)
# Show ZSCII codes (truncate if too long)
if len(zscii_codes) > 20:
zscii_display = str(zscii_codes[:20]) + "..."
else:
zscii_display = str(zscii_codes)
# Show text (truncate if too long)
if len(text) > 60:
text_display = repr(text[:60]) + "..."
else:
text_display = repr(text)
print(f" [{table},{index:2d}] ZSCII={zscii_display}")
print(f" Text={text_display}")
print()
# PART 2: Monkey-patch ZStringFactory.get() to trace all string decodes
print("=" * 80)
print("STRING DECODING TRACE")
print("=" * 80)
print()
print("Format: [opcode args] -> addr")
print(" zchars -> zscii -> text")
print()
def traced_get(self, addr):
# Get caller information using inspect
stack = inspect.stack()
caller_name = "unknown"
caller_info = ""
# Walk up the stack to find the opcode function
for frame_info in stack[1:]: # Skip our own frame
func_name = frame_info.function
if func_name.startswith("op_"):
caller_name = func_name
# Try to extract arguments from the frame
frame_locals = frame_info.frame.f_locals
# For op_print_addr, show the addr argument
if func_name == "op_print_addr" and "addr" in frame_locals:
caller_info = f" addr={frame_locals['addr']:#06x}"
# For op_print_paddr, show packed address and conversion
elif func_name == "op_print_paddr" and "string_paddr" in frame_locals:
paddr = frame_locals["string_paddr"]
# Show both the packed address and what it converts to
# For z3, byte_addr = paddr * 2
byte_addr = paddr * 2
caller_info = f" paddr={paddr:#06x} (byte={byte_addr:#06x})"
# For op_print_obj, show the object number
elif func_name == "op_print_obj" and "obj" in frame_locals:
caller_info = f" obj={frame_locals['obj']}"
# For op_print, note it's inline (z-string follows opcode)
elif func_name == "op_print":
caller_info = " (inline z-string)"
break
# Get z-chars from ZStringTranslator
zchars = self.zstr.get(addr)
# Convert z-chars to ZSCII codes
zscii_codes = self.zchr.get(zchars)
# Convert ZSCII to Unicode text
text = self.zscii.get(zscii_codes)
# Log to stderr so it doesn't interfere with game output
# Truncate long lists/strings for readability
if len(zchars) > 20:
zchars_display = str(zchars[:20]) + f"... (len={len(zchars)})"
else:
zchars_display = str(zchars)
if len(zscii_codes) > 20:
zscii_display = str(zscii_codes[:20]) + f"... (len={len(zscii_codes)})"
else:
zscii_display = str(zscii_codes)
text_display = repr(text[:80]) + "..." if len(text) > 80 else repr(text)
print(f"[{caller_name}{caller_info}] -> {addr:#06x}", file=sys.stderr)
print(f" zchars={zchars_display}", file=sys.stderr)
print(f" zscii={zscii_display}", file=sys.stderr)
print(f" text={text_display}", file=sys.stderr)
print(file=sys.stderr)
return text
# Apply the monkey patch
ZStringFactory.get = traced_get # type: ignore[assignment]
# PART 3: Run the game with piped input
print("=" * 80)
print("GAME OUTPUT (running interpreter with tracing enabled)")
print("=" * 80)
print()
print(
"Note: String decode traces are written to stderr and "
"interleaved with game output."
)
print("To separate them, run: python3 scripts/debug_zstrings.py 2>trace.log")
print()
print(
"To see just the trace: "
"python3 scripts/debug_zstrings.py 2>&1 >/dev/null | grep zchars"
)
print()
print("Press Ctrl+C to stop the game.")
print()
try:
zmachine.run()
except KeyboardInterrupt:
print("\n[Interrupted by user]")
except Exception as e:
print(f"\n[Game ended: {type(e).__name__}: {e}]")
print()
print("=" * 80)
print("DIAGNOSTIC COMPLETE")
print("=" * 80)
if __name__ == "__main__":
main()