diff --git a/docs/how/zmachine-garbled-output-investigation.rst b/docs/how/zmachine-garbled-output-investigation.rst new file mode 100644 index 0000000..340c5b1 --- /dev/null +++ b/docs/how/zmachine-garbled-output-investigation.rst @@ -0,0 +1,238 @@ +Z-Machine Garbled Output Investigation +======================================== + +Context +------- + +Branch: ``zmachine-zork1-integration`` + +The hybrid z-machine interpreter runs Zork 1 but produces garbled output: + +- Room name "West of House" prints correctly +- Room description shows "er " instead of full text +- After "look" command, output starts correctly then becomes garbage with ? characters +- "There is a small mailbox here." prints correctly + +Investigation +------------- + +The z-string decoder (ZStringTranslator, ZCharTranslator, ZsciiTranslator in +zstring.py) is correct. The abbreviation table loads correctly. All 96 +abbreviations decode to the right text. + +The diagnostic script (``scripts/debug_zstrings.py``) revealed the problem: + +- ``op_print_paddr paddr=0x0051 (byte=0x00a2)`` decodes to "er " +- Address 0x00a2 is in the abbreviation table area, not where the room + description should be +- The z-string at 0x00a2 really is "er " - the decoder is right, the address + is wrong + +Root cause: bugs in ``zobjectparser.py`` property parsing. + +Bugs Found +---------- + +All bugs are in ``zobjectparser.py``. + +Bug 1: Off-by-one in shortname skip (ROOT CAUSE) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In ``get_prop_addr_len()`` line 392:: + + addr += 2 * self._memory[addr] + +This skips the shortname data but not the length byte. Compare with +``get_all_properties()`` which does it correctly:: + + shortname_length = self._memory[addr] + addr += 1 # skip length byte + addr += 2 * shortname_length # skip shortname data + +The off-by-one means the property table scan starts 1 byte too early, reading +the last byte of the shortname z-string as if it were a property header. This +corrupts every subsequent property lookup. + +Bug 2: BitField slice wrong width for V3 property number +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In ``get_prop_addr_len()`` line 399 and ``get_all_properties()`` line 446:: + + pnum = bf[4:0] + +BitField slice ``[4:0]`` has start > stop, so it swaps to ``[0:4]`` = 4 bits. +But V3 property numbers are 5 bits (bits 0-4). Should be ``bf[0:5]``. + +This means property numbers 16-31 are read as 0-15 (bit 4 is lost). + +Bug 3: BitField slice wrong width for V3 property size +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In ``get_prop_addr_len()`` line 400, ``get_all_properties()`` line 447, and +``get_property_length()`` line 555:: + + size = bf[7:5] + 1 + +BitField slice ``[7:5]`` swaps to ``[5:7]`` = 2 bits. But V3 property sizes are +3 bits (bits 5-7). Should be ``bf[5:8]``. + +This means properties with size > 4 bytes are read with the wrong size (e.g., +size 5 reads as size 1), causing the property iterator to skip too few bytes. + +Bug 4: Same BitField slice issue in V5 path +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In ``get_prop_addr_len()`` lines 409, 413:: + + pnum = bf[5:0] # swaps to [0:5] = 5 bits, should be [0:6] = 6 bits + size = bf2[5:0] # swaps to [0:5] = 5 bits, should be [0:6] = 6 bits + +Compare with ``get_all_properties()`` V5 path which correctly uses ``bf[0:6]``. + +Also in ``get_property_length()`` line 563: same ``bf2[5:0]`` issue. + +How to Fix +---------- + +All fixes are in ``zobjectparser.py``: + +1. Line 392: change ``addr += 2 * self._memory[addr]`` to:: + + addr += 1 + 2 * self._memory[addr] + +2. Lines 399, 446: change ``bf[4:0]`` to ``bf[0:5]`` + +3. Lines 400, 447, 555: change ``bf[7:5]`` to ``bf[5:8]`` + +4. Lines 409, 413, 563: change ``bf[5:0]`` to ``bf[0:6]`` + +The pattern: BitField uses Python slice semantics ``[start:stop)`` where the +number of bits extracted is ``stop - start``. The buggy code used ``[high:low]`` +notation thinking it would extract bits high through low inclusive, but the swap +logic makes it extract fewer bits than intended. + +This is the same class of bug as the write_global fix in commit e5329d6 +(BitField slice extracting 7 bits not 8). + +Diagnostic Tools +---------------- + +``scripts/debug_zstrings.py`` traces all z-string decoding with caller info. + +Run with:: + + echo -e "look\nquit\ny" | python3 scripts/debug_zstrings.py 2>/tmp/trace.log + +Trace goes to stderr, game output to stdout. + +Session 2: Additional Bugs Found and Fixed +------------------------------------------- + +Bug 5: Same off-by-one in get_next_property +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``get_next_property()`` had the same shortname skip bug as Bug 1:: + + addr += 2 * self._memory[addr] + +Same fix: ``addr += 1 + 2 * self._memory[addr]`` + +Also had the same BitField slice bugs (Bugs 2, 4) for the V3/V5 +property number return values. + +Bug 6: get_property_length reads wrong byte for V5 two-byte headers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In the V5 two-byte property header path, the code read the size from +the first header byte (which contains the property number) instead of +the second header byte (which contains the size):: + + bf2 = BitField(self._memory[data_address - 2]) # WRONG: first byte + size = bf2[0:6] + +The second byte was already read into ``bf`` at ``data_address - 1``. +Fix: use ``bf[0:6]`` directly instead of creating ``bf2``. + +Bug 7: op_dec_chk and op_inc_chk use unsigned comparison +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In ``zcpu.py``, both ``op_dec_chk`` and ``op_inc_chk`` compared 16-bit +values as unsigned integers instead of signed. The Z-machine spec +requires signed comparison. + +Effect: when a counter decremented from 0 to 65535 (wrapping), the check +``65535 < 0`` evaluated false (unsigned), causing infinite loops in Z-code +that counted down past zero. This made Zork 1's word-printing loop run +forever, dumping raw memory bytes to the screen as characters. + +Fix: wrap both operands with ``_make_signed()`` before comparing. + +What's Fixed Now +----------------- + +After all fixes: + +- Room descriptions display correctly on startup +- The infinite garbage output loop is fixed (dec_chk now works with signed) +- ``just check`` passes (516 tests) + +What's Still Broken +-------------------- + +Commands like "look", "open mailbox", "quit" all produce:: + + You used the word "look" in a way that I don't understand. + +The parser finds words in the dictionary (dict_addr is correct, e.g. +"look" at 0x44f2) and stores them in the parse buffer correctly. But the +game's Z-code parser can't match them to valid verb syntax. + +Instruction tracing after ``op_sread`` shows the game's parser: + +1. Reads parse buffer correctly (loadw(0x2551, 1) -> dict_addr 17650) +2. Checks dict_addr against specific word addresses (je comparisons) - + these are special-word checks, none match +3. Calls a function with the dict_addr that reads ``loadb(17650, 4)`` to + get the word's flags byte (0x41) from the dictionary entry +4. Tests individual flag bits (16, 64, 8, 32, 128, 4) against 0x41 +5. Bit 6 (64) matches - the function extracts data from byte 5 (0xc2) +6. Despite finding the word type, the parser still rejects the command + +The bug is likely in how the parser uses the extracted word-type data to +look up verb syntax. Possible areas: + +- The ``and`` / ``je`` sequence that extracts a sub-field from the flags + byte may produce the wrong value +- The verb number extracted from byte 5 (0xc2 = 194) may need signed + interpretation or bit masking +- Another opcode (``op_jl``, ``op_jg``, ``op_sub``) may have a similar + unsigned-vs-signed bug +- The ``call`` opcode may have incorrect return value handling + +Debugging approach for next session: + +1. Extend the instruction trace past 100 steps to see what happens after + the word-type function returns +2. Compare the execution trace against dfrotz instruction trace (add + ``-t`` flag to dfrotz for tracing) +3. Check ``op_jl`` and ``op_jg`` for the same unsigned comparison bug +4. Check ``op_sub`` and ``op_mul`` for signed arithmetic issues +5. Look at how the ``and`` result (step 21 in trace) flows through to + the verb syntax lookup + +Diagnostic Tools +---------------- + +``scripts/debug_zstrings.py`` traces all z-string decoding with caller info. + +Run with:: + + echo -e "look\nquit\ny" | python3 scripts/debug_zstrings.py 2>/tmp/trace.log + +Trace goes to stderr, game output to stdout. + +To add instruction tracing after sread, add to ``zcpu.py``:: + + # In step(), after decoding: + print(f"PC={pc:#06x} {handler_name}({', '.join(str(a) for a in args)})", + file=sys.stderr) diff --git a/scripts/debug_zstrings.py b/scripts/debug_zstrings.py new file mode 100755 index 0000000..d9c9ebc --- /dev/null +++ b/scripts/debug_zstrings.py @@ -0,0 +1,189 @@ +#!/usr/bin/env python3 +"""Diagnostic script to trace z-string decoding in Zork 1. + +This script loads Zork 1 and traces the z-string decoding pipeline: +1. Dumps all abbreviations from the abbreviation table +2. Monkey-patches ZStringFactory.get() to log every string decode +3. Runs the game with piped input to capture string decoding during gameplay +""" + +import inspect +import sys +from pathlib import Path + + +def main(): + # Add src to path so we can import mudlib + project_root = Path(__file__).parent.parent + sys.path.insert(0, str(project_root / "src")) + + from mudlib.zmachine import ZMachine + from mudlib.zmachine.trivialzui import create_zui + from mudlib.zmachine.zstring import ZStringFactory + + print("=" * 80) + print("Z-STRING DECODING DIAGNOSTIC FOR ZORK 1") + print("=" * 80) + print() + + story_path = project_root / "content" / "stories" / "zork1.z3" + + if not story_path.exists(): + print(f"ERROR: Story file not found at {story_path}") + sys.exit(1) + + with open(story_path, "rb") as f: + story_bytes = f.read() + + print(f"Loaded {len(story_bytes)} bytes from {story_path.name}") + print() + + # Create ZUI and ZMachine + ui = create_zui() + zmachine = ZMachine(story_bytes, ui, debugmode=False) + + # Access the string factory + string_factory = zmachine._stringfactory + + # PART 1: Dump all abbreviations + print("=" * 80) + print("ABBREVIATION TABLE DUMP") + print("=" * 80) + print() + + abbrevs = string_factory.zchr._abbrevs + if not abbrevs: + print("No abbreviations found (version 1 game or failed to load)") + else: + print(f"Total abbreviations: {len(abbrevs)}") + print() + + for table in range(3): + print(f"--- Table {table} ---") + for index in range(32): + key = (table, index) + if key in abbrevs: + zscii_codes = abbrevs[key] + text = string_factory.zscii.get(zscii_codes) + # Show ZSCII codes (truncate if too long) + if len(zscii_codes) > 20: + zscii_display = str(zscii_codes[:20]) + "..." + else: + zscii_display = str(zscii_codes) + # Show text (truncate if too long) + if len(text) > 60: + text_display = repr(text[:60]) + "..." + else: + text_display = repr(text) + print(f" [{table},{index:2d}] ZSCII={zscii_display}") + print(f" Text={text_display}") + print() + + # PART 2: Monkey-patch ZStringFactory.get() to trace all string decodes + print("=" * 80) + print("STRING DECODING TRACE") + print("=" * 80) + print() + print("Format: [opcode args] -> addr") + print(" zchars -> zscii -> text") + print() + + def traced_get(self, addr): + # Get caller information using inspect + stack = inspect.stack() + caller_name = "unknown" + caller_info = "" + + # Walk up the stack to find the opcode function + for frame_info in stack[1:]: # Skip our own frame + func_name = frame_info.function + if func_name.startswith("op_"): + caller_name = func_name + # Try to extract arguments from the frame + frame_locals = frame_info.frame.f_locals + # For op_print_addr, show the addr argument + if func_name == "op_print_addr" and "addr" in frame_locals: + caller_info = f" addr={frame_locals['addr']:#06x}" + # For op_print_paddr, show packed address and conversion + elif func_name == "op_print_paddr" and "string_paddr" in frame_locals: + paddr = frame_locals["string_paddr"] + # Show both the packed address and what it converts to + # For z3, byte_addr = paddr * 2 + byte_addr = paddr * 2 + caller_info = f" paddr={paddr:#06x} (byte={byte_addr:#06x})" + # For op_print_obj, show the object number + elif func_name == "op_print_obj" and "obj" in frame_locals: + caller_info = f" obj={frame_locals['obj']}" + # For op_print, note it's inline (z-string follows opcode) + elif func_name == "op_print": + caller_info = " (inline z-string)" + break + + # Get z-chars from ZStringTranslator + zchars = self.zstr.get(addr) + + # Convert z-chars to ZSCII codes + zscii_codes = self.zchr.get(zchars) + + # Convert ZSCII to Unicode text + text = self.zscii.get(zscii_codes) + + # Log to stderr so it doesn't interfere with game output + # Truncate long lists/strings for readability + if len(zchars) > 20: + zchars_display = str(zchars[:20]) + f"... (len={len(zchars)})" + else: + zchars_display = str(zchars) + + if len(zscii_codes) > 20: + zscii_display = str(zscii_codes[:20]) + f"... (len={len(zscii_codes)})" + else: + zscii_display = str(zscii_codes) + + text_display = repr(text[:80]) + "..." if len(text) > 80 else repr(text) + + print(f"[{caller_name}{caller_info}] -> {addr:#06x}", file=sys.stderr) + print(f" zchars={zchars_display}", file=sys.stderr) + print(f" zscii={zscii_display}", file=sys.stderr) + print(f" text={text_display}", file=sys.stderr) + print(file=sys.stderr) + + return text + + # Apply the monkey patch + ZStringFactory.get = traced_get # type: ignore[assignment] + + # PART 3: Run the game with piped input + print("=" * 80) + print("GAME OUTPUT (running interpreter with tracing enabled)") + print("=" * 80) + print() + print( + "Note: String decode traces are written to stderr and " + "interleaved with game output." + ) + print("To separate them, run: python3 scripts/debug_zstrings.py 2>trace.log") + print() + print( + "To see just the trace: " + "python3 scripts/debug_zstrings.py 2>&1 >/dev/null | grep zchars" + ) + print() + print("Press Ctrl+C to stop the game.") + print() + + try: + zmachine.run() + except KeyboardInterrupt: + print("\n[Interrupted by user]") + except Exception as e: + print(f"\n[Game ended: {type(e).__name__}: {e}]") + + print() + print("=" * 80) + print("DIAGNOSTIC COMPLETE") + print("=" * 80) + + +if __name__ == "__main__": + main()