Disassembly DIY

sna2skool.py

SkoolKit includes a utility script called sna2skool.py that can be used to convert a 48K SNA or Z80 snapshot into a skool file. To try it out, copy a Z80 snapshot of Skool Daze (call it skool_daze.z80) to the directory containing SkoolKit, and then run the following command from that directory:

$ ./sna2skool.py skool_daze.z80 > sd-blank.skool

Now take a look at sd-blank.skool. As you can see, by default, sna2skool.py disassembles everything from 16384 to 65535, treating it all as code. Needless to say, this is not particularly useful - unless you have no idea where the code and data blocks are yet, and want to use this disassembly to find out.

Once you have figured out where the code and data blocks are, it would be handy if you could supply sna2skool.py with this information, so that it can disassemble the blocks accordingly. That is where the control file comes in.

The control file

A control file contains a list of start addresses of code and data blocks. Each address is marked with a ‘control directive’, which is a single letter that indicates what the block contains:

  • b indicates a data block
  • c indicates a code block
  • g indicates a game status buffer entry
  • i indicates a block that should be ignored
  • t indicates a block containing text
  • u indicates an unused block of memory
  • w indicates a block containing words (two-byte values)
  • z indicates an unused block containing all zeroes

(If these letters remind you of the valid characters that may appear in the first column of each line of a skool file, that is no coincidence.)

For example:

c 24576 Do stuff
b 24832 Important data
t 25088 Interesting messages
u 25344 Unused

This control file declares that:

  • Everything before 24576 should be ignored
  • There is a routine at 24576-24831 which should be titled ‘Do stuff’
  • There is data at 24832-25087
  • There is text at 25088-25343
  • Everything from 25344 onwards is unused (but should still be disassembled as data)
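The way each block runs up to the start of the next one can be sketched in Python. This is only an illustration of the rule described above, not SkoolKit's own control file parser; the function name parse_ctl and the default end address of 65535 are assumptions made for the example.

```python
def parse_ctl(text, end=65535):
    """Return (directive, start, last, title) tuples for top-level blocks.

    Hypothetical helper for illustration only; not part of SkoolKit.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Only the lower case single-letter directives mark top-level blocks
        if len(fields) < 2 or fields[0] not in set('bcgituwz'):
            continue
        title = fields[2] if len(fields) > 2 else ''
        entries.append((fields[0], int(fields[1]), title))
    entries.sort(key=lambda e: e[1])
    blocks = []
    for i, (directive, start, title) in enumerate(entries):
        # Each block is implicitly terminated by the start of the next one
        last = entries[i + 1][1] - 1 if i + 1 < len(entries) else end
        blocks.append((directive, start, last, title))
    return blocks


CTL = """c 24576 Do stuff
b 24832 Important data
t 25088 Interesting messages
u 25344 Unused
"""

for block in parse_ctl(CTL):
    print(block)
```

Run on the example control file, this reports the routine as occupying 24576-24831, the data 24832-25087, the text 25088-25343, and the unused block 25344-65535.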

A skeleton skool disassembly

So if we had a control file for Skool Daze, we could produce a much more useful skool file. As it happens, SkoolKit includes one: skool_daze.ctl. You can use it with sna2skool.py thus:

$ ./sna2skool.py -c ctl/skool_daze.ctl skool_daze.z80 > sd-blank.skool

This time, sd-blank.skool is split up into meaningful blocks, with code as code, data as data (DEFBs), and text as text (DEFMs). Much nicer.

The next step is to create an HTML disassembly from this skool file:

$ ./skool2html.py -f sd-blank.skool html

Now open html/sd-blank/index.html in a web browser. There’s not much there, but it’s a base from which you can start adding comments to sd-blank.skool (remembering to use skool macros where appropriate to insert hyperlinks and images) and create your own Skool Daze disassembly.

To replace the word ‘sd-blank’ in the page titles, we need to give the game a name. This can be done by creating a ref file called sd-blank.ref that contains the following lines:

[Game]
Game=Skool Daze

Then run skool2html.py again to re-generate the HTML. Alternatively, you could create a game logo image (in PNG format) and copy it to html/sd-blank/images/logo.png; the image will be used instead of the game name if it is present.

To create a skeleton skool disassembly for Back to Skool too, use the supplied back_to_skool.ctl file with a snapshot of the game (call it back_to_skool.z80):

$ ./sna2skool.py -c ctl/back_to_skool.ctl back_to_skool.z80 > bts-blank.skool
$ ./skool2html.py -f bts-blank.skool html

Open html/bts-blank/index.html to start browsing this newborn disassembly.

Generating a control file

If you are planning to create a disassembly of some game other than Skool Daze or Back to Skool, you will need to create your own control file. To get started, you can use the -g option with sna2skool.py to perform a rudimentary static code analysis of the snapshot file and generate a corresponding control file:

$ ./sna2skool.py -g game.ctl game.z80 > game.skool

This will do a reasonable job of splitting the snapshot into blocks, but you will need to examine the resultant skool file (game.skool in this case) to see which blocks should be marked as text or data instead of code, and then edit the generated control file (game.ctl) accordingly.

Blocks whose contents resemble text are given a title like this:

Routine/text? at 26836

Blocks whose contents resemble neither code nor text are given a title like this:

Routine/data? at 26624

Any other blocks are assumed to contain code and are given a title like this:

Routine at 24576
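When reviewing a generated control file, it can help to list only the blocks that were flagged with a question mark. The sketch below assumes that each generated line has the same 'directive address title' layout as the control file examples above, and that the sample lines use the c directive; neither assumption is confirmed here, and the function name blocks_to_review is made up for the example.

```python
def blocks_to_review(ctl_text):
    """Return (address, title) pairs for blocks whose title contains '?'.

    Hypothetical helper; assumes 'directive address title' lines.
    """
    review = []
    for line in ctl_text.splitlines():
        fields = line.split(None, 2)
        # Skip anything that is not a top-level block with a title
        if len(fields) != 3 or fields[0] not in set('bcgituwz'):
            continue
        if '?' in fields[2]:
            review.append((int(fields[1]), fields[2]))
    return review


# Sample lines are an assumption about what a generated file might contain
SAMPLE = """c 24576 Routine at 24576
c 26624 Routine/data? at 26624
c 26836 Routine/text? at 26836
"""

for address, title in blocks_to_review(SAMPLE):
    print(address, title)
```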

Extended control file syntax

Besides the declaration of block types, addresses and titles, the control file syntax also supports the declaration of the following things:

  • Block descriptions
  • Register values
  • Mid-block comments
  • Block end comments
  • Sub-block types and comments

To provide a description for a code block at 24576 (for example), use the D directive thus:

c 24576 This is the title of the routine at 24576
D 24576 This is the description of the routine at 24576.

To declare the values of the registers upon entry to the routine at 24576, add one line per register with the R directive thus:

R 24576 A An important value in the accumulator
R 24576 DE Display file address

To declare a mid-block comment that will appear above the instruction at 24592, use the D directive thus:

D 24592 The next section of code does something really important.

To declare a comment that will appear at the end of the routine at 24576, use the E directive thus:

E 24576 And so the work of this routine is done.

Sometimes a block marked as one type (code, data, text, or whatever) may contain instructions or statements of another type. For example, a word (w) block may contain the odd non-word here and there. To declare such sub-blocks whose type does not match that of the containing block, use the following syntax:

w 32768 A block containing mostly words
B 32800,3 But here's a sub-block of 3 bytes at 32800
T 32809,8 And an 8-byte text string at 32809
C 32821,10 And 10 bytes of code at 32821 too?

The directives (B, T and C) used here to mark the sub-blocks are the upper case equivalents of the directives used to mark top-level blocks (b, t and c). The comments at the end of these sub-block declarations are taken as instruction-level comments and will appear as such in the resultant skool file.

Three bits of extended syntax left. First, the blank sub-block directive:

c 24576 A great routine
  24580,11 A great section of code at 24580

This is equivalent to:

c 24576 A great routine
C 24580,11 A great section of code at 24580

That is, the type of a blank sub-block directive is taken to be the same as that of the parent block.

Next, the address range:

c 24576 A great routine
  24580-24590 A great section of code at 24580

This is equivalent to:

c 24576 A great routine
  24580,11 A great section of code at 24580

That is, you can specify the extent of a sub-block using either an address range, or an address and a length.

Finally, the implicit sub-block extent:

c 24576 A great routine
  24580 A great section of code at 24580
  24588,10 Another great section of code at 24588

This is equivalent to:

c 24576 A great routine
  24580,8 A great section of code at 24580
  24588,10 Another great section of code at 24588

But the declaration of the length (8) of the sub-block at 24580 is redundant, because the sub-block is implicitly terminated by the declaration of the sub-block at 24588 that follows. This is exactly how top-level block declarations work: each top-level block is implicitly terminated by the declaration of the next one.
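The three ways of stating a sub-block's extent (address and length, address range, and implicit extent) can be reduced to the same (start, length) pair. The following Python sketch shows the arithmetic; it is not how SkoolKit itself parses control files. The function name sub_block_extents and the block_end parameter are made up for the example, and letting a final implicit sub-block run to the end of the parent block is an assumption.

```python
def sub_block_extents(specs, block_end):
    """Resolve sub-block specs to (start, length) pairs.

    specs is a list of strings such as '24580,11', '24580-24590' or
    '24580', in ascending address order. block_end is the address just
    past the end of the parent block. Illustrative only.
    """
    starts = []
    lengths = []
    for spec in specs:
        if ',' in spec:
            start, length = (int(n) for n in spec.split(','))
        elif '-' in spec:
            start, last = (int(n) for n in spec.split('-'))
            length = last - start + 1  # the address range is inclusive
        else:
            start, length = int(spec), None  # extent is implicit
        starts.append(start)
        lengths.append(length)
    extents = []
    for i, (start, length) in enumerate(zip(starts, lengths)):
        if length is None:
            # Terminated by the next sub-block (assumed: or the block end)
            nxt = starts[i + 1] if i + 1 < len(starts) else block_end
            length = nxt - start
        extents.append((start, length))
    return extents


print(sub_block_extents(['24580-24590'], 24600))        # [(24580, 11)]
print(sub_block_extents(['24580', '24588,10'], 24600))  # [(24580, 8), (24588, 10)]
```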

Other disassembly options

If you know that there is nothing of interest in the snapshot before address 24576 (for example), you can tell sna2skool.py to start disassembling from that address (instead of 16384) by using the -s option:

$ ./sna2skool.py -s 24576 -g game.ctl game.z80 > game.skool

To make it easier to find messages (strings) in the snapshot, use the -t option:

$ ./sna2skool.py -t -g game.ctl game.z80 > game.skool

This will add to the comment field, for each line of the disassembly, the ASCII equivalent of the disassembled bytes.
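To give an idea of what an 'ASCII equivalent' looks like, here is a small Python sketch. It is not taken from sna2skool.py; in particular, showing non-printable bytes as '.' is an assumption made for the example, and the function name ascii_equivalent is made up.

```python
def ascii_equivalent(data):
    """Return a printable rendering of data.

    Bytes outside the printable ASCII range (32-126) are shown as '.',
    which is an assumption of this sketch.
    """
    return ''.join(chr(b) if 32 <= b < 127 else '.' for b in data)


# 'HELLO' followed by two bytes that are not printable ASCII
print(ascii_equivalent(bytes([72, 69, 76, 76, 79, 0, 201])))  # HELLO..
```

Runs of readable characters in this kind of output are a good hint that a block holds text rather than code.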

Other ready-made control files

SkoolKit includes some other ready-made control files (besides skool_daze.ctl and back_to_skool.ctl):

  • contact_sam_cruise.ctl (for Contact Sam Cruise)
  • manic_miner.ctl (for the Bug-Byte version of Manic Miner)

These control files may not be complete or entirely accurate, but they are a better starting point than an empty or generated control file.

The incomplete Contact Sam Cruise RAM disassembly

To create an HTML disassembly of Contact Sam Cruise using the supplied control file, first copy a snapshot of that game (call it csc.z80) to the SkoolKit directory, and then:

$ ./sna2skool.py -c ctl/contact_sam_cruise.ctl csc.z80 > csc.skool
$ ./skool2html.py -f csc.skool html

Now open html/contact_sam_cruise/index.html in a web browser.

Note that the file csc.ref (in the src subdirectory) is used to supply extra information to the disassembly (such as the logo image and animatory state descriptions).

Adding pokes, bugs, trivia and a glossary

Adding ‘Pokes’, ‘Bugs’, ‘Trivia’ and ‘Glossary’ pages to a disassembly is done by adding Poke, Bug, Fact and Glossary sections to the ref file. For any such sections that are present, skool2html.py will add links to the disassembly index page.

For example, let’s add a poke. Add the following lines to sd-blank.ref (or bts-blank.ref if you’re playing with the skeleton Back to Skool disassembly):

[Poke:greatPoke:Great POKE]
The following POKE is great:

POKE 45678,9

Now run skool2html.py again:

$ ./skool2html.py -f sd-blank.skool html

Open html/sd-blank/index.html and you should see a link to the ‘Pokes’ page in the ‘Reference’ section.

The format of a Poke section is:

[Poke:anchor:Title]
First paragraph.

Second paragraph.

...

where:

  • anchor is the name of the anchor for the entry on the ‘Pokes’ page
  • Title is the title of the entry

Paragraphs should be separated by blank lines.

The format of a Bug or Fact section is the same, except that the section name prefix is Bug: or Fact: (instead of Poke:) as appropriate.

One Poke, Bug or Fact section should be added for each poke, bug or trivia item to be documented. Entries will appear on the ‘Pokes’, ‘Bugs’ or ‘Trivia’ page in the same order as the sections appear in the ref file.

The format of a Glossary section is slightly different:

[Glossary:term]
Description

The description should be a single paragraph. For example:

[Glossary:Chuntable]
Likely to be affected by a disturbance in the chuntey (q.v.)
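The layout of these ref file sections (a name in square brackets, followed by paragraphs separated by blank lines) can be illustrated with a short Python sketch. This is not SkoolKit's own ref file parser; the function name parse_ref is made up for the example, and it assumes that a section header occupies a whole line.

```python
def parse_ref(text):
    """Return a list of (section_name, paragraphs) pairs.

    Hypothetical reader for illustration; not SkoolKit's own ref parser.
    """
    sections = []
    lines = None
    for line in text.splitlines():
        if line.startswith('[') and line.endswith(']'):
            # A new section begins; collect its lines until the next one
            lines = []
            sections.append((line[1:-1], lines))
        elif lines is not None:
            lines.append(line)
    result = []
    for name, body in sections:
        # Paragraphs are separated by blank lines
        paragraphs = [p.strip() for p in '\n'.join(body).split('\n\n')]
        result.append((name, [p for p in paragraphs if p]))
    return result


REF = """[Game]
Game=Skool Daze

[Poke:greatPoke:Great POKE]
The following POKE is great:

POKE 45678,9
"""

for name, paragraphs in parse_ref(REF):
    kind, _, rest = name.partition(':')
    if kind == 'Poke':
        # The rest of the section name is 'anchor:Title'
        anchor, _, title = rest.partition(':')
        print(anchor, title, paragraphs)
```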

skool v. ASM

If, instead of a skool file, you’d rather create an assembler-ready ASM file from a snapshot, sna2skool.py can do that with the -a option:

$ ./sna2skool.py -c ctl/skool_daze.ctl -a skool_daze.z80 > sd-blank.asm

However, in general it is a better idea to use sna2skool.py to create a skool file first, and then use skool2asm.py to convert the skool file to an ASM file (because skool2asm.py provides many formatting options that sna2skool.py doesn’t, and it will automatically expand skool macros).

skool2ctl.py

SkoolKit includes a utility script called skool2ctl.py that can be used to convert a skool file into a control file. For example:

$ ./skool2ctl.py src/sd.skool > sd.ctl

In addition to block types and addresses, sd.ctl will contain block titles, block descriptions, registers, mid-block comments, block end comments, sub-block types and addresses, and instruction-level comments. However, note that skool and asm directives are lost in the conversion.

skool2ctl.py supports some options to control the amount of information that is included in the control file; run it with no arguments to see a list:

-g ID  Set the game ID (e.g. 'sd', 'bts'; default is blank)
-wX    Write only these elements, where X is one or more of:
         b = block types and addresses
         t = block titles
         d = block descriptions
         r = registers
         m = mid-block comments and block end comments
         s = sub-block types and addresses
         c = instruction-level comments

bin2tap.py

SkoolKit includes a utility script called bin2tap.py that can be used to convert a binary file produced by an assembler (see Supported assemblers) into a TAP file that can be loaded into an emulator. For example:

$ ./bin2tap.py game.bin

will create a file called game.tap. By default, the origin address (the address of the first byte of code or data) and the start address (the address of the first instruction to run) are both set to 65536 minus the length of game.bin. These defaults can be changed by passing options to bin2tap.py. Run it with no arguments to see the list of available options:

Usage: ./bin2tap.py [options] file.bin

Options:
  -o ORG      Set the origin (default: 65536 - length of file.bin)
  -s START    Set the start address to JP to (default: ORG)
  -t TAPFILE  Set the TAP filename (default: file.tap)
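The default addresses follow from simple arithmetic, sketched here in Python. The function name default_addresses is made up for the example; only the rules themselves (ORG defaults to 65536 minus the file length, START defaults to ORG) come from the option list above.

```python
def default_addresses(length, org=None, start=None):
    """Return the (org, start) addresses that the defaults imply.

    Hypothetical helper; mirrors the -o and -s defaults described above.
    """
    if org is None:
        org = 65536 - length  # so the last byte of the binary lands at 65535
    if start is None:
        start = org
    return org, start


print(default_addresses(8192))             # (57344, 57344)
print(default_addresses(8192, org=32768))  # (32768, 32768)
```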