Scratchpad


The comp.sys.sinclair FAQ has documentation for many file formats used by Spectrum emulators.

Disk image formats

+3 DSK format

The official +3 DSK specification can be found at Kevin Thacker's site.

+D / DISCiPLE DSK format

'DSK' as used for MGT (+D / DISCiPLE) disks is simply a raw disk image with 10 sectors per track.

HDF format

HDF files store hard-disk data in image files for emulation purposes. They consist of a file header followed by a raw dump of the track data.

The following is the format of the HDF header. Offsets and lengths are given in hexadecimal, and multi-byte values are stored in little-endian (Intel x86) byte order.

Version 1.0

Offset   Len   Meaning
-----------------------------------------------------------------------------
    00    06   "RS-IDE"
    06    01   0x1A
    07    01   Revision number (BCD): 0x10 (v1.0)
    08    01   b0: halved sector data (only LSB of sector words is stored)
    09    02   offset of hard-disk data (0x0080)
    0B    0B   reserved (MUST be set to 0x00)
    16    6A   first 53 words (106 bytes) of IDE/ATA identification data,
               as returned by ATA command 0xEC

word[09]  ??   raw hard-disk data (C0 H0, C0 H1 ... C0 H15, C1 H0, C1 H1 ...)

Version 1.1

This differs from version 1.0 in that the full 512-byte (256 word) identification data packet is included in the header.

Offset   Len   Meaning
-----------------------------------------------------------------------------
    00    06   "RS-IDE"
    06    01   0x1A
    07    01   Revision number (BCD): 0x11 (v1.1)
    08    01   b0: halved sector data (only LSB of sector words is stored)
    09    02   offset of hard-disk data (0x0216)
    0B    0B   reserved (MUST be set to 0x00)
    16   200   IDE/ATA identification data, as returned by ATA command 0xEC

word[09]  ??   raw hard-disk data (C0 H0, C0 H1 ... C0 H15, C1 H0, C1 H1 ...)

Note: IDE devices transfer data in 16-bit words. Since the Z80 data bus is only 8 bits wide, some IDE adapters use additional logic to split each IDE word into two bytes so that the Z80 can fetch them. Other adapters, however, discard the most significant byte of the word completely in favour of simpler circuitry; in this case, only half of the nominal capacity of a disk sector is used. Bit 0 at offset 0x08 indicates this: when it is set, the sector size specified by the IDE identification data is halved in the HDF file. This reduces the HDF file size by storing only the usable data; for all the supported adapters, that is the least significant byte of each word.

The IDE identification data format is described in any IDE/ATA technical reference. It contains information about the drive geometry (cylinders, heads, sectors, sector size), the device model name, the supported features and so on.
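As an illustration of the header layout above, the following Python sketch reads the fixed fields and slices out the identification data. The function names `parse_hdf_header` and `geometry` are made up for this example. The word positions used for cylinders, heads and sectors (words 1, 3 and 6) come from the classic ATA IDENTIFY DEVICE layout rather than from the HDF description itself, so treat them as an assumption.

```python
import struct

HDF_SIGNATURE = b"RS-IDE\x1a"  # bytes 0x00-0x06 of every HDF file


def parse_hdf_header(data: bytes) -> dict:
    """Parse the fixed part of an HDF header (hypothetical helper)."""
    if len(data) < 0x16:
        raise ValueError("too short for an HDF header")
    if data[:7] != HDF_SIGNATURE:
        raise ValueError("missing RS-IDE signature")
    # 0x07: BCD revision, 0x08: flags, 0x09: little-endian data offset
    revision, flags, data_offset = struct.unpack_from("<BBH", data, 0x07)
    return {
        "version": f"{revision >> 4}.{revision & 0x0F}",
        "halved": bool(flags & 0x01),          # bit 0: only LSBs stored
        "data_offset": data_offset,            # 0x0080 (v1.0) or 0x0216 (v1.1)
        "identity": data[0x16:data_offset],    # ATA identification data
    }


def geometry(identity: bytes) -> tuple:
    """Return (cylinders, heads, sectors per track).

    Assumes the classic ATA layout: word 1, word 3 and word 6.
    """
    words = struct.unpack_from("<7H", identity, 0)
    return words[1], words[3], words[6]
```

A caller would read the first `data_offset` bytes of the image, pass them to `parse_hdf_header`, and then seek to `data_offset` to reach the raw sector data. When `halved` is true, each stored sector is half the size given in the identification data.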

Tape image formats

TAP format

The .TAP files contain blocks of tape-saved data. Every block starts with two bytes (little endian) specifying how many bytes follow, not counting the two length bytes themselves. The raw tape data then follows, including the flag and checksum bytes. The checksum is the bitwise XOR of all preceding bytes in the block, including the flag byte. For example, executing the line SAVE "ROM" CODE 0,2 produces the following (7x20 stands for seven 0x20 space bytes padding the file name):

            |------ Spectrum-generated data -------|       |---------|

      13 00 00 03 52 4f 4d 7x20 02 00 00 00 00 80 f1 04 00 ff f3 af a3

      ^^^^^...... first block is 19 bytes (17 bytes+flag+checksum)
            ^^... flag byte (A reg, 00 for headers, ff for data blocks)
               ^^ first byte of header, indicating a code block

      file name ..^^^^^^^^^^^^^
      header info ..............^^^^^^^^^^^^^^^^^
      checksum of header .........................^^
      length of second block ........................^^^^^
      flag byte ............................................^^
      first two bytes of rom .................................^^^^^
      checksum (checkbittoggle would be a better name!).............^^
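The block structure shown above can be walked with a few lines of Python. This is a minimal sketch, not a reference implementation: `tap_blocks` is a name invented here, and it assumes every block holds at least a flag byte and a checksum byte.

```python
from functools import reduce


def tap_blocks(data: bytes):
    """Yield (flag, payload, checksum_ok) for each block of a .TAP image."""
    pos = 0
    while pos + 2 <= len(data):
        # Two-byte little-endian length, not counting the length bytes
        length = int.from_bytes(data[pos:pos + 2], "little")
        block = data[pos + 2:pos + 2 + length]
        if length < 2 or len(block) != length:
            raise ValueError(f"truncated or malformed block at offset {pos}")
        # XOR of flag, payload and checksum together is zero for a good block
        ok = reduce(lambda a, b: a ^ b, block, 0) == 0
        yield block[0], block[1:-1], ok
        pos += 2 + length
```

Fed the 27 bytes of the SAVE "ROM" CODE 0,2 example, this yields a header block with flag 0x00 and a 17-byte payload, followed by a data block with flag 0xFF and payload F3 AF, both with valid checksums.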

Note that .TAP files can be joined by simply concatenating them; for example, in DOS / Windows: COPY /B FILE1.TAP + FILE2.TAP ALL.TAP, or in Unix/Linux: cat file1.tap file2.tap > all.tap

For completeness, I'll include the structure of a tape header. A header always consists of 17 bytes:

Byte   Length   Description
-----------------------------------------------
   0        1   Type (0, 1, 2 or 3)
   1       10   Filename (padded with blanks)
  11        2   Length of data block
  13        2   Parameter 1
  15        2   Parameter 2

The type is 0, 1, 2 or 3 for a Program, Number array, Character array or Code file respectively. A SCREEN$ file is regarded as a Code file with start address 16384 and length 6912 decimal. If the file is a Program file, parameter 1 holds the autostart line number (or a number >=32768 if no LINE parameter was given) and parameter 2 holds the start of the variable area relative to the start of the program. If it is a Code file, parameter 1 holds the start address of the code block when saved, and parameter 2 holds 32768. Finally, for data (array) files, the byte at position 14 decimal holds the variable name.
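The 17-byte header described above maps directly onto a `struct` format string. The sketch below is illustrative only: `parse_tape_header` is a made-up name, and decoding the file name as Latin-1 is a simplification, since the Spectrum character set is not identical to ASCII or Latin-1.

```python
import struct

TYPE_NAMES = {0: "Program", 1: "Number array", 2: "Character array", 3: "Code"}


def parse_tape_header(header: bytes) -> dict:
    """Decode a 17-byte tape header (flag and checksum already removed)."""
    if len(header) != 17:
        raise ValueError("a tape header is always 17 bytes")
    # type, 10-byte name, then three little-endian 16-bit words
    ftype, name, length, param1, param2 = struct.unpack("<B10sHHH", header)
    return {
        "type": TYPE_NAMES.get(ftype, f"Unknown ({ftype})"),
        # Latin-1 never fails to decode; an assumption, not the real charset
        "name": name.decode("latin-1").rstrip(" "),
        "length": length,
        "param1": param1,
        "param2": param2,
    }
```

For the header produced by SAVE "ROM" CODE 0,2 this gives type Code, name ROM, length 2, parameter 1 equal to 0 and parameter 2 equal to 32768.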

(originally from TECHINFO.DOC supplied with Z80 by Gerton Lunter)

TZX format

The .TZX format also specifies files that contain blocks of tape-saved data, but adds various high-level abstractions of tape encoding schemes so that it can handle a wider range of loaders than the .TAP format. The official TZX specification can be found at World of Spectrum.

De-facto TZX format conventions

The TZX format specification says that text fields should exclusively use ASCII symbols. Over time, this has proved insufficient for Archive info blocks, which need at least the pound and Euro currency symbols as well as accented characters (for European names), and should probably also extend to Cyrillic text in the future to accommodate software from Eastern Europe.

While the 1.20 TZX format updates were being drafted, it was agreed that, as a practical first step, the string encoding would be redefined to be ISO Latin 1 (also known as ISO 8859-1). This would formalise the encoding of the pound sign already used by World of Spectrum and allow the use of many more accented characters at the same time. Unfortunately, this text was accidentally omitted from the final document, leaving the change as an informal extension.

Since then there have been several ZX Spectrum software releases sold in Euros which has created a desire to support that symbol in the Archive info blocks as well. This is missing from the ISO Latin 1 character set so a new approach was required to achieve the goal.

A discussion on the World of Spectrum forums considered some options on how this could be accommodated without too much disruption to existing software and tools. The result of the discussion was that World of Spectrum has moved to using the Windows code page 1252 character mapping for the Euro symbol in its Archive info blocks from the Infoseek tool and that character mapping should be considered a de-facto standard for TZX files in any tool wanting to correctly interpret blocks from this source.
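In practice, following this convention means decoding TZX text fields as Windows code page 1252 rather than ASCII or ISO Latin 1. A minimal Python sketch follows; `decode_tzx_text` is a name chosen for this example, and replacing undefined bytes instead of raising an error is a design choice made here, not part of the convention.

```python
def decode_tzx_text(raw: bytes) -> str:
    """Decode a TZX text field using the de-facto cp1252 convention."""
    # A handful of byte values (such as 0x81) are undefined in cp1252;
    # errors="replace" turns them into U+FFFD instead of raising.
    return raw.decode("cp1252", errors="replace")
```

The pound sign (byte 0xA3) decodes identically under ISO Latin 1 and code page 1252, so existing files are unaffected. Byte 0x80 is the Euro sign in code page 1252 but a control character in ISO Latin 1, which is why the choice of mapping matters.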

Hopefully a future revision of the TZX file format will formalise this usage and endorse UTF-8 to support the remaining characters missing from Windows code page 1252.

PZX format

The PZX format is another file format for efficient storage of tape-saved data, designed to be simpler to support in utilities while retaining the important features of the TZX format. The official PZX specification can be found on its own home page.

Snapshot formats

ZX-State (SZX) format

See the official ZX-State documentation. Implemented by libspectrum (used by Fuse), Zero, SpecEmu and Spin. Early versions of libspectrum (prior to 1.0.0) contained a bug which caused the A and A' registers to be swapped with F and F', respectively, when loading or saving in the SZX format. Current versions of libspectrum detect creator version strings of "libspectrum: 0.5.0" and earlier, and correct for this.
