 
==== SPC format ====
 
The block length stored in the SPC format is two less than stored in TAP, and the parity byte in SPC does not include the flag byte in its calculation. The format is otherwise identical to TAP. Used by the SP emulator (under DOS) by J. Swiatek and K. Makowski.
 
 
==== STA format ====
 
The block length stored in the STA format is two less than stored in TAP, and the parity byte is not stored at all. The format is otherwise identical to TAP. Used by Speculator (under RISC OS) by Dave Lawrence. Documented in [http://mdfs.net/Docs/Comp/Spectrum/FileFormat/2_Tape J. G. Harston's tape formats document].
 
 
==== LTP format ====
 
The block length stored in the LTP format is two less than stored in TAP. The format is otherwise identical to TAP. Used by Nuclear ZX (under DOS) by Radovan Garabik and Lubomir Salanci. Documented as part of the [http://kassiopeia.juls.savba.sk/~garabik/old/readme.txt Nuclear ZX documentation].
 
 
=== TZX format ===
 
The .TZX format also specifies files that contain blocks of tape-saved data, with various high-level abstractions of tape encoding schemes so that it can handle a wider range of loaders than the .TAP format. The official TZX specification can be found at [http://www.worldofspectrum.org/TZXformat.html World of Spectrum].
 
   
 
   
 
=== ZX-State (SZX) format ===
 
See the [http://www.spectaculator.com/docs/zx-state/intro.shtml official ZX-State documentation] and also the [http://www.spectaculator.com/docs/svn/zx-state/intro.shtml draft of the next version]. Implemented by libspectrum (used by Fuse), Zero, SpecEmu and Spin. Early versions of libspectrum (prior to 1.0.0) contained a bug which caused the A and A' registers to be swapped with F and F', respectively, when loading or saving in the SZX format. Current versions of libspectrum detect version strings of "libspectrum: 0.5.0" and earlier and correct for this.
 

Revision as of 19:36, 6 April 2014

The comp.sys.sinclair FAQ has documentation for many file formats used by Spectrum emulators.

Disk image formats

+3 DSK format

The official +3 DSK specification can be found at Kevin Thacker's site.

+D / DISCiPLE DSK format

'DSK' as used for MGT (+D / DISCiPLE) disks is simply a raw disk image with 10 sectors per track.

HDF format

HDF files are used to store hard-disk data in image files for emulation purposes. They consist of a file header followed by a raw dump of the track data.

The following is the format of the HDF header. All numbers are hexadecimal and in little endian (Intel x86) byte order.

Version 1.0

Offset   Len   Meaning
-----------------------------------------------------------------------------
    00    06   "RS-IDE"
    06    01   0x1A
    07    01   Revision number (BCD): 0x10 (v1.0)
    08    01   b0: halved sector data (only LSB of sector words is stored)
    09    02   offset of hard-disk data (0x0080)
    0B    0B   reserved (MUST be set to 0x00)
    16    6A   first 53 words (106 bytes) of IDE/ATA identification data,
               as returned by ATA command 0xEC

word[09]  ??   raw hard-disk data (C0 H0, C0 H1 ... C0 H15, C1 H0, C1 H1 ...)

Version 1.1

This differs from version 1.0 in that the full 512-byte (256 word) identification data packet is included in the header.

Offset   Len   Meaning
-----------------------------------------------------------------------------
    00    06   "RS-IDE"
    06    01   0x1A
    07    01   Revision number (BCD): 0x11 (v1.1)
    08    01   b0: halved sector data (only LSB of sector words is stored)
    09    02   offset of hard-disk data (0x0216)
    0B    0B   reserved (MUST be set to 0x00)
    16   200   IDE/ATA identification data, as returned by ATA command 0xEC

word[09]  ??   raw hard-disk data (C0 H0, C0 H1 ... C0 H15, C1 H0, C1 H1 ...)

Note: IDE devices transfer data in 16-bit words. Since the Z80 data bus is only 8 bits wide, some IDE adapters use additional logic to split the IDE word into two bytes so that the Z80 can fetch them. However, other adapters discard the most significant byte of the word completely, in favour of simpler circuitry; in this case, only half of the nominal capacity of a disk sector is used. Bit 0 at offset 0x08 is introduced to indicate this: when it is set, the sector size specified by the IDE identification data is actually halved in the HDF file. This is done to reduce the HDF file size by storing only the usable data; for all the supported adapters, the least significant byte is stored.

The IDE identification data format is described in any IDE/ATA technical reference. It contains information about the drive geometry (cylinders, heads, sectors, sector size), the device model name, the supported features and so on.
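
The header layout in the tables above can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: the function name parse_hdf_header and the example header bytes are made up for illustration, and only the fields listed in the tables are decoded (the identification data is returned undecoded).

```python
import struct

def parse_hdf_header(data: bytes) -> dict:
    """Decode the fixed part of an HDF header (v1.0 or v1.1 layout)."""
    if len(data) < 0x16:
        raise ValueError("too short for an HDF header")
    if data[0:6] != b"RS-IDE" or data[6] != 0x1A:
        raise ValueError("missing RS-IDE signature")
    revision = data[7]                       # BCD: 0x10 = v1.0, 0x11 = v1.1
    flags = data[8]                          # bit 0: halved sector data
    (data_offset,) = struct.unpack_from("<H", data, 0x09)  # little endian
    return {
        "version": f"{revision >> 4}.{revision & 0x0F}",
        "halved": bool(flags & 0x01),
        "data_offset": data_offset,
        # Identification data runs from 0x16 up to the hard-disk data.
        "identify": data[0x16:data_offset],
    }

# Hypothetical v1.1 header with an all-zero identification block.
example = (
    b"RS-IDE" + bytes([0x1A, 0x11, 0x01])
    + struct.pack("<H", 0x0216)
    + bytes(0x0B)        # reserved, must be zero
    + bytes(0x200)       # identification data placeholder
)
info = parse_hdf_header(example)
print(info["version"], info["halved"], hex(info["data_offset"]))
```

Reading the data offset from the header, rather than hard-coding 0x0080 or 0x0216, lets the same code handle both revisions.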

Tape image formats

TAP format (and variants)

This section has been moved or is in the process of being moved to the Sinclair FAQ Wiki, under the "TAP format" article. You may find more complete information there.

The .TAP files contain blocks of tape-saved data. All blocks start with two bytes specifying how many bytes will follow (not counting the two length bytes). Then raw tape data follows, including the flag and checksum bytes. The checksum is the bitwise XOR of all bytes including the flag byte. For example, when you execute the line SAVE "ROM" CODE 0,2 the result is:

            |------ Spectrum-generated data -------|       |---------|

      13 00 00 03 52 4f 4d 7x20 02 00 00 00 00 80 f1 04 00 ff f3 af a3

      ^^^^^...... first block is 19 bytes (17 bytes+flag+checksum)
            ^^... flag byte (A reg, 00 for headers, ff for data blocks)
               ^^ first byte of header, indicating a code block

      file name ..^^^^^^^^^^^^^
      header info ..............^^^^^^^^^^^^^^^^^
      checksum of header .........................^^
      length of second block ........................^^^^^
      flag byte ............................................^^
      first two bytes of rom .................................^^^^^
      checksum (checkbittoggle would be a better name!).............^^

Note that it is possible to join .TAP files by simply stringing them together; for example, in DOS / Windows: COPY /B FILE1.TAP + FILE2.TAP ALL.TAP ; or in Unix/Linux: cp file1.tap all.tap && cat file2.tap >> all.tap
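
As an illustration of the block layout and checksum rule described above, here is a rough Python sketch that walks the blocks of a .TAP image. The function name read_tap_blocks is an assumption made for this example, and the input bytes are the SAVE "ROM" CODE 0,2 dump shown earlier.

```python
from functools import reduce
from operator import xor

def read_tap_blocks(data: bytes):
    """Yield (flag, payload, checksum_ok) for each block in a .TAP image."""
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated length field")
        length = int.from_bytes(data[pos:pos + 2], "little")
        block = data[pos + 2:pos + 2 + length]
        if length < 2 or len(block) != length:
            raise ValueError("truncated or malformed block")
        # The checksum is the XOR of the flag and payload bytes, so
        # XOR-ing the whole block (checksum included) gives zero.
        ok = reduce(xor, block, 0) == 0
        yield block[0], block[1:-1], ok
        pos += 2 + length

# The example dump from the text (7x20 expanded to seven 0x20 bytes).
tap = bytes.fromhex(
    "13 00 00 03 52 4f 4d" + " 20" * 7
    + " 02 00 00 00 00 80 f1 04 00 ff f3 af a3"
)
blocks = list(read_tap_blocks(tap))
for flag, payload, ok in blocks:
    print(f"flag={flag:02x} length={len(payload)} checksum_ok={ok}")

# Joining two .TAP images is plain concatenation.
joined = list(read_tap_blocks(tap + tap))
```

Because every block carries its own length prefix, the reader needs no file-level header, which is why simple concatenation yields a valid file.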

For completeness, I'll include the structure of a tape header. A header always consists of 17 bytes:

Byte   Length   Description
-----------------------------------------------
   0        1   Type (0, 1, 2 or 3)
   1       10   Filename (padded with blanks)
  11        2   Length of data block
  13        2   Parameter 1
  15        2   Parameter 2

The type is 0, 1, 2 or 3 for a Program, Number array, Character array or Code file respectively. A SCREEN$ file is regarded as a Code file with start address 16384 and length 6912 decimal. If the file is a Program file, parameter 1 holds the autostart line number (or a number >=32768 if no LINE parameter was given) and parameter 2 holds the start of the variable area relative to the start of the program. If it's a Code file, parameter 1 holds the start of the code block when saved, and parameter 2 holds 32768. Finally, for data files (number and character arrays), the byte at position 14 decimal holds the variable name.
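
The 17-byte header can be unpacked as sketched below. The function name decode_tape_header and the dictionary keys are assumptions chosen for this example; the filename is decoded as ASCII for simplicity, which is only an approximation of the Spectrum character set, and the array variable name is returned as a raw byte.

```python
import struct

TYPES = {0: "Program", 1: "Number array", 2: "Character array", 3: "Code"}

def decode_tape_header(header: bytes) -> dict:
    """Decode a 17-byte tape header (flag and checksum already removed)."""
    if len(header) != 17:
        raise ValueError("a tape header is always 17 bytes")
    file_type, name, length, param1, param2 = struct.unpack("<B10sHHH", header)
    info = {
        "type": TYPES.get(file_type, "Unknown"),
        # Assumption: plain ASCII is close enough for display purposes.
        "name": name.decode("ascii", errors="replace").rstrip(" "),
        "length": length,
    }
    if file_type == 0:
        # A value >= 32768 means no LINE parameter was given.
        info["autostart"] = param1 if param1 < 32768 else None
        info["vars_offset"] = param2
    elif file_type == 3:
        info["start"] = param1
    elif file_type in (1, 2):
        info["variable"] = header[14]    # raw variable name byte
    return info

# Header from the SAVE "ROM" CODE 0,2 example in the text.
header = bytes.fromhex("03 52 4f 4d" + " 20" * 7 + " 02 00 00 00 00 80")
decoded = decode_tape_header(header)
print(decoded)
```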

(originally from TECHINFO.DOC supplied with Z80 by Gerton Lunter)

De-facto TZX format conventions

The TZX format specification says that text fields should exclusively use ASCII symbols. Over time this has proved too restrictive for Archive info blocks, which need at least the pound and Euro currency symbols and accented characters (for European names), and should probably also extend to Cyrillic text in the future to accommodate software from Eastern Europe.

While the 1.20 TZX format updates were being drafted, it was agreed that, as a practical first step, the string encoding would be redefined to be ISO Latin 1 (also known as ISO 8859-1). This would formalise the encoding of the pound sign already used by World of Spectrum and allow the use of many more accented characters at the same time. Unfortunately this text was accidentally omitted from the final document, leaving the change as an informal extension.

Since then there have been several ZX Spectrum software releases sold in Euros which has created a desire to support that symbol in the Archive info blocks as well. This is missing from the ISO Latin 1 character set so a new approach was required to achieve the goal.

A discussion on the World of Spectrum forums considered some options on how this could be accommodated without too much disruption to existing software and tools. The result of the discussion was that World of Spectrum has moved to using the Windows code page 1252 character mapping for the Euro symbol in its Archive info blocks from the Infoseek tool and that character mapping should be considered a de-facto standard for TZX files in any tool wanting to correctly interpret blocks from this source.

Hopefully a future revision of the TZX file format will formalise this usage and endorse the use of UTF-8 in the future to support the remaining characters missing from Windows code page 1252.
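
A tool that wants to follow this de-facto convention could decode text fields roughly as follows. This is a sketch, not part of any TZX specification: the function name decode_tzx_text and the sample bytes are invented for illustration, and the use of Python's built-in cp1252 codec is an assumption about how a tool might implement the mapping.

```python
def decode_tzx_text(raw: bytes) -> str:
    """Decode a TZX text field using Windows code page 1252."""
    # A few byte values (such as 0x81) are undefined in Python's cp1252
    # codec; errors="replace" turns them into U+FFFD instead of raising.
    return raw.decode("cp1252", errors="replace")

# Hypothetical price field: 0x80 is the Euro sign in code page 1252,
# 0xA3 is the pound sign in both code page 1252 and ISO Latin 1.
sample = bytes([0x80]) + b"9.99 / " + bytes([0xA3]) + b"7.99"
text = decode_tzx_text(sample)
print(text)

# Under ISO Latin 1 the same 0x80 byte is a control character instead.
latin1_text = sample.decode("latin-1")
```

Since code page 1252 matches ISO Latin 1 for byte values 0xA0 to 0xFF, files written under the earlier informal extension decode the same way.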

PZX format

The PZX format is another file format for efficient storage of tape-saved data designed to be simpler to support in utilities while retaining the important features of the TZX format. The official PZX specification can be found on its own home page.

Snapshot formats

ZX-State (SZX) format

This section has been moved or is in the process of being moved to the Sinclair FAQ Wiki, under the "ZX-State format" article. You may find more complete information there.

Moved to Sinclair FAQ Wiki by original author with no modifications by other parties.