The ZIX format (2.0 and later)

Kenneth Sörling (nevershaveyourduck@gmail.com)

NOTE: This document is tentative. Many details are sketchy at best at the time of writing.

Revision History:

September 7th, 2007
Initial release.

Overview:

The 2.0 format is radically different from the previous one. In many ways, it's even worse. While it has acquired compression capability (stolen it, really), it is still inferior to any other format, what with many necessary bits of info not included. Well, at least they included a checksum field for each file, so you can validate the decompressed data.

The new format is sorely lacking in other respects. It doesn't carry attribute or date/time information for its files, so neither file attributes nor dates of creation and modification are preserved. There is little or no provision for folder hierarchies, and none at all for different code pages for storing file names.

However, one should keep in mind that these gripes are pointless. What we have here isn't an attempt at a new, improved format, but a continuation of a scam designed to force users to install the spyware-infested WinZix application.

Thanks to many alert users who supplied me with pointers to ZIX archives, I have now deduced enough information to foil them. My own UnZixWin utility is now able to deal with these files.

Difference from version 1.0 archives:

File Layout

HEADER:

The first six bytes are ASCII "WINZIX"

The next two bytes are always (00 03)

All version 2 archives encountered so far have had this header. The significance of the number is unknown. It doesn't match the file count. My guess is it's the internal format number used by the creator.
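
For what it's worth, the header test described above can be sketched in a few lines of Python. This is an illustration only, not the code used in UnZixWin; the names MAGIC, VERSION_BYTES and classify_header are made up for the example, and the only facts it relies on are the six magic bytes and the observed (00 03) pair.

```python
MAGIC = b"WINZIX"            # the six ASCII bytes at the start of the file
VERSION_BYTES = b"\x00\x03"  # the two bytes seen in every version 2 archive so far


def classify_header(data: bytes) -> str:
    """Classify the first 8 bytes of a file (hypothetical helper)."""
    # Too short to hold the header, or the magic bytes are missing.
    if len(data) < 8 or data[:6] != MAGIC:
        return "not-zix"
    # Magic is present, but the next two bytes are not the observed (00 03).
    if data[6:8] != VERSION_BYTES:
        return "unexpected-version"
    return "ok"
```

Returning "unexpected-version" separately from "not-zix" makes it easy to flag an archive that carries the magic bytes but a different number, which is exactly the kind of file worth reporting.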

File Records:

The following sequence repeats for each file record:

This structure works so far, but is suspicious on account of being stupid, in particular the size of the FN_LEN field. 8 bytes? Most file systems allow file names of only around 255 characters, so a single byte might have sufficed. Two bytes would have been forward-thinking. 8 bytes is ridiculous overkill.

Therefore, these 8 bytes might actually comprise 2 DWORDs, 4 SHORTs, or some other combination. Observed values so far have been zeroes in all but the bottom byte, which has always been large enough to hold the length of the filename.
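
To compare the competing readings of these 8 bytes side by side, something like the following Python sketch can help. It ASSUMES little-endian byte order, so that the "bottom byte" is the first byte of the field; that assumption and the name inspect_fn_len are mine, not something confirmed by the archives.

```python
import struct


def inspect_fn_len(raw: bytes) -> dict:
    """Decode an 8-byte FN_LEN field under several candidate layouts.

    ASSUMPTION: little-endian byte order, so the bottom byte is raw[0].
    """
    if len(raw) != 8:
        raise ValueError("FN_LEN field must be exactly 8 bytes")
    return {
        "qword": struct.unpack("<Q", raw)[0],   # one 64-bit value
        "dwords": struct.unpack("<2I", raw),    # two 32-bit values
        "shorts": struct.unpack("<4H", raw),    # four 16-bit values
        # True when everything above the bottom byte is zero,
        # as in every archive seen so far.
        "fits_one_byte": raw[1:] == b"\x00" * 7,
    }
```

Any archive for which fits_one_byte comes out False would be a useful specimen, since it would show which of the layouts (if any) is the real one.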

Problems and unanswered questions

The number 3 in the file header suggests there was actually a previous format numbered 2, examples of which never seem to have turned up on the Net. Therefore, this version should really be called 3.0. However, to comply with common consensus on the web, I will keep referring to it as version 2.0 (since it is produced by WinZix 2).

All version 2 files encountered to date have followed this format. All of them have also contained junk data, although this is not caused by the WinZix program itself, but rather by the bastards using it.

We now have enough information to extract files from WinZix 2.0 archives. What remains to be resolved is the significance of the unknown numbers.

No archive has yet been located which deviates from the above structure, but it would not surprise me to see ZIX archives with unexpected values in this structure. I have set up my utility to watch for inconsistencies and prompt the user to get in touch if anything is out of the ordinary.

Occasionally, the MD5 checksum in the record differs from the MD5 of the decompressed file. This might be due to data corruption, transmission errors, or obfuscation on the part of the creators. More often than not, the MD5 sums line up.
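
The comparison itself is simple enough to sketch in Python. The example below ASSUMES the recorded checksum has already been turned into hex text; how the record actually stores it is not covered here, and the name md5_matches is invented for the illustration.

```python
import hashlib


def md5_matches(decompressed: bytes, recorded_hex: str) -> bool:
    """Compare the MD5 of the decompressed data with the recorded checksum.

    ASSUMPTION: recorded_hex is the checksum from the file record,
    already converted to a hex string by the caller.
    """
    actual = hashlib.md5(decompressed).hexdigest()
    # Ignore case and stray whitespace in the recorded value.
    return actual == recorded_hex.strip().lower()
```

Given the occasional mismatches, a failed comparison is probably better treated as a warning than as a reason to throw the extracted file away.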