THE GIF AND TIFF PICTURE FILE FORMATS          BY: Mårten Lindström
-------------------------------------

Some time after I had written the IMG and IFF ILBM description (Ictari 16) I received the official TIFF and GIF documentation files. After digesting these and making some experiments of my own, I wrote this file.

Although the full official documentation should probably be available to all Ictari members (the TIFF documentation was actually published in the issue 16 mentioned above; the GIF documents I am sending to Ictari with this text), it is my hope that there will still be some value in a shorter description written by an Atari programmer for other Atari programmers to read.

Also included is - yet another - description of LZW un/packing, which I hope is easier to understand than the previously published ones (at least it's different), and also closer, I think, to how it would actually be programmed in an efficient way. (See next month.)

General notes about GIF and TIFF
--------------------------------
GIF (Graphics Interchange Format) was designed by CompuServe, primarily for telecommunication use. It can handle palette colour images only (max. 256 colours), which are always LZW compressed.

TIFF (Tag Image File Format) was devised by Aldus and Microsoft primarily for DTP, and originally couldn't handle colour images. Its design was however very flexible from the start (a bit like IFF - my personal favourite), and it was soon extended to handle any type of bitmap image, adopting the LZW scheme of GIF along the way. (JPEG compressed TIFF images exist now or will in the future, but about these I know nothing.)

From an Atari user's point of view, both GIF and TIFF use a format for the image data that may make them less attractive than IMG or IFF ILBM, requiring a further - time and space demanding - conversion step in addition to the de/compression. In addition, LZW itself is not as lightning fast as the compression schemes used in IFF ILBM and IMG.
On the other hand LZW isn't exactly slow either, even on a bog standard ST, and although a certain delay (a good second or a few) is unavoidable during unpacking and conversion, this should be less noticeable when using a floppy disk, where the file loading itself is probably the most time consuming step. For a floppy user a slightly shorter loading time, due to the efficiency of LZW, could even make up for part of the unpacking time. (And maybe the lower demands on disk space make up for the extra demands on internal memory work space?) GIF and TIFF are admittedly also more common formats in the general (=PC) computer world.

Regarding the effectiveness of LZW I made some - limited - experiments, in which LZW in all cases beat the compression schemes of IMG and IFF ILBM. In some cases the best of the latter - the vertical word compression of (DeluxePaint) IFF ILBM (and Tiny) - came pretty close. But with other pictures (simple maps and charts not making full use of the available range of colours) the LZW victory was devastating.

So before describing the file formats let's have a look at the image data themselves.

The uncompressed image in GIF and TIFF
--------------------------------------
A non-compressed monochrome image is stored exactly the same way in a TIFF file as it would be in the Atari environment, except that the latter usually requires pixel rows to begin on word boundaries, while in TIFF they only have to begin on byte boundaries.

In GIF each pixel of a non-compressed image takes up a full byte (with 7 leading zero bits in the case of a mono image). This may seem a terrible waste but will not result in bigger files, since the image data in these are always compressed.

The real difference comes with colour palette images, which in the Atari environment are bitplane separated or interleaved. Not so in GIF or TIFF. Instead the complete data for each pixel are stored in consecutive bits.
Again, in GIF each pixel is always allowed a full byte, while in TIFF values smaller than 8 bits are packed into bytes (as tightly as possible "left to right", i.e. first using the most significant bits of each byte, and with no unused bits except at the end of a line). Every line begins on a byte boundary.

Example: A pixel row of a 3 pixel wide 16-colour image could in the Atari environment look like:

          bitplane 0        bitplane 1        bitplane 2        bitplane 3
   %      001-------------  011-------------  101-------------  111-------------
   pix    012               012               012               012

(where the colour number for each pixel is formed by the corresponding bits of each plane, beginning with bitplane 3 and ending with bitplane 0.)

In GIF this would (non-compressed) look like:

   bit    76543210 76543210 76543210
   %      00001100 00001010 00001111
   pixel      0        1        2

And in TIFF:

   bit    76543210 76543210
   %      11001010 1111----
   pixel   0   1    2  fill

In TIFF RGB images the RGB values for each pixel are by default stored as three consecutive values (like the Falcon High Colour screen memory is organized), but can also be stored in three separate "sample planes" (see PlanarConfiguration in the TIFF description).

In GIF no values greater than 8 bits have been foreseen, while the TIFF definition states that in such an unlikely case, values are to be packed into words (or longs if >16-bit) rather than bytes. The significance of this is that each line must then begin on a WORD boundary, and that the processor type (which is given as the first word in the TIFF file header) must be taken into account.

COMPRESSING IT
The formats described above are, in all cases, the formats which serve as input for compression and output for decompression. All the compression schemes (that I know of) work in a straightforward fashion line by line. (And none in fact has even such a feature as the line repeat of IMG.)
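Since these uncompressed layouts are what any compressor is fed, the 3-pixel example above can be retraced in a few lines of code. The following is only a minimal sketch in Python, not taken from the official documents: the function names are made up for illustration, one 16-bit word per bitplane is assumed (so rows up to 16 pixels wide), and the TIFF fill bits at the end of a row, shown as "----" above, are assumed to be written as zeros.

```python
def atari_planes_to_pixels(planes, width):
    """Combine bitplane words (plane 0 first) into colour numbers.

    Assumption: one 16-bit word per plane, pixel 0 in the most
    significant bit, so width must not exceed 16.
    """
    pixels = []
    for x in range(width):
        mask = 0x8000 >> x
        value = 0
        for plane_number, word in enumerate(planes):
            if word & mask:
                value |= 1 << plane_number  # plane n supplies bit n
        pixels.append(value)
    return pixels


def gif_row(pixels):
    """GIF-style uncompressed row: one full byte per pixel."""
    return bytes(pixels)


def tiff_row(pixels, bits_per_pixel):
    """TIFF-style row: values packed most significant bit first,
    with the row padded to a byte boundary (fill bits assumed zero)."""
    out = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of pending bits in acc
    for p in pixels:
        acc = (acc << bits_per_pixel) | p
        nbits += bits_per_pixel
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the pending bits
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)
```

With the four plane words of the example ($2000, $6000, $A000, $E000), atari_planes_to_pixels(..., 3) gives the colour numbers 12, 10 and 15; gif_row turns these into the three bytes shown for GIF, and tiff_row(..., 4) into the two bytes shown for TIFF.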
LZW (Lempel-Ziv & Welch) compression
------------------------------------
LZW is used in all GIF files (there is no such thing as a non-compressed GIF file) and probably most TIFF files, and the implementation of it is very similar in GIF and TIFF. In fact it should be, since the TIFF designers acknowledge having essentially borrowed the LZW of GIF. In spite of this there are a few differences.

Below, LZW is first described for all TIFF and 8-bit GIF images. Then the minor modification for GIF with less than 8 bits/pixel is explained.

THE STRING TABLE:
| At both encoding and decoding a table of byte-strings encountered in
| the image is used. This table is initialised to contain, as its 256
| first entries (0-255), every possible one-byte string; the rest of
| its strings are added during encoding/decoding - taken from the
| encoded/decoded image itself. A maximum of 4096 entries are used in
| GIF and TIFF LZW (0-4095).

In practice I think the table is most efficiently made up just from byte-counts plus pointers into the already processed image data. Alternatively, especially during encoding, each string could be represented by a reference to a previously used string plus one extra byte - or, perhaps even better, the references could go the other way, forming a tree structure to reduce the time spent on string comparisons during encoding. (Another way of reducing string comparisons is a technique called "hashing", whereby a formula is defined to calculate a short numerical value for each possible string - as a simple, though not ideal, example: adding all the bytes of the string and discarding any overflow. String comparisons can then be limited to strings with the same numerical value.)

COMPRESSION:
| The input (non-compressed) image is read byte by byte. Each new
| byte is "added" (concatenated) to the previous one to form a
| growing byte string, as long as it can be found in the table of
| already encountered strings.
| When not, the existing string (before concatenation) is used as
| output encoded as its table entry number, the would-have-been string
| is added as a new table entry after the last, and a new string is
| begun with the just read byte as its first byte.
| Before compression begins, the "current string" is initialised to a
| null string.

DECOMPRESSION:
| The input (compressed) image is read code by code, using each code
| read as an index into the string table. A new entry, at the end of
| the table, is formed by using as length the length of the looked up
| string PLUS ONE, and as pointer the current output pointer BEFORE
| OUTPUT. Then the looked up string is output, after which a new code
| is read. Thus the same string table will be automatically
| reconstructed as was used at the time of encoding. (I assume here that
| string copying is done first byte first, since in some cases the
| string to output will be missing its last byte until it is filled in
| as the first copied byte - i.e. from the string beginning.)

In GIF and TIFF LZW the first two free table entries (right after the one-byte strings) are reserved, since the corresponding codes have special meanings:

256 is the "Clear" code, indicating that the string table should be (re-)initialised and all but the one-byte strings cleared. It should be written as the first of all codes in any image (/TIFF strip) and can be used again later at any time.

257 is the EndOfInformation (EOI) code, to be written as the last code of the image (/TIFF strip).

So the first (2-byte) string actually encountered in the image will be entered as 258.

The codes, corresponding to table entry numbers, are 9-bit numbers to begin with. Since no codes higher than 511 can be expressed in 9 bits, the code size is increased to 10 bits when entry 512 is created. The exact point at which to do this is the first difference between GIF and TIFF:

GIF neatly does this no sooner than exactly when needed.
That is THE STEP AFTER the one in which ENTRY 512 is created (i.e. the step in which entry 513 is to be created). This is the first time that the CODE 512 could possibly be used/encountered.

TIFF rashly does it one step earlier, i.e. IN THE SAME STEP AS ENTRY 512 is created, or right after entry 511 has been created. This means that a few bits are unnecessarily wasted in a TIFF file, but it is perhaps not much to make a fuss about.

Similarly, table entry 1024 marks the beginning of 11-bit codes and entry 2048 signals 12-bit codes.

The code numbers are packed into consecutive BYTES, not words, which means that the compressed data don't have to begin on a word boundary (and that the TIFF processor specification can be ignored). But how this is done is the second difference between GIF and TIFF LZW.

TIFF does it the logical way, "left to right". I.e. codes ABCDEFGHI and jklmnopqr would be packed as:

     ABCDEFGH Ijklmnop qr------

GIF, for pure spite against us Motorola programmers and a desire to see us suffer, awkwardly packs the two codes above as:

     BCDEFGHI lmnopqrA ------jk

As can be seen, this forces us to reverse the byte order and then read the codes backwards to get it right.

Neither GIF nor TIFF allows the table to go beyond entry 4095 or the code length beyond 12 bits. But the method used to enforce this is another difference between GIF and TIFF LZW:

TIFF simply puts a requirement on the ENCODER to issue a Clear code before the problem arises. This must then be done as soon as entry 4094 has been created. (If we wait until after 4095, the Clear code itself could be misinterpreted as marking the addition of an entry 4096 and thus be read as 13-bit.)

GIF on the other hand requires both ENCODER AND DECODER to make sure that entry 4095 is the last to be added, and that the code length remains 12-bit. Encoding/decoding proceeds normally using the table entries already defined. No Clear code needs to be written.
(To be sure, some old GIF decoders may not adhere to this, so an encoder that wants to be very nice could issue Clear codes anyway, after the creation of entry 4095.)

When a Clear IS written, the last byte read before it (which couldn't have been represented in the string encoded just before the Clear) must be written after the Clear as a 9-bit code. (Or it could be written before it as a 12-bit number or whatever. If the latter method is used in TIFF, the Clear must come right after the creation of entry 4093 at the latest.) The decoding process, on encountering a Clear, only has to clear the extra table entries and revert to 9-bit codes, and can then continue as before.

The LZW of GIF and TIFF doesn't bother about whether or not encoded strings stop at line ends. Only at the image end (strip end in TIFF) will the compression algorithm stop - and add the EndOfInformation code. The decoding process will not pause until it finds this EOI.

LESS THAN 8-BIT GIF:
In GIF, unlike TIFF, not all the bits of each byte are used in images with less than 8 bits/pixel (see the uncompressed format above). Thus the number of possible one-byte strings is less than 256, and this is used in GIF to reduce the initial code length and lower the number of the first free table entry.

At the beginning of the image data in a GIF file there will always be found a number X, which is used as follows:

     Initial code length = X+1 bits
     Clear code          = 1<<X

[...]

 1 - an image reader should always look for one. If encountered it overrides any PhotometricInterpretation field (which writers are still advised to include as well, though). Note that 0 (no 'greyness' = white) corresponds to maximum intensity and vice versa, so the normal PhotometricInterpretation=1 (i.e. sample 0=black) corresponds to a decreasing GrayResponseCurve. The grey density values are read as decimal fractions according to the above GrayResponseUnit field, typically on a scale from 2.0 (maximal 'greyness' = black) to 0.0 (white).
If an exact physical scale is unknown, the following formula is suggested by the TIFF definers to calculate reasonable values in between:

     Constant * log10(MaxIntensity/Intensity)

where the Constant should be selected to make the density decrease from intensity 0 (set to density 2.0) to intensity 1 significantly steeper than the following decreases. E.g. for a 16-step scale (MaxIntensity=15) the Constant can be set =1, which with GrayResponseUnit=3 would give:

     (2000,) 1176, 875, 699, 574, 477, 398, 331, 273, 222, 176, 135, 97, 62, 30, 0.

(For more advanced curves the TIFF definers refer to the Kodak Reflection Density Guide, catalogue number 146 5947.) No default mentioned.

318 WhitePoint   Type = 5 (rationals, i.e. pairs of longs: numerator, denominator)   # = 2
This and the following are for the real pros. In RGB images. "White point" of the image, in the 1931 CIE xyY chromaticity diagram, omitting the luminance (last coordinate). Default is the SMPTE white point, D65: x=0.313, y=0.329.

319 PrimaryChromaticities   Type = 5 (rationals)   # = 6
Primary colour chromaticities: red x,y, green x,y, blue x,y. Default is the SMPTE primary colour chromaticities: red x=0.635, y=0.340, green x=0.305, y=0.595, blue x=0.155, y=0.070.

Informational Fields
--------------------
315 Artist   Type = 2 (ascii)
Person who created the image, plus any copyright message. You may want to put the actual string at the beginning of the TIFF file, right after the 8 byte header; the TIFF system, with pointers to the IFD as well as to field contents, allows this to be done easily.

306 DateTime   Type = 2 (ascii)   # = 20 (i.e. 19 characters + a null)
Date & time of image creation in the format: "YYYY:MM:DD HH:MM:SS" (HH = 00-23; space between date and time).

270 ImageDescription   Type = 2 (ascii)
General short one-line comment, e.g. "1988 company picnic"

316 HostComputer   Type = 2 (ascii)
E.g. "Atari ST"

271 Make   Type = 2 (ascii)
Manufacturer of scanner, video digitizer, or whatever.
272 Model   Type = 2 (ascii)
Model name/number of scanner, video digitizer, or whatever.

305 Software   Type = 2 (ascii)
Name and release number of software package that created image.

Document fields
---------------
Fields probably intended primarily for use with fax documents, but which might be found useful for other things. Not required in classes B,G,P,R.

297 PageNumber   Type = 3 (words)   # = 2
For a multiple page (e.g. fax) document:
     First word : Page number (beginning with 0)
     Second word: Total number of pages in document.
Pages need not appear in numerical order. No default.

269 DocumentName   Type = 2 (ascii)
Name of the document from which this image was scanned.

285 PageName   Type = 2 (ascii)
Name of the page from which this image was scanned. No default.

286 XPosition   Type = 5 (rational, i.e. 2 longs: first numerator then denominator)   # = 1
X offset of image in page, in ResolutionUnits. No default.

287 YPosition   Type = 5 (rational, i.e. 2 longs: first numerator then denominator)   # = 1
Y offset of image in page, in ResolutionUnits. No default.

266 FillOrder   Type = 3 (word)   # = 1
This is an old field, no longer recommended in normal TIFF files, but which IS required in TIFF F (fax documents). It defines how the bits in image data bytes are to be read:
     1 = 'Normal left to right', i.e. most significant bit first.
     2 = 'Backwards', least significant bit first.
Default is 1.
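
A reader that meets FillOrder=2 can bring the data back to the normal bit order by mirroring the bits of every image data byte. The following is only a rough sketch in Python of how that might be done; the names reverse_bits, REVERSE_TABLE and normalise_fill_order are invented for this illustration and do not come from the TIFF documentation.

```python
def reverse_bits(byte):
    """Return the 8-bit value with its bit order mirrored."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


# 256-entry lookup table, built once, so that each data byte
# costs only a single table access.
REVERSE_TABLE = bytes(reverse_bits(b) for b in range(256))


def normalise_fill_order(data, fill_order=1):
    """Return image data bytes in 'most significant bit first' order.

    fill_order follows the FillOrder field: 1 = already normal,
    2 = every byte has to be mirrored.
    """
    if fill_order == 1:
        return bytes(data)
    if fill_order == 2:
        return bytes(data).translate(REVERSE_TABLE)
    raise ValueError("unknown FillOrder value: %r" % (fill_order,))
```

The lookup table is a design choice rather than a necessity: mirroring bit by bit works just as well, but a table keeps the per-byte cost low, which matters on a slower machine.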