TIFF Revision 5.0

Aldus/Microsoft Technical Memorandum
8/8/88

Preface

This memorandum  has been prepared jointly by Aldus and Microsoft
in conjunction  with leading scanner vendors and other interested
parties.   This document  does not  represent a commitment on the
part of  either Microsoft  or Aldus  to provide  support for this
file format  in any  application.   When responding  to  specific
issues raised  in this memo, or when requesting additional tag or
field assignments, please address your correspondence to either:

     Developers' Desk              Windows Marketing Group
     Aldus Corporation             Microsoft Corporation
     411 First Ave. South          16011 NE 36th Way
     Suite 200                     Box 97017
     Seattle, WA  98104            Redmond, WA  98073-9717
     (206) 622-5500                (206) 882-8080

Revision Notes

This revision  replaces "TIFF  Revision 4.0."  Sections in italics
are new or substantially changed in this revision.  Also new, but
not in italics, are Appendices F, G, and H.

The major enhancements in TIFF 5.0 are:

1.   Compression of  grayscale and  color images, for better disk
space utilization.  See Appendix F.

2.   TIFF Classes - restricted TIFF subsets that can simplify the
job of  the TIFF  implementor.   You may  wish to scan Appendix G
before reading the rest of this document.   In fact, you may want
to use  Appendix G as your main guide, and refer back to the main
body of  the specification  as needed for details concerning TIFF
structures and field definitions.

3.   Support for  "palette color"   images.  See the TIFF Class P
description  in   Appendix  G,   and  the   new  ColorMap   field
description.

4.   Two new  tags that  can be  used to  more fully  define  the
characteristics of  full color  RGB data, and thereby potentially
improve the quality of color image reproduction.  See Appendix H.

The organization  of the  document has also changed slightly.  In
particular, the  tags are  listed in  alphabetical order,  within
several categories, in the main body of the specification.

As always,  every attempt  has been  made to add functionality in
such a  way as  to minimize  incompatibility problems  with older
TIFF software.   In  particular, many  TIFF  5.0  files  will  be
readable even  by older  applications that  assume TIFF 4.0 or an
earlier version  of the  specification.   One exception  is  with
files that  use the  new TIFF  5.0 LZW  compression scheme.   Old
applications will  have to  give up  in this case, of course, and
will do so "gracefully" if they have been following the rules.

We  are  grateful  to  all  of  the  draft  reviewers  for  their
suggestions.   Especially helpful  were Herb  Weiner  of  Kitchen
Wisdom  Publishing   Company,  Brad  Pillow  of  TrueVision,  and
engineers from Hewlett Packard and Quark.  Chris Sears of Magenta
Graphics provided information which is included as Appendix H.

Abstract

This document  describes TIFF,  a tag  based file  format that is
designed to promote the interchange of digital image data.

The fields  were defined  primarily with  desktop publishing  and
related applications  in mind, although it is possible that other
sorts of imaging applications may find TIFF to be useful.

The general  scenario for  which TIFF  was invented  assumes that
applications software  for scanning  or painting  creates a  TIFF
file, which  can then  be read and incorporated into a "document"
or "publication"   by an application such as a desktop publishing
package.

TIFF is  not a printer language or page description language, nor
is it intended to be a general document interchange standard. The
primary design  goal was  to provide  a rich  environment  within
which the exchange of image data between application programs can
be accomplished.   This  richness is  required in  order to  take
advantage of  the varying  capabilities of  scanners and  similar
devices.  TIFF is therefore designed to be a superset of existing
image file  formats for  "desktop"   scanners (and paint programs
and anything  else that  produces images with pixels in them) and
will be enhanced on a continuing basis as new capabilities arise.
A high  priority has been given to structuring the data in such a
way as  to minimize  the pain  of future  additions.    TIFF  was
designed to be an extensible interchange format.

Although TIFF  is claimed  to be  in some sense a rich format, it
can easily  be used for simple scanners and applications as well,
since the  application developer  need only be concerned with the
capabilities that he requires.

TIFF is intended to be independent of specific operating systems,
filing systems,  compilers, and processors.  The only significant
assumption is  that the  storage medium supports something like a
"file,"   defined as  a sequence  of 8-bit bytes, where the bytes
are numbered  from 0  to N.   The  largest possible  TIFF file is
2**32 bytes  in length.   Since TIFF uses pointers (byte offsets)
quite liberally,  a TIFF  file is  most easily read from a random
access device  such as a hard disk or flexible diskette, although
it should  be possible  to read  and write TIFF files on magnetic
tape.

The recommended  MS-DOS, UNIX,  and OS/2  file extension for TIFF
files is  ".TIF."   The recommended Macintosh filetype is "TIFF".
Suggestions for  conventions in  other computing environments are
welcome.

1) Structure

In TIFF, individual fields are identified with a unique tag. This
allows particular fields to be present or absent from the file as
required by the application.  For an explanation of the rationale
behind using a tag structure format, see Appendix A.

A TIFF file begins with an 8-byte "image file header" that points
to one  or  more  "image  file  directories."    The  image  file
directories contain  information about  the images,  as  well  as
pointers to the actual image data.

See Figure 1.

We will now describe these structures in more detail.

Image file header

A TIFF  file begins  with an 8-byte image file header, containing
the following information:

Bytes 0-1:     The first  word of  the file  specifies  the  byte
order used within the file.  Legal values are:

          "II" (hex 4949)
          "MM" (hex 4D4D)

     In the  "II"   format,  byte  order  is  always  from  least
significant to  most significant,  for  both  16-bit  and  32-bit
integers.   In the  "MM"   format, byte order is always from most
significant to  least significant,  for both  16-bit  and  32-bit
integers.   In both  formats, character  strings are  stored into
sequential byte locations.

     All  TIFF  readers should  support  both  byte  orders - see
Appendix G.

Bytes 2-3 The second  word of  the  file  is  the  TIFF  "version
number."   This number, 42 (2A in hex), is not to be equated with
the current  Revision of  the TIFF  specification.   In fact, the
TIFF version  number (42)  has never  changed, and probably never
will.   If it  ever does,  it means that TIFF has changed in some
way so  radical that  a TIFF  reader should  give up immediately.
The number 42 was chosen for its deep philosophical significance.
It can and should be used as additional verification that this is
indeed a TIFF file.

     A TIFF  file does  not have  a real version/revision number.
This was  an explicit,  conscious design  decision.  In many file
formats, fields  take on different meanings depending on a single
version number.   The  problem is that as the file format "ages,"
it becomes  increasingly difficult  to document which fields mean
what in  a given  version, and older software usually has to give
up if  it encounters  a file  with a  newer version  number.   We
wanted TIFF  fields to have a permanent and well-defined meaning,
so that  "older" software  can usually  read "newer"  TIFF files.
The bottom line is lower software release costs and more reliable
software.

Bytes 4-7 This long  word contains  the offset  (in bytes) of the
first Image File Directory.  The directory may be at any location
in the  file after  the header but must begin on a word boundary.
In particular,  an Image File Directory may follow the image data
it describes.   Readers must simply follow the pointers, wherever
they may lead.

     (The term  "byte offset"  is always used in this document to
refer to  a location  with respect  to the beginning of the file.
The first byte of the file has an offset of 0.)
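
The header checks described above can be sketched in a few lines of Python.  This is only an illustration, not part of the specification:  the function name parse_tiff_header and its return convention are assumptions made for the example.

```python
import struct

def parse_tiff_header(data: bytes):
    """Return (struct_prefix, first_ifd_offset) for an 8-byte TIFF header."""
    if len(data) < 8:
        raise ValueError("a TIFF header is 8 bytes long")
    order = data[0:2]
    if order == b"II":
        prefix = "<"      # least significant byte first
    elif order == b"MM":
        prefix = ">"      # most significant byte first
    else:
        raise ValueError("bytes 0-1 must be 'II' or 'MM'")
    # Bytes 2-3: the number 42.  Bytes 4-7: offset of the first IFD.
    version, first_ifd = struct.unpack(prefix + "HL", data[2:8])
    if version != 42:
        raise ValueError("bytes 2-3 must hold 42")
    return prefix, first_ifd
```

The returned prefix can be passed to later struct calls so that every 16-bit and 32-bit integer in the file is read in the declared byte order.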

Image file directory

An Image  File Directory  (IFD) consists of a 2-byte count of the
number of  entries (i.e.,  the number  of fields),  followed by a
sequence of 12-byte field entries, followed by a 4-byte offset of
the next  Image File  Directory (or 0 if none).  Do not forget to
write the 4 bytes of 0 after the last IFD.

Each 12-byte IFD entry has the following format:

Bytes 0-1 contain the Tag for the field.
Bytes 2-3 contain the field Type.
Bytes 4-7 contain the  Length ("Count"  might have  been a better
term) of the field.
Bytes 8-11     contain the  Value Offset,  the  file  offset  (in
bytes) of  the Value  for the  field.   The Value  is expected to
begin on  a word  boundary; the  corresponding Value  Offset will
thus be  an even  number.  This file offset may point to anywhere
in the file, including after the image data.

The entries  in an  IFD must be sorted in ascending order by Tag.
Note that this is not the order in which the fields are described
in this  document.   For a  numerically ordered list of tags, see
Appendix E.  The Values to which directory entries point need not
be in any particular order in the file.
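
As a rough sketch of the directory layout just described, the following Python reads one IFD from a file already held in memory.  The name read_ifd and the tuple layout are assumptions for illustration;  the Value Offset bytes are returned undecoded, since their meaning depends on the Type and Length.

```python
import struct

def read_ifd(data: bytes, offset: int, prefix: str):
    """Return (entries, next_ifd_offset) for the IFD at the given offset.

    Each entry is (tag, field_type, length, raw), where raw holds the
    4 undecoded bytes found at entry bytes 8-11.
    prefix is "<" for "II" files and ">" for "MM" files.
    """
    (count,) = struct.unpack_from(prefix + "H", data, offset)
    entries = []
    pos = offset + 2
    for _ in range(count):
        tag, field_type, length = struct.unpack_from(prefix + "HHL", data, pos)
        entries.append((tag, field_type, length, data[pos + 8:pos + 12]))
        pos += 12
    # The 4 bytes after the last entry give the next IFD, or 0 if none.
    (next_ifd,) = struct.unpack_from(prefix + "L", data, pos)
    return entries, next_ifd
```

A writer can satisfy the ordering rule by sorting its entries by Tag before emitting them;  a cautious reader might verify that the tags it reads are in ascending order.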

In order  to save time and space, the Value Offset is interpreted
to contain  the Value  instead of  pointing to  the Value  if the
Value fits  into 4  bytes.  If the Value is less than 4 bytes, it
is left-justified within the 4-byte Value Offset, i.e., stored in
the lower-numbered bytes.  Whether or not the Value fits within 4
bytes is  determined by  looking at  the Type  and Length  of the
field.

The Length  is specified in terms of the data type, not the total
number of bytes.  A single 16-bit word (SHORT) has a Length of 1,
not 2,  for example.   The  data  types  and  their  lengths  are
described below:

1 = BYTE  An 8-bit unsigned integer.
2 = ASCII 8-bit bytes  that store ASCII codes; the last byte must
be null.
3 = SHORT A 16-bit (2-byte) unsigned integer.
4 = LONG  A 32-bit (4-byte) unsigned integer.
5 = RATIONAL   Two LONG's:  the first represents the numerator of
a fraction, the second the denominator.
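
The rule for deciding whether the Value Offset holds the Value itself follows directly from the Type and Length.  The sketch below is illustrative only;  the names TYPE_SIZES, INT_CODES, value_is_inline, and decode_inline are assumptions, and the decoder handles only the BYTE, SHORT, and LONG types.

```python
import struct

# Bytes per element for Types 1-5 (BYTE, ASCII, SHORT, LONG, RATIONAL).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8}
# struct format codes for the integer types this sketch decodes.
INT_CODES = {1: "B", 3: "H", 4: "L"}

def value_is_inline(field_type: int, length: int) -> bool:
    """True if the Value fits within the 4-byte Value Offset."""
    return TYPE_SIZES[field_type] * length <= 4

def decode_inline(field_type: int, length: int, raw: bytes, prefix: str):
    """Decode an inline BYTE, SHORT, or LONG Value from the 4 raw bytes.

    The Value is left-justified, so it occupies the lower-numbered
    bytes of raw regardless of the file's byte order.
    """
    nbytes = TYPE_SIZES[field_type] * length
    return struct.unpack(prefix + INT_CODES[field_type] * length, raw[:nbytes])
```

For example, a single SHORT with the value 16 is stored as the bytes 10 00 00 00 (hex) in an "II" file and as 00 10 00 00 in an "MM" file.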

The value of the Length part of an ASCII field entry includes the
null.   If padding  is necessary, the Length does not include the
pad byte.   Note  that there  is no  "count byte," as there is in
Pascal-type strings.   The Length part of the field takes care of
that.   The null  is not  strictly necessary, but may make things
slightly simpler for C programmers.
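
To illustrate the Length rule for ASCII fields, here is a small sketch.  The function name ascii_field is hypothetical, and the sketch assumes that padding means a single extra byte added so that whatever follows begins on a word boundary.

```python
def ascii_field(text: str):
    """Return (length, stored_bytes) for an ASCII field Value."""
    value = text.encode("ascii") + b"\x00"   # the terminating null is counted
    length = len(value)
    # Assumed padding rule: one pad byte if the Value has an odd size.
    # The pad byte is not included in the Length.
    stored = value + (b"\x00" if length % 2 else b"")
    return length, stored
```

For the string "TIFF" this gives a Length of 5 and 6 stored bytes.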

The reader  should check  the type  to ensure  that it is what he
expects.   TIFF currently  allows more than 1 valid type for some
fields.   For example,  ImageWidth and ImageLength were specified
as having  type SHORT.  Very large images with more than 64K rows
or columns  are possible with some devices even now.  Rather than
add parallel  LONG tags  for these fields, it is cleaner to allow
both SHORT  and LONG  for ImageWidth  and similar  fields.    See
Appendix G for specific recommendations.

Note that  there may  be more  than one IFD.  Each IFD is said to
define a "subfile."   One potential use of subsequent subfiles is
to describe  a "sub-image"   that  is somehow related to the main
image, such as a reduced resolution version of the image.
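
A reader finds each subfile by following the chain of next-IFD offsets until it reaches 0.  The following sketch shows one way to do this;  the name ifd_offsets is an assumption, and the loop check is a defensive measure that the specification does not require.

```python
import struct

def ifd_offsets(data: bytes, prefix: str, first_ifd: int):
    """Yield the byte offset of each IFD (subfile) in chain order."""
    offset = first_ifd
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        yield offset
        (count,) = struct.unpack_from(prefix + "H", data, offset)
        # Skip the 2-byte count and the 12-byte entries to reach the
        # 4-byte offset of the next IFD.
        (offset,) = struct.unpack_from(prefix + "L", data, offset + 2 + 12 * count)
```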

If you have not already done so, you may wish to turn to Appendix
G to study the sample TIFF images.

2) Definitions

Note that the TIFF structure as described in the previous section
is not  specific to  imaging applications in any way.  It is only
the definitions of the fields themselves that jointly describe an
image.

Before we  begin defining  the fields,  we will define some basic
concepts.   An image  is defined  to be  a rectangular  array  of
"pixels,"  each of which consists of one or more "samples."  With
monochromatic data,  we have  one sample  per pixel, and "sample"
and "pixel"   can  be  used  interchangeably.    RGB  color  data
contains three samples per pixel.

3) The Fields

This section  describes the  fields defined  in this  version  of
TIFF.   More fields may be added in future versions - if possible
they will  be added in such a way so as not to break old software
that encounters a newer TIFF file.

The documentation  for each  field contains the name of the field
(quite arbitrary, but convenient), the Tag value, the field Type,
the Number of Values (N) expected, comments describing the field,
and the  default, if  any.  Readers must assume the default value
if the field does not exist.

"No default"  does not  mean that  a TIFF  writer should  not pay
attention to  the tag.  It simply means that there is no default.
If the  writer has reason to believe that readers will care about
the value  of this  field, the writer should write the field with
the appropriate value.  TIFF readers can do whatever they want if
they encounter a missing "no default" field that they care about,
short of  refusing to  import the file.  For example, if a writer
does  not  write  out  a  PhotometricInterpretation  field,  some
applications will  interpret the  image "correctly,"  and  others
will display  the image  inverted.  This is not a good situation,
and writers should be careful not to let it happen.

The  fields   are  grouped   into  several  categories:    basic,
informational, facsimile,  document storage and retrieval, and no
longer recommended.   A  future version  of the specification may
pull some of these categories into separate companion documents.

Many fields  are described  in this  document, but  most are  not
"required."   See Appendix  G for  a list  of required fields, as
well as  examples of  how to combine fields into valid and useful
TIFF files.

Basic Fields

Basic fields  are  fields  that  are  fundamental  to  the  pixel
architecture or visual characteristics of an image.

BitsPerSample
Tag  = 258  (102)
Type = SHORT
N    = SamplesPerPixel

Number of bits per sample.  Note that this tag allows a different
number of  bits per  sample for  each sample  corresponding to  a
pixel.   For example, RGB color data could use a different number
of bits  per sample for each of the three color planes.  Most RGB
files will have the same number of BitsPerSample for each sample.
Even in this case, be sure to include all three entries.  Writing
"8" when you mean "8,8,8" sets a bad precedent for other fields.

Default = 1.  See also SamplesPerPixel.

ColorMap
Tag  = 320 (140)
Type = SHORT
N    = 3 * (2**BitsPerSample)

This tag  defines a  Red-Green-Blue color  map for  palette color
images.   The palette color pixel value is used to index into all
3 subcurves.   For  example, a Palette color pixel having a value
of 0  would be  displayed according  to the 0th entry of the Red,
Green, and Blue subcurves.

The subcurves  are stored  sequentially.   The Red  entries  come
first, followed  by the  Green  entries,  followed  by  the  Blue
entries.   The length  of each  subcurve is  2**BitsPerSample.  A
ColorMap entry  for an  8-bit Palette color image would therefore
have 3  * 256  entries.   The width  of each entry is 16 bits, as
implied  by  the  type  of  SHORT.    0  represents  the  minimum
intensity, and  65535 represents the maximum intensity.  Black is
represented by  0,0,0, and  white by  65535, 65535,  65535.   The
purpose of  the color  map is to act as a "lookup"  table mapping
pixel values from 0 to 2**BitsPerSample-1 into RGB triplets.
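
The lookup can be sketched as follows, treating the ColorMap Value as a flat list of SHORTs.  The function name palette_rgb is an assumption for illustration.

```python
def palette_rgb(color_map, bits_per_sample, pixel):
    """Return the (red, green, blue) triplet for a palette color pixel."""
    n = 1 << bits_per_sample          # length of each subcurve
    if len(color_map) != 3 * n:
        raise ValueError("ColorMap must hold 3 * 2**BitsPerSample entries")
    # Red entries come first, then Green, then Blue.
    return color_map[pixel], color_map[n + pixel], color_map[2 * n + pixel]
```

With BitsPerSample = 1 and a ColorMap of 0, 65535, 0, 65535, 0, 65535, pixel value 0 maps to black and pixel value 1 maps to white.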

The ColorResponseCurves  field may  be used  in conjunction  with
ColorMap to further refine the meaning of the RGB triplets in the
ColorMap.   However, the  ColorResponseCurves default  should  be
sufficient in most cases.

See also PhotometricInterpretation - palette color.

No default.   ColorMap  must be  included in  all  palette  color
images.

ColorResponseCurves
Tag  = 301 (12D)
Type = SHORT
N    = 3 * (2**BitsPerSample)

This tag  defines three  color response curves, one each for Red,
Green and  Blue color  information.   The Red entries come first,
followed by the Green entries, followed by the Blue entries.  The
length  of   each  subcurve   is  2**BitsPerSample,   using   the
BitsPerSample value corresponding to the respective primary.  The
width of  each entry is 16 bits, as implied by the type of SHORT.
0 represents  the minimum  intensity, and  65535  represents  the
maximum intensity.   Black  is represented by 0,0,0, and white by
65535, 65535,  65535.   Therefore, a ColorResponseCurve entry for
RGB data  where each of the samples is 8 bits deep would have 3 *
256 entries, each consisting of a SHORT.

The purpose of the color response curves is to refine the content
of RGB color images.

See Appendix H, section VII, for further information.

Default:  curves based on the NTSC recommended gamma of 2.2.

Compression
Tag  = 259  (103)
Type = SHORT
N    = 1

1 =  No compression,  but pack  data into  bytes  as  tightly  as
possible, with  no unused  bits except  at the end of a row.  The
bytes are  stored as  an array of type BYTE, for BitsPerSample <=
8,  SHORT   if  BitsPerSample   >  8  and  <=  16,  and  LONG  if
BitsPerSample >  16 and <= 32.  The byte ordering of data >8 bits
must be  consistent with  that specified  in the TIFF file header
(bytes 0  and 1).   "II"    format  files  will  have  the  least
significant bytes  preceding  the  most significant  bytes  while
"MM"  format files will have the opposite.

     If the  number of  bits per  sample is not a power of 2, and
you are willing to give up some space for better performance, you
may wish to use the next higher power of 2.  For example, if your
data can  be represented  in 6 bits, you may wish to specify that
it is 8 bits deep.

     Rows are  required to  begin on byte boundaries.  The number
of bytes  per row  is therefore  (ImageWidth *  SamplesPerPixel *
BitsPerSample  +   7)  /  8,  assuming  integer  arithmetic,  for
PlanarConfiguration=1.     Bytes  per   row  is   (ImageWidth   *
BitsPerSample + 7) / 8 for PlanarConfiguration=2.
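
The two row-size formulas can be written out as a short function.  The name bytes_per_row is an assumption, and the sketch assumes that every sample has the same BitsPerSample value.

```python
def bytes_per_row(image_width, bits_per_sample,
                  samples_per_pixel=1, planar_configuration=1):
    """Bytes in one uncompressed row (Compression=1)."""
    if planar_configuration == 1:
        bits = image_width * samples_per_pixel * bits_per_sample
    else:
        # PlanarConfiguration=2: each plane holds one sample per pixel.
        bits = image_width * bits_per_sample
    return (bits + 7) // 8            # round up to a byte boundary
```

For example, a bilevel row of 10 pixels occupies 2 bytes, with 6 unused bits at the end of the row.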

     Some graphics  systems want rows to be word- or double-word-
aligned.   Uncompressed TIFF  rows will  need to  be copied  into
word- or  double-word-padded row  buffers before  being passed to
the graphics routines in these environments.

2 =  CCITT Group  3 1-Dimensional  Modified  Huffman  run  length
encoding.   See Appendix  B:   "Data Compression  --  Scheme  2."
BitsPerSample must  be 1,  since  this  type  of  compression  is
defined only for bilevel images.

     When  you  decompress  data  that  has  been  compressed  by
Compression=2, you  must translate  white runs into 0's and black
runs into  1's.   Therefore, the normal PhotometricInterpretation
for those  compression types  is 0  (WhiteIsZero).   If a  reader
encounters a  PhotometricInterpretation of  1  (BlackIsZero)  for
such an  image, the  image should  be displayed  and printed with
black and white reversed.

5 = LZW Compression,  for grayscale, mapped color, and full color
images.  See Appendix F.

32773 =  PackBits compression,  a simple byte oriented run length
scheme for 1-bit images.  See Appendix C.

Data compression only applies to raster image data, as pointed to
by StripOffsets.  All other TIFF information is unaffected.

Default = 1.

GrayResponseCurve
Tag  = 291 (123)
Type = SHORT
N    = 2**BitsPerSample

The purpose  of the  gray response curve and the gray units is to
provide more  exact photometric  interpretation  information  for
gray scale image data, in terms of optical density.

The GrayResponseUnit field specifies the accuracy of the
information contained  in the  curve.   Since optical  density is
specified in  terms of  fractional numbers, this tag is necessary
to know  how to  interpret the  stored integer  information.  For
example, if GrayResponseUnit is set to 4 (ten-thousandths of a
unit), and the GrayResponseCurve number for gray level 4 is
3455,  then the  resulting actual  value is  0.3455.   Optical
densitometers typically measure densities within the range of 0.0
to 2.0.
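
The scaling in the example above amounts to dividing the stored integer by a power of ten.  A minimal sketch, with the hypothetical function name density_from_entry:

```python
def density_from_entry(curve_entry, gray_response_unit):
    """Convert a stored GrayResponseCurve integer to an optical density."""
    return curve_entry / 10 ** gray_response_unit
```

A stored value of 3455 with a unit of 4 yields a density of 0.3455.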

If the  gray scale  response curve  is known  for the data in the
TIFF file, and if the gray scale response of the output device is
known, then  an intelligent  conversion can  be made  between the
input data and the output device.  For example, the output can be
made to  look just  like the  input.   In addition,  if the input
image lacks  contrast (as  can be  seen from the response curve),
then appropriate contrast enhancements can be made.

The purpose  of the  gray scale  response curve  is to  act as  a
"lookup"   table mapping values from 0 to 2**BitsPerSample-1 into
specific   density    values.      The   0th   element   of   the
GrayResponseCurve array  is used to define the gray value for all
pixels  having   a  value   of  0,   the  1st   element  of   the
GrayResponseCurve array  is used to define the gray value for all
pixels having  a value of 1, and so on, up to 2**BitsPerSample-1.
If your  data is  "really," say, 7-bit data, but you are adding a
1-bit pad  to each  pixel to  turn it into 8-bit data, everything
still works:   If  the data is high-order justified, half of your
GrayResponseCurve entries  (the odd ones, probably) will never be
used, but  that doesn't  hurt anything.  If the data is low-order
justified, your  pixel values  will be between 0 and 127, so make
your GrayResponseCurve  accordingly.   What your  curve does from
128 to  255 doesn't matter.  Note that low-order justification is
probably not  a good  idea, however,  since not  all applications
look at GrayResponseCurve.  Note also that LZW compression yields
the same  compression ratio  regardless of  whether the  data  is
high-order or low-order justified.

It is  permissible to  have a  GrayResponseCurve even for bilevel
(1-bit) images.   The  GrayResponseCurve will  have 2 values.  It
should be noted, however, that TIFF B readers are not required to
pay attention  to  GrayResponseCurves  in  TIFF  B  files.    See
Appendix G.

If both  GrayResponseCurve and  PhotometricInterpretation  fields
exist  in   the  IFD,   GrayResponseCurve  values   override  the
PhotometricInterpretation value.   But it is a good idea to write
out both, since some applications do not yet pay attention to the
GrayResponseCurve.

Writers may  wish to  purchase a  Kodak Reflection Density Guide,
catalog number  146 5947,  available for  $10 or  so at  prepress
supply houses,  to help them figure out reasonable density values
for their scanner or frame grabber.  If that sounds like too much
work,   we    recommend   a    curve   that    is    linear    in
intensity/reflectance.  To compute reflectance from density:  R =
1 /  pow(10,D).   To compute  density from reflectance: D = log10
(1/R).   A typical  4-bit GrayResponseCurve  may  look  therefore
something like:   2000,  1177, 875, 699, 574, 477, 398, 331, 273,
222, 176,  135, 97, 62, 30, 0, assuming GrayResponseUnit=3.  Such
a curve would be consistent with PhotometricInterpretation=1.
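
A curve that is linear in reflectance can be generated from the density formula given above.  The sketch below is an illustration only:  the function name is hypothetical, and clamping the density at 2.0 (where reflectance is 0) is an assumption based on the typical densitometer range mentioned earlier.  Under these assumptions it reproduces the 4-bit curve listed above, except that the second entry rounds to 1176 rather than 1177.

```python
import math

def linear_reflectance_curve(bits_per_sample, gray_response_unit=3,
                             max_density=2.0):
    """Build a GrayResponseCurve whose reflectance rises linearly with value."""
    levels = 1 << bits_per_sample
    scale = 10 ** gray_response_unit
    curve = []
    for value in range(levels):
        reflectance = value / (levels - 1)
        if reflectance > 0:
            # D = log10(1/R), clamped to the assumed maximum density.
            d = min(math.log10(1 / reflectance), max_density)
        else:
            d = max_density           # log10(1/0) is undefined; use the cap
        curve.append(round(d * scale))
    return curve
```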

See also GrayResponseUnit, PhotometricInterpretation, ColorMap.

GrayResponseUnit
Tag  = 290 (122)
Type = SHORT
N    = 1

1 = Number represents tenths of a unit.
2 = Number represents hundredths of a unit.
3 = Number represents thousandths of a unit.
4 = Number represents ten-thousandths of a unit.
5 = Number represents hundred-thousandths of a unit.

Modifies GrayResponseCurve.

See also GrayResponseCurve.

For historical  reasons, the  default is 2.  However, for greater
accuracy, we recommend using 3.

ImageLength
Tag  = 257  (101)
Type = SHORT or LONG
N    = 1

The image's  length (height) in pixels (Y: vertical).  The number
of rows  (sometimes described as "scan lines") in the image.  See
also ImageWidth.

No default.

ImageWidth
Tag  = 256  (100)
Type = SHORT or LONG
N    = 1

The image's  width, in  pixels (X:  horizontal).   The number  of
columns in the image.  See also ImageLength.

No default.

NewSubfileType
Tag =  254  (FE)
Type = LONG
N = 1

Replaces the  old SubfileType  field, due  to limitations  in the
definition of that field.

A general  indication of  the kind  of data  that is contained in
this subfile.   This  field is  made up of a set of 32 flag bits.
Unused bits are expected to be 0.  Bit 0 is the low-order bit.

Currently defined values are:

Bit 0     is 1  if the  image is  a reduced resolution version of
another image in this TIFF file; else the bit is 0.
Bit 1     is 1  if the  image is  a single  page of  a multi-page
image (see the PageNumber tag description); else the bit is 0.
Bit 2     is 1  if the  image defines  a  transparency  mask  for
another image  in this  TIFF file.  The PhotometricInterpretation
value must be 4, designating a transparency mask.

These values  have been  defined as  bit flags  because they  are
pretty much independent of each other.  For example, it may be useful
to have  four images  in a  single TIFF  file: a  full resolution
image, a  reduced resolution  image, a  transparency mask for the
full resolution  image, and  a transparency  mask for the reduced
resolution image.  Each of the four images would have a different
value for the NewSubfileType field.

Default is 0.
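
The three flag bits can be tested independently with bit masks.  The constant and function names in this sketch are assumptions chosen for the example.

```python
REDUCED_RESOLUTION = 1 << 0    # bit 0
PAGE_OF_MULTIPAGE = 1 << 1     # bit 1
TRANSPARENCY_MASK = 1 << 2     # bit 2

def describe_subfile(new_subfile_type):
    """Decode the currently defined NewSubfileType flag bits."""
    return {
        "reduced_resolution": bool(new_subfile_type & REDUCED_RESOLUTION),
        "page_of_multipage": bool(new_subfile_type & PAGE_OF_MULTIPAGE),
        "transparency_mask": bool(new_subfile_type & TRANSPARENCY_MASK),
    }
```

In the four-image example above, the NewSubfileType values would be 0 (full resolution image), 1 (reduced resolution image), 4 (mask for the full resolution image), and 5 (mask for the reduced resolution image).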

PhotometricInterpretation
Tag  = 262  (106)
Type = SHORT
N    = 1

0 =  For bilevel  and grayscale  images:   0 is  imaged as white.
2**BitsPerSample-1 is  imaged as  black.    If  GrayResponseCurve
exists,  it   overrides  the   PhotometricInterpretation   value,
although  it  is  safer  to  make  them  match,  since  some  old
applications may still be ignoring GrayResponseCurve. This is the
normal value for Compression=2.

1 =  For bilevel  and grayscale  images:   0 is  imaged as black.
2**BitsPerSample-1  is  imaged  as  white.  If  GrayResponseCurve
exists,  it   overrides  the   PhotometricInterpretation   value,
although  it  is  safer  to  make  them  match,  since  some  old
applications may  still be  ignoring GrayResponseCurve.  If  this
value is  specified for  Compression=2, the  image should display
and print reversed.

2 = RGB.  In the RGB model, a color is described as a combination
of the  three primary  colors of  light (red, green, and blue) in
particular concentrations.   For  each of  the three  samples,  0
represents minimum intensity, and 2**BitsPerSample - 1 represents
maximum intensity.   Thus  an RGB  value  of  (0,0,0)  represents
black,  and   (255,255,255)  represents   white,  assuming  8-bit
samples.   For PlanarConfiguration = 1, the samples are stored in
the indicated  order:   first Red,  then Green,  then Blue.   For
PlanarConfiguration =  2, the  StripOffsets for the sample planes
are stored  in the  indicated order:   first the Red sample plane
StripOffsets, then  the Green  plane StripOffsets,  then the Blue
plane StripOffsets.

     The ColorResponseCurves field may be used to globally refine
or alter  the color  balance of  an RGB  image without  having to
change the values of the pixels themselves.

3 = "Palette color."  In this mode, a color is described with a
single sample.   The  sample is  used as  an index into ColorMap.
The sample  is used to index into each of the red, green and blue
curve tables to retrieve an RGB triplet defining an actual color.
When this  PhotometricInterpretation value  is  used,  a ColorMap
must also be supplied.  SamplesPerPixel must be 1.

4 =  Transparency Mask.   This  means that  the image  is used to
define an  irregularly shaped region of another image in the same
TIFF  file.     SamplesPerPixel  and  BitsPerSample  must  be  1.
PackBits compression  is recommended.    The  1-bits  define  the
interior of  the region;  the 0-bits  define the  exterior of the
region.  The Transparency Mask must have the same ImageLength and
ImageWidth as the main image.

     A reader  application can  use the  mask to  determine which
parts of the image to display.  Main image pixels that correspond
to 1-bits  in the  transparency mask  are imaged to the screen or
printer, but  main image  pixels that correspond to 0-bits in the
mask are not displayed or printed.

     It is  possible to  generalize the  notion of a transparency
mask to  include partial  transparency, but  it is not clear that
such information would be useful to a desktop publishing program.

No default.   That  means that  if you  care  if  your  image  is
displayed and  printed as  "normal" vs "inverted," you must write
out this  field.   Do not rely on applications defaulting to what
you want!   PhotometricInterpretation  =  1  is  recommended  for
bilevel (except  for Compression=2)  and grayscale images, due to
popular user  interfaces for changing the brightness and contrast
of images.

PlanarConfiguration
Tag  = 284  (11C)
Type = SHORT
N    = 1

1 =  The sample values for each pixel are stored contiguously, so
that   there    is   a    single   image    plane.            See
PhotometricInterpretation to  determine the  order of the samples
within the  pixel data.   So,  for RGB  data, the  data is stored
RGBRGBRGB...and so on.

2 =  The samples  are stored  in separate  "sample planes."   The
values in StripOffsets and StripByteCounts are then arranged as a
2-dimensional array, with SamplesPerPixel rows and StripsPerImage
columns.      (All of  the columns  for row  0 are  stored first,
followed   by    the   columns    of   row   1,   and   so   on.)
PhotometricInterpretation describes  the type  of  data  that  is
stored in  each sample  plane.   For example,  RGB data is stored
with the  Red samples  in one sample plane, the Green in another,
and the Blue in another.
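
For PlanarConfiguration=2, the position of a given strip within the StripOffsets and StripByteCounts arrays follows from the row-by-row layout described above.  A small sketch, with the hypothetical name strip_index:

```python
def strip_index(sample, strip, strips_per_image):
    """Index into StripOffsets/StripByteCounts for PlanarConfiguration=2.

    sample selects the sample plane (the row of the 2-dimensional array)
    and strip selects the strip within that plane (the column).
    """
    return sample * strips_per_image + strip
```

With 4 strips per image and RGB data, the Red plane uses entries 0 through 3, the Green plane entries 4 through 7, and the Blue plane entries 8 through 11.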

If SamplesPerPixel  is 1,  PlanarConfiguration is irrelevant, and
should not be included.

Default is 1.  See also BitsPerSample, SamplesPerPixel.

Predictor
Tag  = 317 (13D)
Type = SHORT
N    = 1

To be used when Compression=5 (LZW).  See Appendix F.

1 = No prediction scheme used before coding.

Default is 1.

ResolutionUnit
Tag  = 296 (128)
Type = SHORT
N    = 1

To be used with XResolution and YResolution.

1 =  No absolute  unit of  measurement.  Used for images that may
have a  non-square  aspect  ratio,  but  no  meaningful  absolute
dimensions.   The drawback  of ResolutionUnit=1 is that different
applications will  import the  image at different sizes.  Even if
the decision  is quite  arbitrary, it might be better to use dots
per inch  or  dots  per  centimeter,  and  pick  XResolution  and
YResolution such that the aspect ratio is correct and the maximum
dimension of  the image is about four inches (the "four" is quite
arbitrary).

2 = Inch.
3 = Centimeter.

Default is 2.  See also XResolution, YResolution.

RowsPerStrip
Tag  = 278  (116)
Type = SHORT or LONG
N    = 1

The number  of rows  per strip.  The image data is organized into
strips for  fast access  to individual  rows  when  the  data  is
compressed - though this field is valid even  if the  data is not
compressed.

RowsPerStrip and  ImageLength together  tell  us  the  number  of
strips in  the entire  image.   The equation  is StripsPerImage =
(ImageLength + RowsPerStrip - 1) / RowsPerStrip, assuming integer
arithmetic.
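
In a language with an explicit integer division operator, the equation can be written directly.  The function name below is an assumption for illustration.

```python
def strips_per_image(image_length, rows_per_strip):
    """Number of strips, rounding up so a short final strip is counted."""
    return (image_length + rows_per_strip - 1) // rows_per_strip
```

An image of 100 rows with 8 rows per strip has 13 strips, the last of which holds only 4 rows.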

Note that  either SHORT  or LONG  values can  be used  to specify
RowsPerStrip.   SHORT values  may be  used for  small TIFF files.
It should  be noted,  however, that  earlier  TIFF  specification
revisions required  LONG values  and that  some software  may not
expect SHORT values.  See Appendix G for further recommendations.

Default is  2**32 -  1, which  is effectively infinity.  That is,
the entire  image is  one strip.   We  do not  recommend a single
strip, however.   Choose  RowsPerStrip such  that each  strip  is
about 8K  bytes, even  if the  data is  not compressed,  since it
makes buffering  simpler for  readers.   The "8K"  part is pretty
arbitrary, but seems to work well.
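
One possible way to pick RowsPerStrip along these lines is
sketched below.  It assumes that PlanarConfiguration is 1, that
every sample has the same BitsPerSample, and that each
uncompressed row is padded to a byte boundary.  The function name
and the target_bytes parameter are illustrative only.

```python
def choose_rows_per_strip(image_width, bits_per_sample,
                          samples_per_pixel=1, target_bytes=8192):
    """Rows per strip so that an uncompressed strip is about target_bytes."""
    bits_per_row = image_width * bits_per_sample * samples_per_pixel
    bytes_per_row = (bits_per_row + 7) // 8      # round up to whole bytes
    # Never return 0: a very wide row simply becomes its own strip.
    return max(1, target_bytes // bytes_per_row)
```

A 2550-pixel-wide bilevel scan has 319 bytes per row, so this
suggests 25 rows per strip.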

See also ImageLength, StripOffsets, StripByteCounts.

SamplesPerPixel
Tag  = 277  (115)
Type = SHORT
N    = 1

The number  of samples  per pixel.    SamplesPerPixel  is  1  for
bilevel, grayscale, and palette color images.  SamplesPerPixel is
3 for RGB images.

Default = 1.  See also BitsPerSample, PhotometricInterpretation.

StripByteCounts
Tag  = 279  (117)
Type = SHORT or LONG
N    = StripsPerImage for PlanarConfiguration equal to 1.
     = SamplesPerPixel  * StripsPerImage  for PlanarConfiguration
equal to 2

For each strip, the number of bytes in that strip.  The existence
of  this   field  greatly   simplifies  the  chore  of  buffering
compressed data, if the strip size is reasonable.

No default.  See also StripOffsets, RowsPerStrip.

StripOffsets
Tag  = 273  (111)
Type = SHORT or LONG
N    = StripsPerImage for PlanarConfiguration equal to 1.
     = SamplesPerPixel  * StripsPerImage  for PlanarConfiguration
equal to 2

For each  strip, the  byte offset  of that  strip.  The offset is
specified with  respect to  the beginning of the TIFF file.  Note
that this  implies that  each strip has a location independent of
the locations  of other  strips.   This feature may be useful for
editing applications.  This field is the only way for a reader to
find the image data, and hence must exist.

Note that  either SHORT or LONG values can be used to specify the
strip offsets.   SHORT  values  may be used for small TIFF files.
It should  be noted,  however, that  earlier TIFF  specifications
required LONG strip offsets and that some software may not expect
SHORT values.  See Appendix G for further recommendations.

No default.  See also StripByteCounts, RowsPerStrip.

XResolution
Tag  = 282  (11A)
Type = RATIONAL
N    = 1

The number of pixels per ResolutionUnit in the X direction, i.e.,
in the  ImageWidth direction.   It  is, of  course, not mandatory
that the  image be  actually printed  at the size implied by this
parameter.   It is  up to the application to use this information
as it wishes.

No default.  See also YResolution, ResolutionUnit.

YResolution
Tag  = 283  (11B)
Type = RATIONAL
N    = 1

The number of pixels per ResolutionUnit in the Y direction, i.e.,
in the ImageLength direction.

No default.  See also XResolution, ResolutionUnit.

Informational Fields

Informational  fields   are  fields   that  can   provide  useful
information to  a user,  such as where the image came from.  Most
are ASCII  fields.   An application could have some sort of "More
Info..." dialog box to display such information.

Artist
Tag  = 315  (13B)
Type = ASCII

Person who created the image.

If you need to attach a Copyright notice to an image, this is the
place to  do it.  In fact, you may wish to write out the contents
of the field immediately after the 8-byte TIFF header.  Just make
sure your  IFD and field pointers are set accordingly, and you're
all set.

DateTime
Tag  = 306  (132)
Type = ASCII
N    = 20

Date and  time of  image creation.   Use  the format  "YYYY:MM:DD
HH:MM:SS", with hours on a 24-hour clock, and one space character
between the  date and  the time.    The  length  of  the  string,
including the null, is 20 bytes.
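
A writer might build the field value as in the following Python
sketch.  The function name is ours; the length check guards
against any platform whose strftime does not zero-pad the year.

```python
from datetime import datetime


def tiff_datetime(moment: datetime) -> bytes:
    """Format a datetime as the 20-byte DateTime value, null included."""
    text = moment.strftime("%Y:%m:%d %H:%M:%S")   # 24-hour clock
    value = text.encode("ascii") + b"\x00"
    if len(value) != 20:
        raise ValueError("DateTime must be exactly 20 bytes")
    return value
```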

HostComputer
Tag  = 316  (13C)
Type = ASCII

"ENIAC", or whatever.

See also Make, Model, Software.

ImageDescription
Tag  = 270 (10E)
Type = ASCII

For example,  a user  may wish  to attach a comment such as "1988
company picnic" to an image.

It has  been suggested  that  this  is  what  the  newspaper  and
magazine industry calls a "slug."

Make
Tag  = 271  (10F)
Type = ASCII

Manufacturer of the scanner, video digitizer, or whatever.

See also Model, Software.

Model
Tag  = 272  (110)
Type = ASCII

The  model  name/number  of  the  scanner,  video  digitizer,  or
whatever.

This tag is intended for user information only.

See also Make, Software.

Software
Tag  = 305  (131)
Type = ASCII

Name and  release number of the software package that created the
image.

This tag is intended for user information only.

See also Make, Model.

Facsimile Fields

Facsimile fields  may be  useful if  you are  using TIFF to store
facsimile messages  in "raw"  form.  They are not recommended for
use in interchange with desktop publishing applications.

Compression (a basic tag)
Tag  = 259  (103)
Type = SHORT
N    = 1

3 =  Facsimile-compatible CCITT  Group 3, exactly as specified in
"Standardization of  Group 3  facsimile  apparatus  for  document
transmission,"   Recommendation T.4,  Volume VII, Fascicle VII.3,
Terminal Equipment  and Protocols  for  Telematic  Services,  The
International  Telegraph  and  Telephone  Consultative  Committee
(CCITT), Geneva,  1985, pages  16 through  31.   Each strip  must
begin on  a byte  boundary.   (But recall  that an image can be a
single strip.)   Rows  that are  not the first row of a strip are
not required  to begin on a byte boundary.  The data is stored as
bytes,  not words - byte-reversal  is   not  allowed.    See  the
Group3Options field for Group 3 options such as 1D vs 2D coding.

4 =  Facsimile-compatible CCITT  Group 4, exactly as specified in
"Facsimile Coding  Schemes and Coding Control Functions for Group
4 Facsimile Apparatus,"  Recommendation T.6, Volume VII, Fascicle
VII.3, Terminal  Equipment and  Protocols for Telematic Services,
The International  Telegraph and Telephone Consultative Committee
(CCITT), Geneva,  1985, pages  40 through  48.   Each strip  must
begin on  a byte  boundary.  Rows that are not the first row of a
strip are  not required to begin on a byte boundary.  The data is
stored as  bytes, not  words.   See the  Group4Options field  for
Group 4 options.

Group3Options
Tag  = 292  (124)
Type = LONG
N    = 1

See Compression=3.   This  field is  made up  of a set of 32 flag
bits.   Unused bits are expected to be 0.  Bit 0 is the low-order
bit.   It is probably not safe to try to read the file if any bit
of this field is set that you don't know the meaning of.

Bit 0     is 1  for 2-dimensional  coding (else  1-dimensional is
assumed).   For 2-D  coding, if more than one strip is specified,
each strip  must begin  with a  1-dimensionally coded line.  That
is, RowsPerStrip  should be  a multiple  of  "Parameter  K"    as
documented in the CCITT specification.

Bit 1     is 1 if uncompressed mode is used.

Bit 2     is 1  if fill  bits have been added as necessary before
EOL codes  such that  EOL always  ends on  a byte  boundary, thus
ensuring an  eol-sequence of  a 1 byte preceded by a zero nibble:
xxxx-0000 0000-0001.

Default  is   0,  for  basic  1-dimensional  coding.    See  also
Compression.
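
A reader could decode the flag bits roughly as follows.  This is a
Python sketch only; the constant and function names are ours.  It
takes the cautious route suggested above and refuses the file if
any bit it does not understand is set.

```python
GROUP3_2D_CODING = 1 << 0      # bit 0: 2-dimensional coding
GROUP3_UNCOMPRESSED = 1 << 1   # bit 1: uncompressed mode
GROUP3_FILL_BITS = 1 << 2      # bit 2: EOL codes end on a byte boundary
GROUP3_KNOWN_BITS = GROUP3_2D_CODING | GROUP3_UNCOMPRESSED | GROUP3_FILL_BITS


def parse_group3_options(value: int) -> dict:
    """Decode a Group3Options LONG; reject flag bits we do not know."""
    if not 0 <= value < 2**32:
        raise ValueError("Group3Options is a 32-bit field")
    unknown = value & ~GROUP3_KNOWN_BITS
    if unknown:
        raise ValueError("unknown Group3Options bits: %#x" % unknown)
    return {
        "two_dimensional": bool(value & GROUP3_2D_CODING),
        "uncompressed_mode": bool(value & GROUP3_UNCOMPRESSED),
        "fill_bits": bool(value & GROUP3_FILL_BITS),
    }
```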

Group4Options
Tag  =  293  (125)
Type = LONG
N    = 1

See Compression=4.   This  field is  made up  of a set of 32 flag
bits.   Unused bits are expected to be 0.  Bit 0 is the low-order
bit.   It is probably not safe to try to read the file if any bit
of this  field is  set that  you don't know the meaning of.  Gray
scale and color coding schemes are under study, and will be added
when finalized.

For 2-D  coding, each  strip is  encoded as if it were a separate
image.   In particular, each strip begins on a byte boundary; and
the coding  for the first row of a strip is encoded independently
of the  previous row,  using horizontal codes, as if the previous
row is  entirely white.   Each strip ends with the 24-bit end-of-
facsimile block (EOFB).

Bit 0     is unused.
Bit 1     is 1 if uncompressed mode is used.

Default is  0, for  basic 2-dimensional  binary compression.  See
also Compression.

Document Storage and Retrieval Fields

These fields  may be  useful for  document storage  and retrieval
applications.   They are  not recommended  for use in interchange
with desktop publishing applications.

DocumentName
Tag  = 269  (10D)
Type = ASCII

The name of the document from which this image was scanned.

See also PageName.

PageName
Tag  = 285  (11D)
Type = ASCII

The name of the page from which this image was scanned.

See also DocumentName.

No default.

PageNumber
Tag  = 297  (129)
Type = SHORT
N    = 2

This tag is used to specify page numbers of a multiple page (e.g.
facsimile) document.   Two SHORT values are specified.  The first
value is the page number; the second value is the total number of
pages in the document.

Note that  pages need  not appear  in numerical order.  The first
page is 0 (zero).

No default.

XPosition
Tag  = 286  (11E)
Type = RATIONAL

The X  offset of  the left side of the image, with respect to the
left side of the page, in ResolutionUnits.

No default.  See also YPosition.

YPosition
Tag  = 287  (11F)
Type = RATIONAL

The Y  offset of the top of the image, with respect to the top of
the page, in ResolutionUnits.  In the TIFF coordinate scheme, the
positive Y  direction  is  down,  so  that  YPosition  is  always
positive.

No default.  See also XPosition.

No Longer Recommended

These fields  are not  recommended except  perhaps for local use.
They should  not be used for image interchange.  They have either
been superseded  by other fields, have been found to have serious
drawbacks, or are simply not as useful as once thought.  They may
be dropped entirely from a future revision of the specification.

CellLength
Tag  = 265  (109)
Type = SHORT
N    = 1

The length, in 1-bit samples, of the dithering/halftoning matrix.
Assumes that Threshholding = 2.

This field,  plus CellWidth  and Threshholding,  are  problematic
because they  cannot safely be used to reverse-engineer grayscale
image data  out of dithered/halftoned black-and-white data, which
is their  only plausible  purpose.  The only "right" way to do it
is to  not bother  with anything  like these  fields, and instead
write  some  sophisticated  pattern-matching  software  that  can
handle screen  angles that  are not  multiples of 45 degrees, and
other such challenging dithered/halftoned data.

So we  do not  recommend trying  to convert dithered or halftoned
data into  grayscale data.   Dithered  and halftoned data require
careful treatment  to avoid  "stretch marks," but it can be done.
If you  want grayscale images, get them directly from the scanner
or frame grabber or whatever.

No default.  See also Threshholding.

CellWidth
Tag  = 264  (108)
Type = SHORT
N    = 1

The width, in 1-bit samples, of the dithering/halftoning matrix.

No default.   See  also Threshholding.    See  the  comments  for
CellLength.

FillOrder
Tag  = 266  (10A)
Type = SHORT
N    = 1

The order of data values within a byte.
1 = most significant bits of the byte are filled first.  That is,
data values  (or code  words) are  ordered from high order bit to
low order bit within a byte.
2 =  least significant  bits are  filled  first.    Since  little
interest has  been expressed  in least-significant  fill order to
date, and since it is easy and inexpensive for writers to reverse
bit order (use a 256-byte lookup table), we recommend FillOrder=2
for private (non-interchange) use only.

Default is FillOrder = 1.
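
The 256-byte lookup table mentioned above might be built and used
like this in Python.  The names are ours.

```python
def _reverse_bits(byte: int) -> int:
    """Reverse the order of the 8 bits in one byte."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


# One table entry per possible byte value.
BIT_REVERSE_TABLE = bytes(_reverse_bits(value) for value in range(256))


def reverse_fill_order(data: bytes) -> bytes:
    """Convert data between FillOrder=1 and FillOrder=2."""
    return data.translate(BIT_REVERSE_TABLE)
```

The same routine converts in either direction, since reversing
twice gives back the original bytes.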

FreeByteCounts
Tag  = 289  (121)
Type = LONG

For each  "free block"   in  the file, the number of bytes in the
block.

TIFF  readers   can  ignore  FreeOffsets  and  FreeByteCounts  if
present.

FreeOffsets and  FreeByteCounts do  not constitute a remapping of
the logical address space of the file.

Since this  information can  be generated  by scanning  the IFDs,
StripOffsets, and StripByteCounts, FreeByteCounts and FreeOffsets
are not needed.

In addition, it is not clear what should happen if FreeByteCounts
and FreeOffsets exist in more than one IFD.
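
To illustrate why these fields are redundant, the following Python
sketch derives the free blocks from a list of the regions that are
in use.  It assumes the caller has already gathered an (offset,
byte count) pair for the header, every IFD, every out-of-line
field value, and every strip.  It also assumes that unused space
at the end of the file counts as a free block.  Both assumptions,
and the function name, are ours.

```python
def free_blocks(file_size, used_extents):
    """Return (offset, byte_count) pairs for the gaps between used regions."""
    free = []
    cursor = 0                                  # first byte not yet accounted for
    for offset, count in sorted(used_extents):
        if offset > cursor:
            free.append((cursor, offset - cursor))
        cursor = max(cursor, offset + count)    # tolerate overlapping extents
    if cursor < file_size:
        free.append((cursor, file_size - cursor))
    return free
```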

See also FreeOffsets.

FreeOffsets
Tag  = 288  (120)
Type = LONG

For each "free block"  in the file, its byte offset.

See also FreeByteCounts.

MaxSampleValue
Tag  = 281  (119)
Type = SHORT
N    = SamplesPerPixel

The maximum  used sample  value.    For  example,  if  the  image
consists of  6-bit data  low-order-justified  into  8-bit  bytes,
MaxSampleValue will  be no  greater than 63.  This field is not
to be  used to  affect the  visual appearance  of the  image when
displayed.   Nor should  the values  of  this  field  affect  the
interpretation of  any other  field.    Use  it  for  statistical
purposes only.

Default is 2**(BitsPerSample) - 1.

MinSampleValue
Tag  = 280  (118)
Type = SHORT
N    = SamplesPerPixel

The minimum  used sample  value.  This field is not to be used to
affect the  visual appearance  of the  image when displayed.  See
the comments for MaxSampleValue.

Default is 0.

SubfileType
Tag  = 255  (FF)
Type = SHORT
N    = 1

A general  indication of  the kind  of data  that is contained in
this subfile.  Currently defined values are:

1 =  full  resolution  image  data - ImageWidth, ImageLength, and
StripOffsets are required fields; and
2 =  reduced resolution image data - ImageWidth, ImageLength, and
StripOffsets are  required fields.   It is further assumed that a
reduced resolution  image is  a reduced  version  of  the  entire
extent of the corresponding full resolution data.
3 =  single page  of a  multi-page image  (see the PageNumber tag
description).

Note that several image types can be found in a single TIFF file,
with each subfile described by its own IFD.

No default.

Continued use  of this  field is not recommended.  Writers should
instead use the new and more general NewSubfileType field.

Orientation
Tag  = 274 (112)
Type = SHORT
N    = 1

1 =  The 0th  row represents the visual top of the image, and the
0th column represents the visual left hand side.
2 =  The 0th  row represents the visual top of the image, and the
0th column represents the visual right hand side.
3 =  The 0th  row represents  the visual bottom of the image, and
the 0th column represents the visual right hand side.
4 =  The 0th  row represents  the visual bottom of the image, and
the 0th column represents the visual left hand side.
5 =  The 0th  row represents  the visual  left hand  side of  the
image, and the 0th column represents the visual top.
6 =  The 0th  row represents  the visual  right hand  side of the
image, and the 0th column represents the visual top.
7 =  The 0th  row represents  the visual  right hand  side of the
image, and the 0th column represents the visual bottom.
8 =  The 0th  row represents  the visual  left hand  side of  the
image, and the 0th column represents the visual bottom.

Default is 1.
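
The eight cases can be summarized as a mapping from a stored (row,
column) position to a visual (x, y) position, with x counted from
the visual left and y from the visual top.  The Python sketch
below is one way to write that mapping; the function name and
argument order are ours.

```python
def visual_position(orientation, row, col, image_length, image_width):
    """Map a stored (row, col) to visual (x, y) for Orientation 1..8."""
    if not 1 <= orientation <= 8:
        raise ValueError("Orientation must be 1 through 8")
    # The 0th row is the visual bottom (3, 4) or right hand side (6, 7).
    if orientation in (3, 4, 6, 7):
        row = image_length - 1 - row
    # The 0th column is the visual right hand side (2, 3) or bottom (7, 8).
    if orientation in (2, 3, 7, 8):
        col = image_width - 1 - col
    if orientation <= 4:
        return col, row      # rows run down the page, columns across
    return row, col          # values 5..8: rows run across the page
```

For values 5 through 8 the visual image is ImageLength pixels wide
and ImageWidth pixels high.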

This field is recommended for private (non-interchange) use only.
It is extremely costly for most readers to perform image rotation
"on the  fly," i.e.,  when importing  and printing;  and users of
most  desktop  publishing  applications  do  not  expect  a  file
imported by the application to be altered permanently in any way.

Threshholding
Tag  = 263  (107)
Type = SHORT
N    = 1

1 = a bilevel "line art"  scan.  BitsPerSample must be 1.
2 =  a "dithered"   scan, usually of continuous tone data such as
photographs. BitsPerSample must be 1.
3 = Error Diffused.

Default is Threshholding = 1.  See also CellWidth, CellLength.

4) Private Fields

An organization  may wish to store information that is meaningful
to only that organization in a TIFF file.  Tags numbered 32768 or
higher  are  reserved  for  that  purpose.    Upon  request,  the
administrator will  allocate and register a block of private tags
for an  organization, to  avoid  possible  conflicts  with  other
organizations.   Tags are  normally allocated  in blocks of five.
If that is not enough, feel free to ask for more. You do not need
to tell  the TIFF administrator or anyone else what you are going
to use them for.

Private enumerated  values  can  be  accommodated  in  a  similar
fashion.   For example,  you may  wish to  experiment with  a new
compression scheme  within TIFF.   Enumeration constants numbered
32768 or  higher are  reserved for  private usage.  Upon request,
the  administrator   will  allocate   and  register  a  block  of
enumerated values  for a  particular field  (Compression, in  our
example), to avoid possible conflicts.

Tags and  values which  are allocated in the private number range
are not  prohibited from  being included  in a future revision of
this specification.   Several  such instances can be found in the
TIFF specification.

Do not  choose your  own tag  numbers.  If you do, it could cause
serious problems some day.

5) Image File Format Issues

In the  quest to  give users no reason NOT to buy a product, some
scanning and  image editing  applications overwhelm users with an
incredible number  of "Save  As..." options.  Let's get rid of as
many of  these as  we possibly  can.   For example, a single TIFF
choice should  suffice once most major readers are supporting the
three TIFF compression schemes; then writers can always compress.
And given  TIFF's flexibility,  including private  tag and  image
editing  support   features,  there  does  not  seem  to  be  any
legitimate reason  for continuing  to  write  image  files  using
proprietary formats.

Along the  same lines,  there is no excuse for making a user have
to know  the file  format of  a file  that is  to be  read by  an
application program.   TIFF  files, as  well as  most other  file
formats, contain  sufficient information  to enable  software  to
automatically and  reliably distinguish  one type  of  file  from
another.
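
For TIFF itself, the check is short.  The Python sketch below
relies on the 8-byte header described in the main body of this
specification: a byte order mark of "II" or "MM", followed by the
value 42 in that byte order.  The function name is ours.

```python
def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first bytes of a file look like a TIFF header."""
    if len(header) < 4:
        return False
    if header[:2] == b"II":                      # least significant byte first
        return int.from_bytes(header[2:4], "little") == 42
    if header[:2] == b"MM":                      # most significant byte first
        return int.from_bytes(header[2:4], "big") == 42
    return False
```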

6) For Further Information

Contact the  Aldus Developers' Desk for sample TIFF files, source
code fragments,  and  a  list  of  features  that  are  currently
supported in  Aldus products.   The Aldus Developers' Desk is the
current "TIFF administrator."

Various TIFF  related  aids  are  found  in  Microsoft's  Windows
Developers Toolkit for developers writing Windows applications.

Finally, a  number of  scanner vendors are providing various TIFF
services, such  as helping  to distribute  the TIFF specification
and answering  TIFF questions.   Contact  the appropriate product
manager or developer support service group.

Appendix A:  Tag Structure Rationale

A file  format is  defined by  both form (structure) and content.
The content of TIFF consists of definitions of individual fields.
It is therefore the content that we are ultimately interested in.
The structure  merely tells  us how  to find the fields.  Yet the
structure deserves  serious consideration for a number of reasons
that are not at all obvious at first glance.  Since the structure
described  herein   departs  significantly   from  several  other
approaches, it may be useful to discuss the rationale behind it.

The simplest,  most straightforward  structure for something like
an image  file is  a positional  format.  In a positional scheme,
the location  of the  data defines  what the  data  means.    For
example, the  field for  "number of  rows" might  begin  at  byte
offset 30 in the image file.

This approach  is simple and easy to implement and is perfect for
static environments.   But  if a  significant amount  of  ongoing
change must  be accommodated,  subtle problems  begin to  appear.
For example,  suppose that  a field  must be superseded by a new,
more general  field.  You could bump a version number to flag the
change.   Then  new  software  has  no  problem  doing  something
sensible with  old data, and all old software will reject the new
data, even  software that  didn't care about the old field.  This
may seem like no more than a minor annoyance at first glance, but
causing old  software to  break more  often than  it would really
need to  can be very costly and, inevitably, causes much gnashing
of teeth among customers.

Furthermore, it  can be  avoided.   One approach  is to  store  a
"valid" flag  bit for each field.  Now you don't have to bump the
version number,  as long  as you  can put the new field somewhere
that doesn't  disturb any  of the  old fields.  Old software that
didn't care about that old field anyway can continue to function.
(Old software  that did  care will of course have to give up, but
this is an unavoidable price to be paid for the sake of progress,
barring total omniscience.)

Another problem  that crops  up frequently is that certain fields
are likely  to make  sense only  if  other  fields  have  certain
values.   This is not such a serious problem in practice; it just
makes things  more confusing.   Nevertheless,  we note  that  the
"valid" flag bits described in the previous paragraph can help to
clarify the situation.

Field-dumping  programs   can  be  very  helpful  for  diagnostic
purposes.   A desirable  characteristic of such a program is that
it doesn't  have to  know much  about what  it is  dumping.    In
particular, it would be nice if the program could dump ASCII data
in ASCII  format, integer  data in  integer format,  and  so  on,
without having  to teach  the program  about new  fields all  the
time.   So maybe  we should  add a  "data type"  component to our
fields, plus  information on  how long  the field is, so that our
dump program can walk through the fields without knowing what the
fields "mean."

But note  that if we add one more component to each field, namely
a tag  that tells  what the field means, we can dispense with the
"valid" flag  bits, and  we can  also avoid  wasting space on the
non-valid fields in the file.  Simple image creation applications
can write out several fields and be done.

We have  now derived  the essentials  of a  tag-based image  file
format.
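
To make the point about field-dumping programs concrete, here is a
Python sketch of a dumper that knows the data types but nothing
about what any tag means.  It assumes the layout given in the main
body of this specification: an 8-byte header, then an IFD made of
a 2-byte entry count followed by 12-byte entries (2-byte tag,
2-byte type, 4-byte count, 4-byte value or offset).  The function
and table names are ours, and only the first IFD is read.

```python
import struct

# Field types in this revision: type code -> (name, bytes per value).
FIELD_TYPES = {1: ("BYTE", 1), 2: ("ASCII", 1), 3: ("SHORT", 2),
               4: ("LONG", 4), 5: ("RATIONAL", 8)}
INTEGER_CODES = {"BYTE": "B", "SHORT": "H", "LONG": "L"}


def dump_first_ifd(data: bytes):
    """Return (tag, type name, count, value) for each field in the first IFD."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]         # byte order mark
    magic, ifd_offset = struct.unpack_from(order + "HL", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    fields = []
    for index in range(entry_count):
        entry = ifd_offset + 2 + 12 * index
        tag, type_code, count = struct.unpack_from(order + "HHL", data, entry)
        if type_code not in FIELD_TYPES:
            fields.append((tag, "UNKNOWN", count, None))
            continue
        name, size = FIELD_TYPES[type_code]
        position = entry + 8                           # value stored in the entry
        if size * count > 4:                           # too big: entry holds an offset
            (position,) = struct.unpack_from(order + "L", data, position)
        raw = data[position:position + size * count]
        if name == "ASCII":
            value = raw.rstrip(b"\x00").decode("ascii", "replace")
        elif name == "RATIONAL":
            longs = struct.unpack(order + "%dL" % (2 * count), raw)
            value = list(zip(longs[0::2], longs[1::2]))   # (numerator, denominator)
        else:
            code = "%d%s" % (count, INTEGER_CODES[name])
            value = list(struct.unpack(order + code, raw))
        fields.append((tag, name, count, value))
    return fields
```

Note that a new tag never requires a change to this routine; only
a new data type would.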

Finally, a  caveat.  A tag-based scheme cannot guarantee painless
growth.   But it  does provide  a useful  tool to  assist in  the
process.

Appendix B:  Data Compression - Scheme 2

Abstract

This document  describes a  method for  compressing bilevel  data
that is  based on  the CCITT  Group 3  1D  facsimile  compression
scheme.

References

1.   "Standardization of Group 3 facsimile apparatus for document
transmission," Recommendation  T.4, Volume  VII, Fascicle  VII.3,
Terminal Equipment  and Protocols  for  Telematic  Services,  The
International  Telegraph  and  Telephone  Consultative  Committee
(CCITT), Geneva, 1985, pages 16 through 31.
2.   "Facsimile Coding  Schemes and  Coding Control Functions for
Group 4  Facsimile Apparatus,"  Recommendation T.6,  Volume  VII,
Fascicle VII.3,  Terminal Equipment  and Protocols  for Telematic
Services, The  International Telegraph and Telephone Consultative
Committee (CCITT), Geneva, 1985, pages 40 through 48.

We do  not believe that these documents are necessary in order to
implement Compression=2.   We  have included  (verbatim  in  most
places) all the pertinent information in this Appendix.  However,
if you  wish to  order the  documents, you  can  write  to  ANSI,
Attention: Sales,  1430 Broadway, New York, N.Y., 10018.  Ask for
the publication listed above; it contains both Recommendations T.4
and T.6.

Relationship to the CCITT Specifications

The  CCITT   Group  3   and  Group   4  specifications   describe
communications protocols for a particular class of devices.  They
are not  by themselves sufficient to describe a disk data format.
Fortunately, however,  the CCITT  coding schemes  can be  readily
adapted to this different environment.  The following is one such
adaptation.   Most of  the language  is copied  directly from the
CCITT specifications.

Coding Scheme

A line  (row) of  data is composed of a series of variable length
code words.  Each code word represents a run length of either all
white or  all black.   (Actually,  more than one code word may be
required to  code a  given run,  in a  manner  described  below.)
White runs and black runs alternate.

In order  to ensure  that the  receiver (decompressor)  maintains
color synchronization, all data lines will begin with a white run
length code  word set.   If  the actual  scan line  begins with a
black run,  a white  run length  of zero  will be sent (written).
Black or  white run  lengths are  defined by  the code  words  in
Tables 1  and 2.   The  code words are of two types:  Terminating
code  words   and  Make-up  code  words.    Each  run  length  is
represented by  zero or  more  Make-up  code  words  followed  by
exactly one Terminating code word.
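
The alternation rule can be illustrated with a short Python
sketch that turns one row of pixels into run lengths.  It assumes
the row is given as a sequence of 0 (white) and 1 (black) values;
which stored bit value means white depends on
PhotometricInterpretation.  The function name is ours.

```python
def row_to_runs(pixels):
    """Return alternating run lengths for one row, white run first."""
    runs = []
    current = 0          # every row starts with a white run
    length = 0
    for pixel in pixels:
        if pixel == current:
            length += 1
        else:
            runs.append(length)   # may be a zero-length white run
            current = pixel
            length = 1
    runs.append(length)
    return runs
```

A row that starts with black pixels therefore begins with a white
run of length zero.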

Run lengths  in the  range of  0 to  63 pels (pixels) are encoded
with their appropriate Terminating code word.  Note that there is
a different list of code words for black and white run lengths.

Run lengths in the range of 64 to 2623 (2560+63) pels are encoded
first by  the Make-up  code word representing the run length that
is nearest  to, not  longer than,  that required.   This  is then
followed by the Terminating code word representing the difference
between the required run length and the run length represented by
the Make-up code.

Run lengths of 2624 pels or longer are coded first by the Make-up
code of 2560.  If the
remaining part  of the run (after the first Make-up code of 2560)
is 2560  pels or  greater, additional Make-up code(s) of 2560 are
issued until the remaining part of the run becomes less than 2560
pels.   Then  the  remaining  part  of  the  run  is  encoded  by
Terminating code  or  by  Make-up  code  plus  Terminating  code,
according to the range mentioned above.
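
The three ranges above can be expressed as a single routine that
splits a run length into the Make-up lengths and the Terminating
length to be looked up in the tables below.  This Python sketch
does not emit any bits; the function name and return shape are
ours.

```python
def split_run(run_length: int):
    """Return (list of Make-up run lengths, Terminating run length)."""
    if run_length < 0:
        raise ValueError("run length cannot be negative")
    make_ups = []
    remaining = run_length
    if remaining >= 2624:
        # Long runs start with one Make-up code of 2560, then as many
        # more as needed to bring the remainder below 2560.
        make_ups.append(2560)
        remaining -= 2560
        while remaining >= 2560:
            make_ups.append(2560)
            remaining -= 2560
    if remaining >= 64:
        # Largest multiple of 64 that is not longer than the run.
        make_up = (remaining // 64) * 64
        make_ups.append(make_up)
        remaining -= make_up
    return make_ups, remaining    # remaining is 0..63
```

For example, a run of 5000 pels is coded as Make-up 2560, Make-up
2432, and Terminating 8.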

It is  considered an  unrecoverable error  if the  sum of the run
lengths for  a line  does not  equal the  value of the ImageWidth
field.

New rows always begin on the next available byte boundary.

No EOL  code words  are used.   No fill bits are used, except for
the ignored  bits at  the end  of the last byte of a row.  RTC is
not used.

Table 1/T.4  Terminating codes

White          Black
run  Code      run  Code
length word    length word
------ ----    ------ ----

 0   00110101   0   0000110111
 1   000111     1   010
 2   0111       2   11
 3   1000       3   10
 4   1011       4   011
 5   1100       5   0011
 6   1110       6   0010
 7   1111       7   00011
 8   10011      8   000101
 9   10100      9   000100
10   00111     10   0000100
11   01000     11   0000101
12   001000    12   0000111
13   000011    13   00000100
14   110100    14   00000111
15   110101    15   000011000
16   101010    16   0000010111
17   101011    17   0000011000
18   0100111   18   0000001000
19   0001100   19   00001100111
20   0001000   20   00001101000
21   0010111   21   00001101100
22   0000011   22   00000110111
23   0000100   23   00000101000
24   0101000   24   00000010111
25   0101011   25   00000011000
26   0010011   26   000011001010
27   0100100   27   000011001011
28   0011000   28   000011001100
29   00000010  29   000011001101
30   00000011  30   000001101000
31   00011010  31   000001101001
32   00011011  32   000001101010
33   00010010  33   000001101011
34   00010011  34   000011010010
35   00010100  35   000011010011
36   00010101  36   000011010100
37   00010110  37   000011010101
38   00010111  38   000011010110
39   00101000  39   000011010111
40   00101001  40   000001101100
41   00101010  41   000001101101
42   00101011  42   000011011010
43   00101100  43   000011011011
44   00101101  44   000001010100
45   00000100  45   000001010101
46   00000101  46   000001010110
47   00001010  47   000001010111
48   00001011  48   000001100100
49   01010010  49   000001100101
50   01010011  50   000001010010
51   01010100  51   000001010011
52   01010101  52   000000100100
53   00100100  53   000000110111
54   00100101  54   000000111000
55   01011000  55   000000100111
56   01011001  56   000000101000
57   01011010  57   000001011000
58   01011011  58   000001011001
59   01001010  59   000000101011
60   01001011  60   000000101100
61   00110010  61   000001011010
62   00110011  62   000001100110
63   00110100  63   000001100111

Table 2/T.4  Make-up codes

White          Black
run  Code      run  Code
length word    length word
------ ----    ------ ----

  64 11011       64 0000001111
 128 10010      128 000011001000
 192 010111     192 000011001001
 256 0110111    256 000001011011
 320 00110110   320 000000110011
 384 00110111   384 000000110100
 448 01100100   448 000000110101
 512 01100101   512 0000001101100
 576 01101000   576 0000001101101
 640 01100111   640 0000001001010
 704 011001100  704 0000001001011
 768 011001101  768 0000001001100
 832 011010010  832 0000001001101
 896 011010011  896 0000001110010
 960 011010100  960 0000001110011
1024 011010101 1024 0000001110100
1088 011010110 1088 0000001110101
1152 011010111 1152 0000001110110
1216 011011000 1216 0000001110111
1280 011011001 1280 0000001010010
1344 011011010 1344 0000001010011
1408 011011011 1408 0000001010100
1472 010011000 1472 0000001010101
1536 010011001 1536 0000001011010
1600 010011010 1600 0000001011011
1664 011000    1664 0000001100100
1728 010011011 1728 0000001100101
 EOL 000000000001    EOL 000000000001

Additional make-up codes

White and Black   Make-up
run length        code word
----------        ---------

1792 00000001000
1856 00000001100
1920 00000001101
1984 000000010010
2048 000000010011
2112 000000010100
2176 000000010101
2240 000000010110
2304 000000010111
2368 000000011100
2432 000000011101
2496 000000011110
2560 000000011111

Appendix C:  Data Compression - Scheme 32773 - "PackBits"

Abstract

This document  describes a  simple compression scheme for bilevel
scanned and paint type files.

Motivation

The TIFF  specification defines  a number of compression schemes.
Compression type  1 is  really no  compression, other  than basic
pixel  packing.     Compression   type  2,   based  on  CCITT  1D
compression,  is   powerful,  but   not  trivial   to  implement.
Compression type  5 is  typically very effective for most bilevel
images, as  well as  many deeper images such as palette color and
grayscale images, but is also not trivial to implement.  PackBits
is a simple but often effective alternative.

Description

Several good schemes were already in use in various settings.  We
somewhat arbitrarily picked the Macintosh PackBits scheme.  It is
byte oriented,  so there  is no problem with word alignment.  And
it has a good worst case behavior (at most 1 extra byte for every
128 input  bytes).    For  Macintosh  users,  there  are  toolbox
utilities PackBits  and UnPackBits that will do the work for you,
but it is easy to implement your own routines.

A pseudo code fragment to unpack might look like this:

Loop  until  you  get  the  number  of  unpacked  bytes  you  are
expecting:
     Read the next source byte into n, as a signed 8-bit value.
     If n is between 0 and 127 inclusive, copy the next n+1 bytes
     literally.
     Else if  n is  between -127  and -1 inclusive, copy the next
     byte -n+1 times.
     Else if n is -128, noop.
Endloop
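
The same loop might look like this in Python.  This is a sketch,
not a reference implementation: the function name is ours, and it
assumes the caller passes the compressed bytes for one row (or one
strip) together with the number of unpacked bytes expected.

```python
def unpack_bits(source: bytes, expected_length: int) -> bytes:
    """Decode PackBits data until expected_length bytes are produced."""
    out = bytearray()
    pos = 0
    while len(out) < expected_length:
        n = source[pos]               # raises IndexError on truncated input
        pos += 1
        if n > 127:
            n -= 256                  # reinterpret as a signed 8-bit value
        if n >= 0:
            out += source[pos:pos + n + 1]           # literal run of n+1 bytes
            pos += n + 1
        elif n != -128:
            out += source[pos:pos + 1] * (1 - n)     # next byte, -n+1 times
            pos += 1
        # n == -128 is a no-op
    if len(out) != expected_length:
        raise ValueError("run crosses the expected end of the data")
    return bytes(out)
```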

In the  inverse routine,  it's best to encode a 2-byte repeat run
as a replicate run except when preceded and followed by a literal
run, in  which case it's best to merge the three into one literal
run.  Always encode 3-byte repeats as replicate runs.

So that's the algorithm.  Here are some other rules:

o    Each row  must be packed separately.  Do not compress across
row boundaries.

o    The number  of uncompressed  bytes per  row is defined to be
(ImageWidth +  7) / 8.  If the uncompressed bitmap is required to
have an  even number  of bytes  per row,  decompress  into  word-
aligned buffers.

o    If a  run is  larger  than  128  bytes,  simply  encode  the
remainder of the run as one or more additional replicate runs.
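
A packing routine that follows these rules could be sketched in
Python as below.  It packs one row at a time, caps every run at
128 bytes, and merges a 2-byte repeat into the surrounding literal
run when it is preceded by literal bytes and followed by a single
non-repeating byte.  That last test is a simplification of the
"preceded and followed by a literal run" rule, so the output is
valid but not guaranteed to be the smallest possible.  The
function name is ours.

```python
def pack_bits_row(row: bytes) -> bytes:
    """PackBits-encode a single row of uncompressed bytes."""
    # Split the row into [value, count] runs of at most 128 bytes each.
    runs = []
    for value in row:
        if runs and runs[-1][0] == value and runs[-1][1] < 128:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])

    out = bytearray()
    literal = bytearray()             # literal bytes not yet written

    def flush_literal():
        for start in range(0, len(literal), 128):
            chunk = literal[start:start + 128]
            out.append(len(chunk) - 1)        # n = count - 1, in 0..127
            out.extend(chunk)
        literal.clear()

    for index, (value, count) in enumerate(runs):
        next_is_single = index + 1 < len(runs) and runs[index + 1][1] == 1
        sandwiched = count == 2 and len(literal) > 0 and next_is_single
        if count == 1 or sandwiched:
            literal.extend([value] * count)
        else:
            flush_literal()
            out.append(257 - count)           # byte form of n = -(count - 1)
            out.append(value)
    flush_literal()
    return bytes(out)
```

Call it once per row and concatenate the results, so that no run
ever crosses a row boundary.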

When  PackBits   data  is  uncompressed,  the  result  should  be
interpreted as per compression type 1 (no compression).

Appendix D

Appendix D  has been  deleted.   It formerly contained guidelines
for passing  TIFF files on the Microsoft Windows Clipboard.  This
was judged to not be a good idea, in light of the ever-increasing
size of  scanned images.   Applications are instead encouraged to
employ file-based mechanisms to exchange TIFF data.  Aldus
PageMaker, for example, implements a "File Place" command to
allow TIFF files to be imported.

Appendix E:  Numerical List of TIFF Tags

NewSubfileType
Tag  =  254  (FE)
Type = LONG
N    = 1

SubfileType
Tag  = 255  (FF)
Type = SHORT
N    = 1

ImageWidth
Tag  = 256  (100)
Type = SHORT or LONG
N    = 1

ImageLength
Tag  = 257  (101)
Type = SHORT or LONG
N    = 1

BitsPerSample
Tag  = 258  (102)
Type = SHORT
N    = SamplesPerPixel

Compression
Tag  = 259  (103)
Type = SHORT
N    = 1

PhotometricInterpretation
Tag  = 262  (106)
Type = SHORT
N    = 1

Threshholding
Tag  = 263  (107)
Type = SHORT
N    = 1

CellWidth
Tag  = 264  (108)
Type = SHORT
N    = 1

CellLength
Tag  = 265  (109)
Type = SHORT
N    = 1

FillOrder
Tag  = 266  (10A)
Type = SHORT
N    = 1

DocumentName
Tag  = 269  (10D)
Type = ASCII

ImageDescription
Tag  = 270 (10E)
Type = ASCII

Make
Tag  = 271  (10F)
Type = ASCII

Model
Tag  = 272  (110)
Type = ASCII

StripOffsets
Tag  = 273  (111)
Type = SHORT or LONG
N    = StripsPerImage for PlanarConfiguration equal to 1.
     = SamplesPerPixel  * StripsPerImage  for PlanarConfiguration
equal to 2

Orientation
Tag  = 274 (112)
Type = SHORT
N    = 1

SamplesPerPixel
Tag  = 277  (115)
Type = SHORT
N    = 1

RowsPerStrip
Tag  = 278  (116)
Type = SHORT or LONG
N    = 1

StripByteCounts
Tag  = 279  (117)
Type = LONG or SHORT
N    = StripsPerImage for PlanarConfiguration equal to 1.
     = SamplesPerPixel  * StripsPerImage  for PlanarConfiguration
equal to 2.

MinSampleValue
Tag  = 280  (118)
Type = SHORT
N    = SamplesPerPixel

MaxSampleValue
Tag  = 281  (119)
Type = SHORT
N    = SamplesPerPixel

XResolution
Tag  = 282  (11A)
Type = RATIONAL
N    = 1

YResolution
Tag  = 283  (11B)
Type = RATIONAL
N    = 1

PlanarConfiguration
Tag  = 284  (11C)
Type = SHORT
N    = 1

PageName
Tag  = 285  (11D)
Type = ASCII

XPosition
Tag  = 286  (11E)
Type = RATIONAL

YPosition
Tag  = 287  (11F)
Type = RATIONAL

FreeOffsets
Tag  = 288  (120)
Type = LONG

FreeByteCounts
Tag  = 289  (121)
Type = LONG

GrayResponseUnit
Tag  = 290 (122)
Type = SHORT
N    = 1

GrayResponseCurve
Tag  = 291 (123)
Type = SHORT
N    = 2**BitsPerSample

Group3Options
Tag  = 292  (124)
Type = LONG
N    = 1

Group4Options
Tag  =  293  (125)
Type = LONG
N    = 1

ResolutionUnit
Tag  = 296 (128)
Type = SHORT
N    = 1

PageNumber
Tag  = 297  (129)
Type = SHORT
N    = 2

ColorResponseCurves
Tag  = 301 (12D)
Type = SHORT
N    = 3 * (2**BitsPerSample)

Software
Tag  = 305  (131)
Type = ASCII

DateTime
Tag  = 306  (132)
Type = ASCII
N    = 20

Artist
Tag  = 315  (13B)
Type = ASCII

HostComputer
Tag  = 316  (13C)
Type = ASCII

Predictor
Tag  = 317 (13D)
Type = SHORT
N    = 1

WhitePoint
Tag  = 318 (13E)
Type = RATIONAL
N    = 2

PrimaryChromaticities
Tag  = 319 (13F)
Type = RATIONAL
N    = 6

ColorMap
Tag  = 320 (140)
Type = SHORT
N    = 3 * (2**BitsPerSample)
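
As a small illustration of how a reader might use this list, the
following Python sketch maps a few of the tag numbers above to
their names and prints them in the same "decimal (hex)" form.
The table is deliberately partial, and the names TIFF_TAGS and
describe_tag are our own assumptions.

```python
# A partial tag table copied from the list above.
TIFF_TAGS = {
    254: "NewSubfileType",
    256: "ImageWidth",
    257: "ImageLength",
    258: "BitsPerSample",
    259: "Compression",
    262: "PhotometricInterpretation",
    273: "StripOffsets",
    277: "SamplesPerPixel",
    278: "RowsPerStrip",
    279: "StripByteCounts",
    282: "XResolution",
    283: "YResolution",
    296: "ResolutionUnit",
    317: "Predictor",
    320: "ColorMap",
}


def describe_tag(tag: int) -> str:
    """Return 'Name = decimal (hex)'; unlisted tags are 'unknown'."""
    name = TIFF_TAGS.get(tag, "unknown")
    return f"{name} = {tag} ({tag:X})"
```

Readers are expected to skip tags they do not recognize rather
than reject the file, which is why the lookup falls back to
"unknown" instead of raising an error.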

Appendix F:  Data Compression - Scheme 5 - LZW Compression

Abstract

This document describes an adaptive compression scheme for raster
images.

Reference

Terry  A.   Welch,  "A   Technique  for   High  Performance  Data
Compression",  IEEE   Computer,  vol.   17  no.  6  (June  1984).
Describes the  basic Lempel-Ziv  & Welch  (LZW) algorithm.    The
author's goal  in the  article is  to describe  a  hardware-based
compressor that could be built into a disk controller or database
engine, and  used on  all types  of data.   There  is no specific
discussion of  raster images.    We  intend  to  give  sufficient
information in  this Appendix so that the article is not required
reading.

Requirements

A compression  scheme with  the following  characteristics should
work well in a desktop publishing environment:

o    Must work well for images of any bit depth, including images
deeper than 8 bits per sample.
o    Must be effective:  an average compression ratio of at least
2:1.  And it must have a reasonable worst-case behavior, in case
something really strange is thrown at it.
o    Should  not  depend  on  small  variations  between  pixels.
Palette color  images tend  to contain  abrupt changes  in  index
values, due to common patterning and dithering techniques.  These
abrupt changes  do tend to be repetitive, however, and the scheme
should make use of this fact.
o    For images  generated by  paint programs,  the scheme should
not depend on a particular pattern width.  8x8 pixel patterns are
common now, but we should not assume that this situation will not
change.
o    Must be  fast.   It should  not take  more than 5 seconds to
decompress a  100K byte  grayscale image on a 68020- or 386-based
computer.   Compression can  be slower,  but probably not by more
than a factor of 2 or 3.
o    The level  of implementation  complexity must be reasonable.
We would like something that can be implemented in no more than a
couple of  weeks  by  a competent  software  engineer  with  some
experience  in   image  processing.     The   compiled  code  for
compression and  decompression combined  should be  no more  than
about 10K.
o    Does not require floating point software or hardware.

The following  sections describe  an algorithm based on the "LZW"
(Lempel-Ziv & Welch) technique that meets the above requirements.
In addition to meeting our requirements, LZW has the following
characteristics:

o    LZW is fully reversible.  All information is preserved.  But
if noise  or information  is removed  from an  image, perhaps  by
smoothing or  zeroing some  low-order bitplanes,  LZW  compresses
images to a smaller size.  Thus, 5-bit, 6-bit, or 7-bit data
masquerading as  8-bit data  compresses better  than  true  8-bit
data. Smooth  images also  compress better than noisy images, and
simple images compress better than complex images.
o    On a 68020- or 386-based computer, LZW software can be
written to  compress at  between 30K  and 80K  bytes per  second,
depending on image characteristics.  LZW decompression speeds are
typically about 50K bytes per second.
o    LZW works  well on  bilevel images,  too.   It always  beats
PackBits,  and   generally  ties   CCITT  1D  (Modified  Huffman)
compression, on our test images.  Tying CCITT 1D is impressive in
that LZW  seems to be considerably faster than CCITT 1D, at least
in our implementation.
o    Our implementation is written in C, and compiles to about 2K
bytes of object code each for the compressor and decompressor.
o    One of  the nice  things about  LZW is that it is used quite
widely in  other applications  such as  archival programs, and is
therefore more of a known quantity.

The Algorithm

Each strip  is compressed  independently.   We strongly recommend
that RowsPerStrip  be chosen  such that each strip contains about
8K bytes  before compression.   We  want to keep the strips small
enough so  that the  compressed and  uncompressed versions of the
strip can  be kept entirely in memory even on small machines, but
large enough to maintain nearly optimal compression ratios.

The LZW  algorithm is  based on  a translation  table, or  string
table, that  maps strings  of input  characters into  codes.  The
TIFF implementation  uses variable-length  codes, with  a maximum
code length of 12 bits.  This string table is different for every
strip, and,  remarkably, does  not need to be kept around for the
decompressor.     The  trick   is  to   make   the   decompressor
automatically build  the same  table as is built when compressing
the data.   We  use a  C-like pseudocode  to describe  the coding
scheme:

     InitializeStringTable();
     WriteCode(ClearCode);
     Omega = the empty string;
     for each character in the strip {
          K = GetNextCharacter();
          if Omega+K is in the string table {
               Omega = Omega+K;  /* string concatenation */
          } else {
               WriteCode (CodeFromString(Omega));
               AddTableEntry(Omega+K);
               Omega = K;
          }
     } /* end of for loop */
     WriteCode (CodeFromString(Omega));
     WriteCode (EndOfInformation);

That's  it.    The  scheme  is  simple,  although  it  is  fairly
challenging  to  implement  efficiently.    But  we  need  a  few
explanations before we go on to decompression.

The  "characters"   that  make  up  the  LZW  strings  are  bytes
containing TIFF  uncompressed (Compression=1)  image data, in our
implementation.   For example,  if BitsPerSample is 4, each 8-bit
LZW character will contain two 4-bit pixels.  If BitsPerSample is
16, each 16-bit pixel will span two 8-bit LZW characters.

(It is  also possible to implement a version of LZW where the LZW
character depth equals BitsPerSample, as was described by Draft 2
of Revision  5.0.   But  there  is  a  major  problem  with  this
approach.   If BitsPerSample  is greater  than 11, we can not use
12-bit-maximum  codes,   so  that  the  resulting  LZW  table  is
unacceptably large.   Fortunately,  due to the adaptive nature of
LZW, we  do not  pay a  significant compression ratio penalty for
combining several  pixels into  one byte before compressing.  For
example, our  4-bit sample  images  compressed  about  3  percent
worse, and  our 1-bit  images compressed  about 5 percent better.
And it  is easier to write an LZW compressor that always uses the
same character  depth than  it is  to write  one which can handle
varying depths.)

We can  now describe  some of the routine and variable references
in our pseudocode:

InitializeStringTable() initializes  the string  table to contain
all possible  single-character strings.   There  are 256 of them,
numbered 0 through 255, since our characters are bytes.

WriteCode() writes  a code  to the output stream.  The first code
written is a Clear code, which is defined to be code #256.

Omega is our "prefix string."

GetNextCharacter() retrieves the next character value from the
input stream.  This will be a number between 0 and 255, since our
characters are bytes.

The "+" signs indicate string concatenation.

AddTableEntry() adds a table entry.  (InitializeStringTable() has
already put  256 entries  in our table.  Each entry consists of a
single-character string, and its associated code value, which is,
in our  application, identical to the character itself.  That is,
the 0th  entry in  our table  consists of  the string  <0>,  with
corresponding code  value of  <0>, the  1st entry  in  the  table
consists of the string <1>, with corresponding code value of <1>,
..., and  the 255th  entry in  our table  consists of  the string
<255>, with  corresponding code  value of  <255>.)   So the first
entry that  we add  to our  string table will be at position 256,
right?   Well, not  quite, since  we will reserve code #256 for a
special   "Clear"   code,   and   code   #257   for   a   special
"EndOfInformation" code  that we will write out at the end of the
strip.  So the first multiple-character entry added to the string
table will be at position 258.

Let's try  an example.   Suppose  we have  input data  that looks
like:

Pixel 0:  <7>
Pixel 1:  <7>
Pixel 2:  <7>
Pixel 3:  <8>
Pixel 4:  <8>
Pixel 5:  <7>
Pixel 6:  <7>
Pixel 7:  <6>
Pixel 8:  <6>

First, we read Pixel 0 into K.  OmegaK is then simply <7>, since Omega is
the empty string at this point.  Is the string <7> already in the
string table?  Of course, since all single character strings were
put in the table by InitializeStringTable().  So set Omega equal to
<7>, and go to the top of the loop.

Read Pixel 1 into K.  Does OmegaK (<7><7>) exist in the string table?
No, so we get to do some real work.  We write the code associated
with Omega to output (write <7> to output), and add OmegaK (<7><7>) to
the table as entry 258.   Store K (<7>) into Omega.    Note  that
although we have added the string consisting of Pixel 0 and Pixel
1 to  the table, we "re-use" Pixel 1 as the beginning of the next
string.

Back at the top of the loop.  We read Pixel 2 into K.  Does OmegaK
(<7><7>) exist  in the  string table?   Yes,  the entry  we  just
added, entry 258, contains exactly <7><7>.  So we just add K onto
the end of Omega, so that Omega is now <7><7>.

Back at the top of the loop.  We read Pixel 3 into K.  Does OmegaK
(<7><7><8>) exist  in the  string table?   No,  so write the code
associated with Omega (<258>) to output, and add OmegaK to the table as
entry 259.  Store K (<8>) into Omega.

Back at the top of the loop.  We read Pixel 4 into K.  Does OmegaK
(<8><8>) exist  in the  string table?   No,  so  write  the  code
associated with Omega (<8>) to output, and add OmegaK to the table as
entry 260.  Store K (<8>) into Omega.

Continuing, we get the following results:

     After reading:   We write to output:   And add table entry:
     Pixel 0
     Pixel 1          <7>                   258: <7><7>
     Pixel 2
     Pixel 3          <258>                 259: <7><7><8>
     Pixel 4          <8>                   260: <8><8>
     Pixel 5          <8>                   261: <8><7>
     Pixel 6
     Pixel 7          <258>                 262: <7><7><6>
     Pixel 8          <6>                   263: <6><6>
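
The worked example can be reproduced with a short Python sketch
of the coding loop.  This is an illustration only: the name
lzw_codes is our own, the table is a plain dictionary rather than
an efficient structure, and the sketch does not handle a full
string table, so it is suitable only for short inputs.  It
returns the codes as integers, without packing them into bits.

```python
CLEAR_CODE = 256
EOI_CODE = 257


def lzw_codes(strip: bytes):
    """Return (codes written, table entries added) for one strip."""
    # InitializeStringTable(): all single-character strings.
    table = {bytes([i]): i for i in range(256)}
    next_code = 258                  # 256 and 257 are reserved
    codes = [CLEAR_CODE]
    added = []
    omega = b""                      # the prefix string
    for byte in strip:
        k = bytes([byte])
        if omega + k in table:
            omega = omega + k
        else:
            codes.append(table[omega])
            table[omega + k] = next_code
            added.append((next_code, omega + k))
            next_code += 1
            omega = k
    if omega:
        codes.append(table[omega])   # flush the final prefix string
    codes.append(EOI_CODE)
    return codes, added


codes, added = lzw_codes(bytes([7, 7, 7, 8, 8, 7, 7, 6, 6]))
print(codes)    # [256, 7, 258, 8, 8, 258, 6, 6, 257]
```

The printed list starts with the Clear code and ends with the
code for the final prefix string followed by EndOfInformation.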

WriteCode() also  requires some  explanation.   The  output  code
stream,  <7><258><8><8><258><6>...  in  our  example,  should  be
written using as few bits as possible.  When we are just starting
out, we  can use  9-bit codes, since our new string table entries
are greater  than 255  but less  than 512.  But when we add table
entry 512,  we must  switch to 10-bit codes.  Likewise, we switch
to 11-bit  codes at  1024, and  12-bit codes  at 2048.   We  will
somewhat arbitrarily limit ourselves to 12-bit codes, so that our
table can  have at most 4096 entries.  If we push it any farther,
tables tend to get too large.

What happens  if we run out of room in our string table?  This is
where the afore-mentioned Clear code comes in.  As soon as we use
entry 4094, we write out a (12-bit) Clear code.   (If we wait any
longer to  write the  Clear code,  the decompressor  might try to
interpret the  Clear code  as a 13-bit code.)  At this point, the
compressor re-initializes the string table and starts writing out
9-bit codes again.

Note that whenever you write a code and add a table entry, Omega is
not left  empty.   It contains exactly one character.  Be careful
not to  lose it  when you  write an end-of-table Clear code.  You
can either write it out as a 12-bit code before writing the Clear
code, in  which case  you will  want to  do it right after adding
table entry  4093, or  after the  clear code  as  a  9-bit  code.
Decompression gives the same result in either case.

To make  things a  little simpler  for the  decompressor, we will
require that  each strip  begins with a Clear code, and ends with
an EndOfInformation code.

Every LZW-compressed  strip must  begin on  a byte  boundary.  It
need not  begin on  a word  boundary.   LZW compression codes are
stored into  bytes in  high-to-low-order fashion, i.e., FillOrder
is assumed  to be  1.  The compressed codes are written as bytes,
not  words,  so  that  the  compressed  data  will  be  identical
regardless of whether it is an "II" or "MM" file.
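
The byte-oriented, high-to-low-order packing can be sketched in
Python as follows.  This is a minimal illustration under our own
assumptions: the name pack_codes is hypothetical, and the caller
supplies the width of every code, so the sketch does not itself
decide when to move from 9-bit to longer codes.

```python
def pack_codes(pairs) -> bytes:
    """Pack (code, width) pairs into bytes, most significant bit first."""
    out = bytearray()
    acc = 0          # bits waiting to be written
    nbits = 0        # number of valid bits in acc
    for code, width in pairs:
        acc = (acc << width) | code
        nbits += width
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1      # keep only the unwritten bits
    if nbits:
        # pad the final partial byte with zero bits
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)
```

Because every strip begins with a 9-bit Clear code (256), the
first byte of a strip packed this way is always 80 (hex).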

Note that  the LZW string table is a continuously updated history
of the  strings that  have been encountered in the data.  It thus
reflects the characteristics of the data, providing a high degree
of adaptability.

LZW Decoding

The procedure for decompression is a little more complicated, but
still not too bad:

     while ((Code = GetNextCode()) != EoiCode) {
          if (Code == ClearCode) {
               InitializeTable();
               Code = GetNextCode();
               if (Code == EoiCode)
                    break;
               WriteString(StringFromCode(Code));
               OldCode = Code;
          }  /* end of ClearCode case */

          else {
               if (IsInTable(Code)) {
                    WriteString(StringFromCode(Code));
                    AddStringToTable(StringFromCode(OldCode)+
                         FirstChar(StringFromCode(Code)));
                    OldCode = Code;
               } else {
                    OutString = StringFromCode(OldCode) +
                         FirstChar(StringFromCode(OldCode));
                    WriteString(OutString);
                    AddStringToTable(OutString);
                    OldCode = Code;
               }
          } /* end of not-ClearCode case */
     } /* end of while loop */

The function GetNextCode() retrieves the next code from the
LZW-coded data.  It must keep track of bit boundaries.  It knows
that the first code that it gets will be a 9-bit code.  We add a
table entry each time we get a code, so GetNextCode() must switch
over to 10-bit codes as soon as string #510 is stored into the
table.  Similarly, the switch is made to 11-bit codes after #1022
and to 12-bit codes after #2046.

The function  StringFromCode() gets  the string associated with a
particular code from the string table.

The function AddStringToTable() adds a string to the string
table.  The "+" sign joining the two parts of the argument to
AddStringToTable indicates string concatenation.

WriteString() adds a string to the output stream.
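
The decoding procedure can be sketched in Python as shown below.
This is an illustration under our own assumptions, not our
production code: the name lzw_decode_codes is hypothetical, and
the sketch takes a list of integer codes rather than packed bits,
so the work done by GetNextCode() is left out.  It assumes, as
required, that the codes begin with a Clear code.

```python
CLEAR_CODE = 256
EOI_CODE = 257


def lzw_decode_codes(codes) -> bytes:
    """Rebuild the original bytes from a sequence of LZW codes."""
    out = bytearray()
    table = []
    old = None                       # OldCode; None right after a Clear
    for code in codes:
        if code == EOI_CODE:
            break
        if code == CLEAR_CODE:
            # 256 single-byte strings plus two reserved slots
            table = [bytes([i]) for i in range(256)] + [b"", b""]
            old = None
            continue
        if old is None:
            out += table[code]       # first code after a Clear
            old = code
            continue
        if code < len(table):
            s = table[code]
            table.append(table[old] + s[:1])
        else:
            # code not yet in the table: it must be OldCode's string
            # followed by that string's own first character
            s = table[old] + table[old][:1]
            table.append(s)
        out += s
        old = code
    return bytes(out)
```

Decoding the codes 256, 7, 258, 257 exercises the "not in table"
branch: code 258 arrives before it has been added, and the result
is the three bytes 7, 7, 7.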

When SamplesPerPixel Is Greater Than 1

We have so far described the compression scheme as if
SamplesPerPixel were always 1, as will be the case with
palette color and grayscale images.  But what do we do with RGB
image data?

Tests on  our sample  images indicate  that the  LZW  compression
ratio    is    nearly    identical    regardless    of    whether
PlanarConfiguration=1 or  PlanarConfiguration=2, for  RGB images.
So use  whichever configuration  you prefer,  and simply compress
the bytes in the strip.

It is worth cautioning that compression ratios on our test RGB
images were disappointingly low: somewhere between 1.1 to 1 and
1.5 to 1, depending on the image.  Vendors are urged to do what
they can to remove as much noise from their images as possible.
Preliminary tests  indicate that significantly better compression
ratios are  possible with  less noisy  images.  Even something as
simple as  zeroing out one or two least-significant bitplanes may
be  quite   effective,  with   little  or  no  perceptible  image
degradation.
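
Zeroing the least-significant bitplanes of 8-bit samples amounts
to a simple mask, as in the following Python sketch.  The
function name zero_low_bitplanes is our own, and the sketch
assumes one 8-bit sample per byte; whether the resulting loss is
acceptable depends on the image.

```python
def zero_low_bitplanes(samples: bytes, planes: int = 2) -> bytes:
    """Clear the `planes` least-significant bits of each 8-bit sample."""
    mask = (0xFF << planes) & 0xFF   # planes=2 gives the mask FC (hex)
    return bytes(b & mask for b in samples)
```

For example, with two bitplanes zeroed the samples FF, 81, 03
(hex) become FC, 80, 00.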

Implementation

The exact  structure of  the string  table and the method used to
determine if  a string  is already  in the table are probably the
most significant design decisions in the implementation of an LZW
compressor and  decompressor.   Hashing has  been suggested  as a
useful technique for the compressor.  We have chosen a tree based
approach, with  good results.   The decompressor is actually more
straightforward,  as   well  as   faster,  since   no  search  is
involved - strings can be accessed directly by code value.

Performance

Many  people   do  not   realize  that  the  performance  of  any
compression scheme  depends greatly  on the type of data to which
it is  applied.   A scheme that works well on one data set may do
poorly on the next.

But since  we do  not want  to burden  the world  with  too  many
compression schemes, an adaptive scheme such as LZW that performs
quite well  on a wide range of images is very desirable.  LZW may
not always  give optimal  compression ratios,  but  its  adaptive
nature and relative simplicity seem to make it a good choice.

Experiments thus  far indicate  that we  can  expect  compression
ratios of  between 1.5  and 3.0  to 1  from LZW,  with no loss of
data, on  continuous tone  grayscale scanned  images.  If we zero
the least  significant one or two bitplanes of 8-bit data, higher
ratios can be achieved.  These bitplanes often consist chiefly of
noise, in  which case  little or no loss in image quality will be
perceived.   Palette color  images created  in  a  paint  program
generally compress  much  better  than  continuous  tone  scanned
images, since paint images tend to be more repetitive.  It is not
unusual to  achieve compression  ratios of 10 to 1 or better when
using LZW on palette color paint images.

By way  of comparison, PackBits, used in TIFF for black and white
bilevel images, does not do well on color paint images, much less
continuous tone  grayscale and  color images.  1.2 to 1 seemed to
be about average for 4-bit images, and 8-bit images are worse.

It has  been suggested that the CCITT 1D scheme could be used for
continuous tone  images, by compressing each bitplane separately.
No doubt  some  compression  could  be  achieved,  but  it  seems
unlikely that  a scheme  based on a fixed table that is optimized
for short  black runs  separated by  longer white runs would be a
very good choice on any of the bitplanes.  It would do quite well
on the  high-order bitplanes  (but so would a simpler scheme like
PackBits), and  would do quite poorly on the low-order bitplanes.
We believe  that the  compression ratios  would generally  not be
very impressive, and the process would in addition be quite slow.
Splitting  the  pixels  into  bitplanes  and  putting  them  back
together is  somewhat expensive,  and the  coding is  also fairly
slow when implemented in software.

Another approach that has been suggested uses a 2D differencing
step followed by coding the differences using a fixed table of
variable-length codes.  This type of scheme works
quite well  on many  8-bit  grayscale  images,  and  is  probably
simpler  to  implement  than  LZW.    But  it  has  a  number  of
disadvantages when  used on  a wide variety of images.  First, it
is not  adaptive.   This makes  a big difference when compressing
data such as 8-bit images that have been "sharpened" using one of
the standard  techniques.  Such images tend to get larger instead
of smaller  when  compressed.    Another  disadvantage  of  these
schemes is  that they  do not  do well  with a  wide range of bit
depths.   The built-in  code table  has to  be  optimized  for  a
particular bit depth in order to be effective.

Finally,  we   should  mention   "lossy"   compression   schemes.
Extensive research  has been  done in  the area of lossy, or non-
information-preserving  image   compression.    These  techniques
generally yield  much  higher  compression  ratios  than  can  be
achieved  by   fully-reversible,   information-preserving   image
compression  techniques   such  as   PackBits  and   LZW.    Some
disadvantages:     many  of   the   lossy   techniques   are   so
computationally expensive  that hardware  assists  are  required.
Others  are  so  complicated  that  most  microcomputer  software
vendors could  not afford either the expense of implementation or
the increase  in  application  object  code  size.    Yet  others
sacrifice enough  image  quality  to  make  them  unsuitable  for
publishing use.

In spite  of these  difficulties, we  believe that there will one
day be  a standardized  lossy compression  scheme for  full color
images  that  will  be  usable  for  publishing  applications  on
microcomputers.   An International  Standards Organization group,
ISO/IEC/JTC1/SC2/WG8, in cooperation with CCITT Study Group VIII,
is hard at work on a scheme that might be appropriate.  We expect
that a  future revision of TIFF will incorporate this scheme once
it is  finalized, if it turns out to satisfy the needs of desktop
publishers and  others in the microcomputer community.  This will
augment, not replace, LZW as an approved TIFF compression scheme.
LZW will  very likely  remain the  scheme of  choice for  Palette
color images,  and perhaps  4-bit grayscale  images, and may well
overtake CCITT 1D and PackBits for bilevel images.

Future LZW Extensions

Some images  compress better  using LZW  coding if they are first
subjected to  a process  wherein each  pixel value is replaced by
the  difference  between  the  pixel  and  the  preceding  pixel.
Performing this  differencing in two dimensions helps some images
even more.  However, many images do not compress better with this
extra preprocessing,  and for a significant number of images, the
compression ratio is actually worse.  We are therefore not making
differencing an integral part of the TIFF LZW compression scheme.
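
For readers unfamiliar with the idea, one-dimensional
differencing of a row of 8-bit samples can be sketched in Python
as follows.  This only illustrates the preprocessing step
described above, which is not part of the TIFF LZW scheme.  The
function names are our own, and the sketch assumes one sample per
byte with differences taken modulo 256 so that the step is
exactly reversible.

```python
def difference_row(row: bytes) -> bytes:
    """Replace each sample after the first by its difference from the
    preceding sample, modulo 256."""
    out = bytearray(row[:1])
    for prev, cur in zip(row, row[1:]):
        out.append((cur - prev) & 0xFF)
    return bytes(out)


def undifference_row(diff: bytes) -> bytes:
    """Invert difference_row() by accumulating the differences."""
    out = bytearray()
    acc = 0
    for d in diff:
        acc = (acc + d) & 0xFF
        out.append(acc)
    return bytes(out)
```

A smooth row such as 10, 12, 12, 11 becomes 10, 2, 0, 255, which
tends to contain more repeated small values for LZW to exploit.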

However,  it   is  possible   that  a   "prediction"  stage  like
differencing may  exist which  is effective over a broad range of
images.  If such a scheme is found, it may be incorporated in the
next major TIFF revision.  If so, a new value will be defined for
the new  "Predictor" TIFF  tag.  Therefore, all TIFF readers that
read LZW files must pay attention to the Predictor tag.  If it is
1, which  is the  default case,  LZW  decompression  may  proceed
safely.   If it  is not  1, and the reader does not recognize the
specified prediction scheme, the reader should give up.

Acknowledgements

The original  LZW reference  has already  been given.  The use of
ClearCode as a technique to handle overflow was borrowed from the
compression scheme used by the Graphics Interchange Format (GIF),
a small-color-paint-image-file  format used  by  CompuServe  that
also is an adaptation of the LZW technique.  Joff Morgan and Eric
Robinson of  Aldus were  each instrumental  in their  own way  in
getting LZW off the ground.

Appendix G: TIFF Classes

Rationale

TIFF was  designed to  make  life  easier  for  scanner  vendors,
desktop publishing  software developers,  and users  of these two
classes of products, by reducing the proliferation of proprietary
scanned  image   formats.    It  has  succeeded  far  beyond  our
expectations in  this respect.   But  we had  expected that  TIFF
would be of interest to only a dozen or so scanner vendors (there
weren't any  more than  that in  1985), and  another dozen  or so
desktop publishing  software vendors.   This  turned out  to be a
gross underestimate.   The only problem with this sort of success
is that  TIFF was  designed to  be powerful  and flexible, at the
expense of  simplicity.   It takes  a fair  amount of  effort  to
handle all  the options  currently defined  in this specification
(probably no  application does  a  complete  job),  and  that  is
currently the  only way  you can be sure that you will be able to
import any  TIFF image,  since there are so many image-generating
applications out there now.

So here  is an attempt to channel some of the flexibility of TIFF
into more  restrictive paths,  using what  we have learned in the
past two  years about which options are the most useful.  Such an
undertaking is  of course filled with fairly arbitrary decisions.
But the  result is  that writers can more easily write files that
will be  successfully read by a wide variety of applications, and
readers can know when they can stop adding TIFF features.

The price  we pay for TIFF Classes is some loss in the ability to
adapt.   Once we  establish the requirements for a TIFF Class, we
can never add new requirements, since old software would not know
about these  new requirements.  (The best we can do at that point
is establish new TIFF Classes.  But the problem with that is that
we could quickly have too many TIFF Classes.)  So we must believe
that we know what we are doing in establishing these Classes.  If
we do not, any mistakes will be expensive.

Overview

Four TIFF Classes have been defined:

o    Class B for bilevel (1-bit) images
o    Class G for grayscale images
o    Class P for palette color images
o    Class R for RGB full color images

To save  time and  space, we will usually say "TIFF B", "TIFF G",
"TIFF P,"  and "TIFF R."  If we are talking about all four types,
we may write "TIFF X."

(Note to  fax people:   if  you are  interested in  a fax  TIFF F
Class, please  get together  and decide  what should  be in  TIFF
Class F  files.  Let us know if we can help in any way.  When you
have decided,  send us  your results,  so that we can include the
information here.)

Core Requirements

This section  describes requirements  that are common to all TIFF
Class X images.

General Requirements

The following  are required  characteristics of  all TIFF Class X
files.

Where there are options, TIFF X writers can do whichever one they
want, though  we will  often recommend  a particular  choice, but
TIFF X  readers must  be able  to handle all of them.  Please pay
close attention  to the  recommendations.  It is possible that at
some point  in the future, new and even-simpler TIFF classes will
be defined that include only recommended features.

You will  need to  read at  least the first three sections of the
main specification  in order  to fully  understand the  following
discussion.

Defaults.  TIFF X writers may, but are not required to, write out
a field that has a default value, if the default value is the one
desired.  TIFF X readers must be prepared to handle either
situation.

Other fields.   TIFF  X readers  must be  prepared  to  encounter
fields other  than the  required fields  in TIFF X files.  TIFF X
writers  are  allowed  to  write  fields  such  as  Make,  Model,
DateTime, and so on, and TIFF X readers can certainly make use of
such fields  if they  exist.   TIFF X  readers must not, however,
refuse to read the file if such optional fields do not exist.

"MM" and  "II" byte order.  TIFF X readers must be able to handle
both byte  orders.    TIFF  writers  can  do  whichever  is  most
convenient  or   efficient.     Images  are   crossing  the   IBM
PC/Macintosh boundary  (and others  as well)  with a surprisingly
high frequency.   We could force writers to all use the same byte
order, but  preliminary evidence  indicates that  this will cause
problems  when   we  start   seeing  greater-than-8-bit   images.
Reversing bytes  while scanning could well slow down the scanning
process enough  to cause  the scanning  mechanism to  stop, which
tends to create image quality problems.
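
A reader can handle both byte orders by inspecting the first two
bytes of the file header, as in the Python sketch below.  The
header layout used here (byte order mark, the version number 42,
then the offset of the first IFD) is the one defined in the main
body of the specification; the function name read_header is our
own assumption.

```python
import struct


def read_header(data: bytes):
    """Return (struct byte-order prefix, offset of the first IFD)."""
    order = data[:2]
    if order == b"II":
        endian = "<"     # least significant byte first
    elif order == b"MM":
        endian = ">"     # most significant byte first
    else:
        raise ValueError("not a TIFF file: bad byte order mark")
    version, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if version != 42:
        raise ValueError("not a TIFF file: bad version number")
    return endian, first_ifd
```

The returned prefix can then be passed to every later
struct.unpack call, so the rest of the reader need not care which
byte order the writer chose.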

Multiple subfiles.   TIFF X readers must be prepared for multiple
images (i.e.,  subfiles) per  TIFF file,  although they  are  not
required to do anything with any image after the first one.  TIFF
X writers  must be  sure to write a long word of 0 after the last
IFD (this is the standard way of signalling that this IFD was the
last one) as indicated in the TIFF structure discussion.

If a  TIFF X  writer writes multiple subfiles, the first one must
be the  full resolution  image.   Subsequent subimages,  such  as
reduced resolution  images and  transparency masks, may be in any
order in  the TIFF  file.   If a reader wants to make use of such
subimages, it will have to scan the IFDs before deciding how to
proceed.
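
Scanning the IFDs can be sketched in Python as follows.  The
sketch relies on the IFD layout given in the TIFF structure
discussion (a 2-byte entry count, 12 bytes per entry, then the
4-byte offset of the next IFD, with 0 marking the last one).  The
function name ifd_offsets and its arguments are our own
assumptions; endian is a struct prefix, "<" for "II" files and
">" for "MM" files.

```python
import struct


def ifd_offsets(data: bytes, endian: str, first_ifd: int):
    """Return the file offsets of all IFDs, in file order."""
    offsets = []
    off = first_ifd
    while off != 0:
        if off in offsets:
            raise ValueError("IFD chain loops back on itself")
        offsets.append(off)
        (count,) = struct.unpack_from(endian + "H", data, off)
        # the next-IFD offset follows the last 12-byte entry
        (off,) = struct.unpack_from(endian + "I", data, off + 2 + 12 * count)
    return offsets
```

A reader that only wants the full resolution image can stop after
the first offset; one that wants a reduced resolution subfile
would examine the fields of each IFD in turn.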

TIFF X  Editors.   Editors, applications  that modify TIFF files,
have a few additional requirements.

TIFF editors  must be  especially careful  about subfiles.   If a
TIFF editor  edits a full-resolution subfile, but does not update
an accompanying  reduced-resolution subfile,  a reader  that uses
the reduced-resolution  subfile for  screen display  will display
the wrong  thing.   So TIFF  editors must  either  create  a  new
reduced-resolution subfile  when  they  alter  a  full-resolution
subfile, or  else they  must simply delete any subfiles that they
aren't prepared to deal with.

A similar  situation arises with the fields themselves.  A TIFF X
editor need  only worry  about the  TIFF X  required fields.   In
particular, it  is unnecessary,  and probably  dangerous, for  an
editor to  copy fields  that it does not understand.  It may have
altered the  file in  a way that is incompatible with the unknown
fields.

Required Fields

NewSubfileType.  LONG.  Recommended but not required.

ImageWidth.   SHORT or  LONG.   (That is, both "SHORT" and "LONG"
TIFF data  types are  allowed, and  must be  handled properly  by
readers.   TIFF writers  can use either.)  TIFF X readers are not
required to  read arbitrarily  large files however.  Some readers
will give  up if the entire image cannot fit in available memory.
(In such cases the reader should inform the user of the nature of
the problem.)   Others  will  probably  not  be  able  to  handle
ImageWidth greater  than 65535.   Recommendation: use LONG, since
resolutions seem to keep going up.

ImageLength.  SHORT or LONG.  Recommendation: use  LONG.

RowsPerStrip.  SHORT or LONG.  Readers must be able to handle any
value between  1 and  2**32-1.   However, some readers may try to
read an  entire strip  into memory  at one  time, so  that if the
entire image is one strip, the application may run out of memory.
Recommendation 1:   Set  RowsPerStrip such  that the size of each
strip is  about 8K  bytes.   Do this  even for uncompressed data,
since it  is easy  for a  writer and  makes  things  simpler  for
readers.  (Note:  extremely wide, high-resolution images may have
rows larger  than 8K  bytes; in this case, RowsPerStrip should be
1,  and  the  strip  will  just  have  to  be  larger  than  8K.)
Recommendation 2: use LONG.
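
As an illustration of Recommendation 1, a writer might choose
RowsPerStrip along the following lines.  This is only a sketch:
the function names and the 8192-byte target constant are our own
assumptions, not part of the specification, and the row size is
computed for uncompressed data with rows padded to a byte
boundary.

```c
#include <stdint.h>

/* Target strip size suggested by the text: about 8K bytes.
   (The exact constant is an assumption of this sketch.) */
#define TARGET_STRIP_BYTES 8192UL

/* Bytes in one uncompressed row, padded to a byte boundary. */
uint32_t bytes_per_row(uint32_t width, uint32_t bits_per_sample,
                       uint32_t samples_per_pixel)
{
    uint64_t bits = (uint64_t)width * bits_per_sample * samples_per_pixel;
    return (uint32_t)((bits + 7) / 8);
}

/* Rows per strip so that each strip holds at most about 8K bytes,
   but never fewer than one row: a very wide row simply produces
   a strip larger than 8K. */
uint32_t choose_rows_per_strip(uint32_t width, uint32_t bits_per_sample,
                               uint32_t samples_per_pixel)
{
    uint32_t row_bytes = bytes_per_row(width, bits_per_sample,
                                       samples_per_pixel);
    uint32_t rows;

    if (row_bytes == 0)
        return 1;
    rows = (uint32_t)(TARGET_STRIP_BYTES / row_bytes);
    return rows > 0 ? rows : 1;
}
```

For a bilevel image 2000 pixels wide, each row is 250 bytes and
the sketch yields 32 rows per strip.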

StripOffsets.   SHORT or  LONG.  As explained in the main part of
the  specification,   the  number   of  StripOffsets  depends  on
RowsPerStrip and  ImageLength.  Recommendation:  always use LONG.
(LONG must, of course, be used if the file is more than 64K bytes
in length.)
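
The dependence of the number of strips on RowsPerStrip and
ImageLength can be sketched as follows.  The function names are
hypothetical; the arithmetic is simply ImageLength divided by
RowsPerStrip, rounded up, with the last strip allowed to be
short.

```c
#include <stdint.h>

/* Number of strips: ImageLength / RowsPerStrip, rounded up. */
uint32_t strips_in_image(uint32_t image_length, uint32_t rows_per_strip)
{
    if (rows_per_strip == 0)
        return 0;                         /* invalid field value */
    if (rows_per_strip >= image_length)
        return image_length ? 1 : 0;      /* whole image in one strip */
    return (image_length - 1) / rows_per_strip + 1;
}

/* Rows stored in strip number strip_index (counting from 0).
   Returns 0 if there is no such strip. */
uint32_t rows_in_strip(uint32_t image_length, uint32_t rows_per_strip,
                       uint32_t strip_index)
{
    uint64_t first_row = (uint64_t)strip_index * rows_per_strip;
    uint64_t remaining;

    if (first_row >= image_length)
        return 0;
    remaining = image_length - first_row;
    return remaining < rows_per_strip ? (uint32_t)remaining
                                      : rows_per_strip;
}
```

With ImageLength = 3000 and RowsPerStrip = 16 this gives 188
strips, the last of which holds only 8 rows.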

StripByteCounts.   SHORT or  LONG.   Many existing TIFF images do
not contain StripByteCounts, because, in a strict sense, they are
unnecessary.   It is  possible to  write an efficient TIFF reader
that does  not need  to know  in advance  the  exact  size  of  a
compressed strip.   But  it does  make things  considerably  more
complicated, so  we will require StripByteCounts in TIFF X files.
Recommendation:   use SHORT,  since strips are not supposed to be
very large.

XResolution, YResolution.   RATIONAL.   Note  that the  X  and  Y
resolutions may  be unequal.   A  TIFF X  reader must  be able to
handle this  case.   TIFF X pixel-editors will typically not care
about the  resolution,  but  applications  such  as  page  layout
programs will.

ResolutionUnit.   SHORT.   TIFF X  readers must  be  prepared  to
handle all three values for ResolutionUnit.

TIFF Class B - Bilevel

Required (in addition to the above core requirements)

The following fields and values are required for TIFF B files, in
addition to  the fields  required for  all  TIFF  X  images  (see
above).

SamplesPerPixel =  1.   SHORT.   (Since this  is the default, the
field need  not be  present.   The same  thing  holds  for  other
required TIFF X fields that have defaults.)

BitsPerSample = 1.  SHORT.

Compression = 1, 2 (CCITT 1D), or 32773 (PackBits).  SHORT.  TIFF
B readers  must handle all three.  Recommendation:  use PackBits.
It  is  simple,  effective,  fast,  and  has  a  good  worst-case
behavior.    CCITT  1D  is  definitely  more  effective  in  some
situations, such as scanning a page of body text, but is tough to
implement and  test, fairly  slow,  and  has  a  poor  worst-case
behavior.   Besides, scanning a page of 12 point text is not very
useful for  publishing applications,  unless the  image  data  is
turned into  ASCII text  via OCR  software, which  is outside the
scope of TIFF.

PhotometricInterpretation = 0 or 1.  SHORT.

A Sample TIFF B Image

Offset (hex)   Name   Value (mostly hex)

Header:
0000 Byte Order     4D4D
0002 Version   002A
0004 1st IFD pointer     00000014

IFD:
0014 Entry Count    000D
0016 NewSubfileType 00FE 0004 00000001  00000000
0022 ImageWidth     0100 0004 00000001  000007D0
002E ImageLength    0101 0004 00000001  00000BB8
003A Compression    0103 0003 00000001  8005 0000
0046 PhotometricInterpretation     0106 0003 00000001  0001 0000
0052 StripOffsets   0111 0004 000000BC  000000B6
005E RowsPerStrip   0116 0004 00000001  00000010
006A StripByteCounts     0117 0003 000000BC  000003A6
0076 XResolution    011A 0005 00000001  00000696
0082 YResolution    011B 0005 00000001  0000069E
008E Software  0131 0002 0000000E  000006A6
009A DateTime  0132 0002 00000014  000006B6
00A6 Next IFD pointer    00000000

Fields pointed to by the tags:
00B6 StripOffsets   Offset0, Offset1, ... Offset187
03A6 StripByteCounts     Count0, Count1, ... Count187
0696 XResolution    0000012C 00000001
069E YResolution    0000012C 00000001
06A6 Software  "PageMaker 3.0"
06B6 DateTime  "1988:02:18 13:59:59"

Image Data:
00000700  Compressed data for strip 10
xxxxxxxx  Compressed data for strip 179
xxxxxxxx  Compressed data for strip 53
xxxxxxxx  Compressed data for strip 160
.
.
.

End of example

Comments on the TIFF B example

1.   The IFD  in our example starts at position hex 14.  It could
have been  anywhere in  the file  as long as the position is even
and greater  than or equal to 8, since the TIFF header is 8 bytes
long and must be the first thing in a TIFF file.

2.   With 16 rows per strip, we have 188 strips in all.

3.   The example  uses a  number  of  optional  fields,  such  as
DateTime.   TIFF X  readers must safely skip over these fields if
they do not want to use the information.  And TIFF X readers must
not require that such fields be present.

4.   Just for  fun, our example has highly fragmented image data;
the strips  of our  image are  not even in sequential order.  The
point is  that strip  offsets must  not be ignored.  Never assume
that strip  N+1 follows  strip N.    Incidentally,  there  is  no
requirement that  the image  data follow  the  IFD  information.
Just follow  the pointers,  whether they be IFD pointers,  field
pointers, or Strip Offsets.
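
To make "follow the pointers" concrete, here is a minimal sketch
of how a reader might decode the 8-byte header and one 12-byte
IFD entry of the example.  The function and type names are our
own inventions for illustration.  The sketch handles only the
case shown in the example where a single SHORT value is stored
left-justified in the 4-byte value field; everything else is
returned as a raw LONG (a value or an offset).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit value; big_endian is nonzero for "MM" files. */
uint16_t tiff_u16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Read a 32-bit value in the file's byte order. */
uint32_t tiff_u32(const uint8_t *p, int big_endian)
{
    return big_endian
        ? ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3]
        : ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
          ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Parse the header.  Returns 0 on success, -1 if it is not valid. */
int tiff_parse_header(const uint8_t *buf, size_t len,
                      int *big_endian, uint32_t *first_ifd)
{
    if (len < 8)
        return -1;
    if (buf[0] == 'M' && buf[1] == 'M')
        *big_endian = 1;
    else if (buf[0] == 'I' && buf[1] == 'I')
        *big_endian = 0;
    else
        return -1;
    if (tiff_u16(buf + 2, *big_endian) != 42)     /* version 002A */
        return -1;
    *first_ifd = tiff_u32(buf + 4, *big_endian);
    /* The IFD must lie past the header, at an even position. */
    if (*first_ifd < 8 || (*first_ifd & 1))
        return -1;
    return 0;
}

typedef struct {
    uint16_t tag, type;
    uint32_t count, value;     /* value, or offset to the values */
} tiff_entry;

/* Decode one 12-byte IFD entry: tag, type, count, value/offset. */
tiff_entry tiff_parse_entry(const uint8_t *p, int big_endian)
{
    tiff_entry e;

    e.tag   = tiff_u16(p, big_endian);
    e.type  = tiff_u16(p + 2, big_endian);
    e.count = tiff_u32(p + 4, big_endian);
    if (e.type == 3 && e.count == 1)   /* one SHORT, left-justified */
        e.value = tiff_u16(p + 8, big_endian);
    else
        e.value = tiff_u32(p + 8, big_endian);
    return e;
}
```

Applied to the example, the header bytes 4D 4D 00 2A 00 00 00 14
give a first IFD position of hex 14, and the Compression entry
decodes to the value 32773 (hex 8005).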

TIFF Class G - Grayscale

Required (in addition to the above core requirements)

SamplesPerPixel = 1.  SHORT.

BitsPerSample =  4,  8.    SHORT.    There  seems  to  be  little
justification for  working with grayscale images shallower than 4
bits, and 5-bit, 6-bit, and 7-bit images can easily be stored as
8-bit images, as long as you can compress the "unused" bit planes
without penalty.  And we can do just that with LZW (Compression =
5.)

Compression = 1 or 5 (LZW).  SHORT.  Recommendation: use 5, since
LZW decompression is turning out to be quite fast.

PhotometricInterpretation = 0 or 1.  SHORT.   Recommendation: use
1, due  to popular  user interfaces  for adjusting brightness and
contrast.

TIFF Class P - Palette Color

Required (in addition to the above core requirements)

SamplesPerPixel = 1.  SHORT.  We use each pixel value as an index
into all three color tables in ColorMap.

BitsPerSample =  1,2,3,4,5,6,7, or 8.  SHORT.  1,2,3,4, and 8 are
probably the  most common,  but as long as we are doing that, the
rest come pretty much for free.

Compression = 1 or 5.  SHORT.

PhotometricInterpretation = 3 (Palette Color).  SHORT.

ColorMap.  SHORT.

Note that  bilevel and  grayscale images  can be  represented  as
special cases  of palette  color images.  As soon as enough major
applications support  palette color  images, we may want to start
getting rid  of  distinctions  between  bilevel,  grayscale,  and
palette color images.
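
The use of a pixel value as an index into all three color tables
can be sketched as below.  The layout assumed here, namely
2**BitsPerSample red entries followed by the same number of green
and then blue entries, is our reading of the ColorMap field
description in the main body of the specification; the type and
function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t r, g, b;          /* 0 = minimum, 65535 = maximum */
} rgb16;

/* Look up one pixel value in a ColorMap held in memory as
   3 * (1 << bits_per_sample) SHORT values. */
rgb16 palette_lookup(const uint16_t *colormap, unsigned bits_per_sample,
                     unsigned index)
{
    unsigned n = 1u << bits_per_sample;   /* entries per table */
    rgb16 c;

    c.r = colormap[index];
    c.g = colormap[n + index];
    c.b = colormap[2 * n + index];
    return c;
}
```

A bilevel image is then just the special case BitsPerSample = 1
with a two-entry table per primary.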

TIFF Class R - RGB Full Color

Required (in addition to the above Core Requirements)

SamplesPerPixel = 3.  SHORT.  One sample each for Red, Green, and
Blue.

BitsPerSample =  8,8,8.   SHORT.  Shallower samples can easily be
stored as 8-bit samples with no penalty if the data is compressed
with LZW.  And evidence to date indicates that images deeper than
8 bits  per sample are not worth the extra work, even in the most
demanding publishing applications.

PlanarConfiguration = 1 or 2.  SHORT.  Recommendation:  use 1.

Compression = 1 or 5.  SHORT.

PhotometricInterpretation = 2 (RGB).  SHORT.

Recommended

Recommended for  TIFF Class  R, but not required, are the new (as
of Revision 5.0) colorimetric information tags.  See Appendix H.

Conformance and User Interface

Applications that  write valid  TIFF X files should include "TIFF
B" and/or  "TIFF G"  and/or "TIFF  P"  and/or  "TIFF R"  in their
product spec  sheets, if  they can  write  the  respective   TIFF
Class X  files.   If your  application writes  all four  of these
types, you  may wish to write it as "TIFF B,G,P,R."  Of course, a
term like  "TIFF B,"  while fine  for  communicating  with  other
vendors, will  not convey much information to a typical user.  In
this case,  a  phrase  such  as  "Standard  TIFF  Black-and-White
Scanned Images" might be better.

The same  terminology guidelines  apply to applications that read
TIFF Class X files.

If your  application reads more kinds of files than it writes, or
vice versa,  it would  be a  good idea  to make that clear to the
buyer.   For example, if your application reads TIFF B and TIFF G
files, but writes only TIFF G files, you should write it that way
in the spec sheet.

Appendix H: Image Colorimetry Information

Chris Sears
210 Lake Street
San Francisco, CA 94118

June 4, 1988
Revised August 8, 1988

I. Introduction

Our goal is to accurately reproduce a color image using different
devices.   Accuracy requires  techniques  of  measurement  and  a
standard  of   comparison.     Different  devices   imply  device
independence.   Colorimetry provides the framework to solve these
problems.  When an image has a complete colorimetric description,
in principle  it  can  be  reproduced  identically  on  different
monitors and using different media, such as offset lithography.

The colorimetry  data is  specified when  the image is created or
changed.   A scanned image has colorimetry data derived from  the
filters and  light sources  of the  scanner and a synthetic image
has colorimetry  data corresponding to the monitor used to create
it or  the monitor model of the rendering environment.  This data
is used  to map  an input  image to  the markings  or colors of a
particular output device.

Section II  describes various  standards organizations  and their
work in color.
Section III describes our motivation for seeking these tags.
Section IV describes our goals of reproduction.
Sections V, VI and VII introduce the colorimetry tags.
Section VIII specifies the tags themselves.
Section IX describes the defaults.
Section X discusses the limitations and some of the other issues.
Section XI provides a few references.

II. Related Standards

TIFF is  a general  standard for describing image data.  It would
be foolish  for us  to change  TIFF in  a way  that did not match
existing industry  and international  standards.   Therefore,  we
have taken  pains to  note in the discussion below the efforts of
various standards organizations and select defaults from the work
of these organizations.

CIE  (Commission Internationale de l'Eclairage)  The basis of the
colorimetry information  is the  CIE 1931  Standard Observer [3].
While other color models could be supported [1] [4], CIE 1931 XYZ
is the  international standard  accepted  across  industries  for
specifying  color   and  CIE  xyY  is  the  chromaticity  diagram
associated with CIE 1931 XYZ tristimulus values.

NTSC (National Television  System Committee)  NTSC is of interest
primarily  for   historical  reasons  and  its  use  in  encoding
television data.   Manufacturing  standards for monitors have for
some time  drifted significantly  from the  1953 NTSC colorimetry
specification.

SMPTE     (Society of  Motion Picture  and Television  Engineers)
Most of  the work  by NTSC  has been  largely subsumed  by SMPTE.
This organization  has a  set of  standards  called  "Recommended
Practices" that  apply to  various technical  aspects of film and
television production [5] [6].

ISO  (International  Standards  Organization)    ISO  has  become
involved in  color standards  through work on a color addendum to
"Office Document Architecture (ODA) and Interchange Format" [7].

ANSI (American  National   Standards  Institute)    ANSI  is  the
American representative to ISO.

III. Motivation

Our motivation  for defining  these tags  stems from our research
and  development  in  color  separation  technology.    With  the
information described here and the RGB pixel data, we have all of
the  information  necessary  for  generating  high-quality  color
separations.  We could supply the colorimetry information outside
of the  image  file.    But  since  TIFF  provides  a  convenient
mechanism for  bundling all  of the  relevant  information  in  a
single place,  tags are  defined to  describe this information in
color TIFF files.

A color  image rendered  with incorrect  colorimetry  information
looks different  from the original.  One of our early test images
has an artifact in it where the image was scanned with one set of
primaries and  color ramps  were  overlaid  on  top  of  it  with
different primaries.  The blue ramp looked purple when we printed
it. Using incorrect gamma tables or white points can also lead to
distorted images.  The best way to avoid these kinds of errors is
to allow  the creator  of an  image  to  supply  the  colorimetry
information along with the RGB values [1] [2].

The purpose  of the  colorimetry data  is to  allow a  projective
transformation from the primaries and white point of the image to
the primaries  and white  point of  the rendering  media.   Gamma
reflects the non-linear transfer gradient of real media.

IV. Colorimetric Color Reproduction

Earlier we  said that given the proper colorimetric data an image
could be  rendered identically  using  two  different  calibrated
devices.   By identical,  we mean  colorimetric reproduction [9].
Specifically, the  chromaticities  match  and  the  luminance  is
scaled to correspond to the luminance range of the output device.
Because of this, we only need the chromaticity coordinates of the
white point  and primaries.   The absolute luminance is arbitrary
and unnecessary.

V. White Point

In TIFF 4.0, the white point was specified as D65.  This appendix
allocates a  separate tag  for describing the white point and D65
is the logical default since it is the SMPTE standard [6].

The white  point is  defined  colorimetrically  in  the  CIE  xyY
chromaticity diagram.   While  it is  rare for monitors to differ
from D65,  scanned images  often  have  different  white  points.
Rendered images  can have  arbitrary white  points.   The graphic
arts use D50 as the standard viewing light source [8].

VI. Primary Chromaticities

In TIFF  4.0, the  primary color  chromaticities matched the NTSC
specification.  With the wide variety of color scanners, monitors
and renderers,  TIFF needs  a mechanism for accurately describing
the chromaticities  of the  primary colors.   We use SMPTE as the
default chromaticity  since conventional  monitors are  closer to
SMPTE and  some monitors  (Conrac 6545)  are manufactured  to the
SMPTE specifications.   We  don't use the NTSC chromaticities and
white point  because present day monitors don't use them and must
be "matrixed" to approximate them.

As an  example, the primary color chromaticities used by the Sony
Trinitron differ  from those  recommended by  SMPTE.  In general,
since  real  monitors  vary  from  the  industry  standards,  the
chromaticities of  primaries are described in the CIE xyY system.
This  allows   a  reproduction   system  to  compensate  for  the
differences.
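
One common way for a reproduction system to use these values is
to build a 3x3 matrix that converts linear RGB to CIE XYZ.  The
sketch below uses the standard textbook construction: each xy
chromaticity is lifted to XYZ with Y = 1, and the three primaries
are scaled so that RGB = (1,1,1) lands on the white point.  It is
not necessarily the method used by any particular product, and
the function names are our own.  The luminance of white is
normalized to 1, since the absolute luminance is arbitrary.

```c
/* Determinant of a 3x3 matrix stored row-major in a[9]. */
double det3(const double a[9])
{
    return a[0] * (a[4] * a[8] - a[5] * a[7])
         - a[1] * (a[3] * a[8] - a[5] * a[6])
         + a[2] * (a[3] * a[7] - a[4] * a[6]);
}

/* Fill m (row-major) so that XYZ = m * RGB for linear RGB.
   prim  = red x, red y, green x, green y, blue x, blue y
   white = white x, white y
   Returns 0 on success, -1 if the chromaticities are degenerate. */
int rgb_to_xyz_matrix(const double prim[6], const double white[2],
                      double m[9])
{
    double p[9], q[9], w[3], s[3], det;
    int i, k;

    if (white[1] == 0.0)
        return -1;
    for (i = 0; i < 3; i++) {          /* column i = primary i, Y = 1 */
        double x = prim[2 * i], y = prim[2 * i + 1];
        if (y == 0.0)
            return -1;
        p[0 + i] = x / y;
        p[3 + i] = 1.0;
        p[6 + i] = (1.0 - x - y) / y;
    }
    w[0] = white[0] / white[1];
    w[1] = 1.0;
    w[2] = (1.0 - white[0] - white[1]) / white[1];

    det = det3(p);
    if (det == 0.0)
        return -1;
    for (i = 0; i < 3; i++) {          /* Cramer's rule: solve p*s = w */
        for (k = 0; k < 9; k++)
            q[k] = p[k];
        q[0 + i] = w[0];
        q[3 + i] = w[1];
        q[6 + i] = w[2];
        s[i] = det3(q) / det;
    }
    for (i = 0; i < 3; i++) {          /* scale each primary column */
        m[0 + i] = p[0 + i] * s[i];
        m[3 + i] = p[3 + i] * s[i];
        m[6 + i] = p[6 + i] * s[i];
    }
    return 0;
}
```

Converting between two devices then amounts to applying the
source matrix followed by the inverse of the destination matrix,
within the limits of the gamuts of the devices.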

VII. Color Response Curves

This tag  defines three  color response curves, one each for red,
green, and blue color information.  The width of each entry is 16
bits, as  implied by  the type  SHORT.   The minimum intensity is
represented by 0 and the maximum by 65535.  For example, black is
represented by  0,0,0 and  white by  65535, 65535,  65535.    The
length of  each curve is 2**BitsPerSample.  A ColorResponseCurves
field for RGB data where each of the samples is 8 bits deep would
have 3*256  entries.   The 256  red  entries  would  come  first,
followed by 256 green entries, followed by 256 blue entries.
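
The ordering of the entries can be expressed as a simple index
calculation.  This is a sketch only; the enum and function names
are hypothetical.

```c
#include <stdint.h>

enum { CURVE_RED = 0, CURVE_GREEN = 1, CURVE_BLUE = 2 };

/* Total number of SHORT entries in the field: three curves of
   2**BitsPerSample entries each. */
unsigned long response_curve_entries(unsigned bits_per_sample)
{
    return 3UL << bits_per_sample;
}

/* Intensity (0..65535) for one sample value of one primary.
   All red entries come first, then green, then blue. */
uint16_t response_lookup(const uint16_t *curves, unsigned bits_per_sample,
                         int primary, unsigned sample)
{
    unsigned len = 1u << bits_per_sample;   /* length of each curve */

    return curves[(unsigned)primary * len + sample];
}
```

For 8-bit samples, response_curve_entries(8) is 768, that is,
3*256 entries.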

The purpose  of the  ColorResponseCurves field  is to  act  as  a
lookup table  mapping sample values to specific intensity values,
so that  an image  created on  one system  can  be  displayed  on
another   with   minimal   loss   of   color   fidelity.      The
ColorResponseCurves field thus describes the "gamma" of an image,
so that  a TIFF  reader on another system can compensate for both
the image gamma and the gamma of the reading system.

Gamma is  a term that relates to the typically nonlinear response
of most  display devices,  including monitors.   In  most display
systems, the  voltage applied to the CRT is directly proportional
to the  value of  the red,  green, or  blue sample.  However, the
resulting luminance  emitted by  the  phosphor  is  not  directly
proportional to  the voltage.   This relationship is approximated
in most displays by

     luminance = voltage ** gamma

The NTSC  standard gamma  of 2.2 adequately describes most common
video systems.  The standard  gamma of  2.2 implies a dim viewing
surround.   (We know of no SMPTE recommended practice for gamma.)
The following example uses an 8 bit sample with value of 127.

     voltage = 127 / 255 = 0.4980
     luminance = 0.4980 ** 2.2 = 0.2157

In the  examples below,  we only  consider a  single primary  and
therefore only a single curve.  The same analysis applies to each
of the  red, green,  and blue  primaries and  curves.   Also, and
without loss  of generality,  we assume that there is no hardware
color map, so that we must alter the pixel values themselves.  If
there is  a color  map, the  manipulations can be done on the map
instead of on the pixels.

If no  ColorResponseCurves field  exists in  a color  image,  the
reader should  assume a  gamma of  2.2 for each of the primaries.
This default curve can be generated with the following C code:

     ValuesPerSample = 1 << BitsPerSample;
     for (curve[0] = 0, i = 1; i < ValuesPerSample; i++)
          curve[i] = floor (pow (i / (ValuesPerSample - 1.0), 2.2)
                            * 65535.0 + .5);

The displaying  or rendering  application can know its own gamma,
which we  will call  the "destination  gamma."   (An uncalibrated
system can usually assume that its gamma is 2.2 without going too
far  wrong.)     Using   this  information  the  application  can
compensate for the gamma of the image, as we shall see below.

If  the  source  and  destination  systems  are  both  adequately
described  by   a  gamma  of  2.2,  the  writer  would  omit  the
ColorResponseCurves field,  and the  reader can  simply read  the
image directly into the frame buffer.  If a writer writes out the
ColorResponseCurves field,  then a  reader must  assume that  the
gammas  differ.    A  reader  must  then  perform  the  following
computation on each sample in the image:

     NewSampleValue = floor (pow (curve[OldSampleValue] / 65535.0,
                                  1.0 / DestinationGamma)
                             * (ValuesPerSample - 1.0) + .5);

Of course,  if the "gamma" of the destination system is not well-
approximated with  an exponential  function, an  arbitrary  table
lookup may  be used  in place  of raising  the  value  to  1.0  /
DestinationGamma.

Leave out  ColorResponseCurves if  using the default gamma.  This
saves about  1.5K in  the  most  common  case,  and,  after  all,
omission is the better part of compression.

Do not  use this  field to  store frame  buffer color  maps.  Use
instead   the    ColorMap   field.       Note,    however,   that
ColorResponseCurves may  be used  to refine  the information in a
ColorMap if desired.

The above  examples assume  that a  single parameter gamma system
adequately approximates the response characteristics of the image
source and  destination systems.   This will usually be true, but
our use  of a table instead of a single gamma parameter gives the
flexibility  to  describe  more  complex  relationships,  without
requiring additional computation or complexity.

VIII. New Tags and Changes

The following tags should be placed in the "Basic Fields" section
of the TIFF specification:

WhitePoint
Tag  = 318 (13E)
Type = RATIONAL
N    = 2

The white  point of the image.  Note that this value is described
using  the  1931  CIE  xyY  chromaticity  diagram  and  only  the
chromaticity is  specified.  The luminance component is arbitrary
and not  specified.   This can correspond to the white point of a
monitor that  the image  was painted  on,  the  filter  set/light
source combination  of a  scanner, or  to the  white point of the
illumination model of a rendering package.

Default is the SMPTE white point, D65:  x = 0.313, y = 0.329.

The ordering is x, y.

PrimaryChromaticities
Tag  = 319 (13F)
Type = RATIONAL
N    = 6

The primary  color chromaticities.   Note  that these  values are
described using  the 1931  CIE xyY  chromaticity diagram and only
the chromaticities  are  specified.    For  paint  images,  these
represent the  chromaticities of  the  monitor  and  for  scanned
images  they   are  derived  from  the  filter  set/light  source
combination of a scanner.

Default is the SMPTE primary color chromaticities:

     Red: x = 0.635 y = 0.340
     Green:    x = 0.305 y = 0.595
     Blue:     x = 0.155 y = 0.070

The ordering is red x, red y, green x, green y, blue x, blue y.

ColorResponseCurves

Default for  ColorResponseCurves represents  curves corresponding
to the NTSC standard gamma of 2.2.

IX. Defaults

The defaults  used by  TIFF reflect industry standards.  Both the
WhitePoint and  PrimaryChromaticities tags have defaults that are
promoted  by  SMPTE.    In  addition,   the  default   for   the
ColorResponseCurves tag matches the NTSC specification of a gamma
of 2.2.

The purpose  of these  defaults is to allow reasonable results in
the absence  of  accurate  colorimetry  data.    An  uncalibrated
scanner or  paint system produces an image that can be  displayed
identically, though  probably incorrectly, on two  different  but
calibrated systems.   This is better than the uncertain situation
where the  image might  be rendered  differently on two different
but calibrated systems.

X. Limitations and Issues

This section  discusses several  of the  limitations  and  issues
involved in colorimetric reproduction.

Scope of Usefulness

For many  purposes the  data recommended  here is unnecessary and
can be omitted.  For presentation graphics where there are only a
few colors,  being able  to tell  red from green is probably good
enough.   In this  case the  tags can  be ignored and there is no
overhead.   In more  demanding color  reproduction  environments,
this data  allows images to be described device independently and
at small cost.

User Burdens

The data we recommend isn't a user burden; it is really a systems
issue.   It allows  a systems  solution but  doesn't require user
intercession.   Calibration however  is a  separate issue.  It is
likely to involve the user.

Resolution Versus Fidelity

Some manufacturers  supply greater than 24 bits of resolution for
color specification.   The  purpose of  this is  either to  avoid
artifacts such  as contouring  in the shadows or in some cases to
be more  specific or  device independent  about the  color.  Both
reasons can  be misguided.   Other, less expensive techniques can
be used  to prevent artifacts, such as deeper color maps.  As for
accuracy, fidelity is more important than precision.

Colorimetric Color Reproduction

There are other choices for objectives of color reproduction [9].
Spectral color  reproduction is a stronger condition and most are
weaker, such  as preferred  color  reproduction.    While  device
independent spectral  color reproduction  is  impossible,  device
independent  colorimetric  reproduction  is  possible,  within  a
tolerance and within the limits of the gamuts of the devices.  By
choosing a strong criterion we allow the important objectives of
weaker criteria, such as preferred color reproduction, to be part
of design packages.

Metamerism

If two  patches of  color  are  identical  under  one  light  and
different under  another, they  are said  to be  metameric pairs.
Colorimetric  color  reproduction  is  a  weaker  condition  than
spectral color reproduction and hence allows metamerism problems.
By standardizing  the viewing  conditions we  can largely finesse
the metamerism  problem for  print.   Because television is self-
luminous and doesn't use spectral absorption, metamerism isn't so
much a problem.

Color Models - xyY Versus Luv, etc.

We choose  xyY over  Luv [1]  because XYZ  is  the  international
standard for  color specification  and xyY  is  the  chromaticity
diagram associated  with XYZ.   Luv is meant for color difference
measurement.

Ambient Environment And Viewing Surrounds

The viewing environment affects how the eye perceives color.  The
eye adapts  to a  dark room  and it adapts to a colored surround.
While  these   problems  can   be  compensated   for  within  the
colorimetric framework  [4], it is much better to finesse them by
standardizing.   The design environment should match the intended
viewing environment.   Specifically it should not be a pitch dark
room and,  on average,  it should  be of  a neutral  color.   For
print, ANSI recommends a Munsell N-8 surface [8].

XI. References

In particular,  we would  like to mention the work of Stuart Ring
of the  Copy Products  Division of the Eastman Kodak Company.  He
and  his  colleagues  are  promoting  a  color  data  interchange
paradigm.   They are  working closely  with the  ISO 8613 Working
Group [7].

[1]  Color Data  Interchange Paradigm,  Eastman Kodak, Rochester,
New York, 7 December 1987.

[2]  Color  Reproduction   and  Illumination  Models,  Roy  Hall,
International Summer  Institute:   State of  the Art  in Computer
Graphics, 1986.

[3]  CIE   Colorimetry:    Official   Recommendations    of   the
International Commission on Illumination, Publication 15-2, 1986.

[4]  Color Science:  Concepts and  Methods, Quantitative Data and
Formulae, Gunter  Wyszecki, W.S.  Stiles, John  Wiley  and  Sons,
Inc., New York, New York, 1982.

[5]  Color Monitor  Colorimetry, SMPTE  Recommended  Practice  RP
145-1987.

[6]  Color Temperature  for  Color  Television  Studio  Monitors,
SMPTE Recommended Practice RP 37.

[7]  Office   Document   Architecture   (ODA)   and   Interchange
Format - Addendum on Colour, ISO 8613 Working Draft.

[8]  ANSI Standard PH 2.30-1985.

[9]  The Reproduction  of Colour  in  Photography,  Printing  and
Television, R.  W. G.  Hunt, Fountain  Press, Tolworth,  England,
1987.

[10] Raster  Graphics   Handbook,  The  Conrac  Corporation,  Van
Nostrand Reinhold  Company, New  York,  New  York,  1985.    Good
description of gamma.

Appendix I:  Horizontal Differencing Predictor

This appendix,  written after  the release of Revision 5.0 of the
TIFF specification,  is still  in draft  form.   Please send  any
comments to the Aldus Developers Desk.

Revision 5.0  of the  TIFF specification defined a new tag called
"Predictor"  that  describes  techniques  that  may  be  used  in
conjunction with  TIFF compression  schemes.   We  now  define  a
Predictor that  greatly  improves  compression  ratios  for  some
images.

The horizontal  differencing predictor  is assigned the tag value
Predictor = 2:

Predictor
Tag  = 317 (13D)
Type = SHORT
N    = 1

A predictor  is a  mathematical operator  that is  applied to the
image
data before  the encoding  scheme is  applied.   Currently (as of
revision 5.0)  this tag  is used  only with  LZW  (Compression=5)
encoding, since  LZW is  probably the  only TIFF  encoding scheme
that benefits  significantly from a predictor step.  See Appendix
F.

1 = No prediction scheme used before coding.
2 = Horizontal differencing. See Appendix I.

Default is 1.

The algorithm

The idea  is to  make use  of the  fact that many continuous tone
images rarely  vary much  in pixel  value from  one pixel  to the
next.   In such  images,  if  we  replace  the  pixel  values  by
differences between  consecutive pixels,  many of the differences
should be  0, plus  or minus  1, and  so on.   This  reduces  the
apparent information  content, and  thus allows LZW to encode the
data more compactly.

Assuming 8-bit  grayscale  pixels  for  the  moment,  a  basic  C
implementation might look something like this:

     char image[ ][ ];
     int  row, col;

     /* take horizontal differences:
      */
     for (row = 0; row < nrows; row++)
          for (col = ncols - 1; col >= 1; col--)
               image[row][col] -= image[row][col-1];

If we  don't have 8-bit samples, we need to work a little harder,
so that  we can make better use of the architecture of most CPUs.
Suppose we  have 4-bit  samples, packed  two to a byte, in normal
TIFF uncompressed  (i.e., Compression=1)  fashion.   In order  to
find differences,  we want to first expand each 4-bit sample into
an 8-bit  byte, so  that we  have one  sample per byte, low-order
justified.   We then  perform the  above horizontal differencing.
Once the  differencing has  been completed, we then repack the 4-
bit differences  two to  a  byte,  in  normal  TIFF  uncompressed
fashion.
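
The expand, difference, and repack steps for one row of 4-bit
samples might be sketched as follows.  The function name is
hypothetical, and the sketch assumes that the first sample of
each byte occupies the high-order four bits.  If the row has an
odd number of samples, the unused low-order bits of the last
byte are set to zero.

```c
#include <stdint.h>
#include <stdlib.h>

/* Horizontally difference one row of ncols 4-bit samples, packed
   two to a byte, in place.  Returns 0 on success, -1 if memory
   for the expanded row cannot be allocated. */
int difference_row_4bit(uint8_t *packed, size_t ncols)
{
    uint8_t *s;
    size_t col;

    if (ncols == 0)
        return 0;
    s = malloc(ncols);
    if (s == NULL)
        return -1;

    /* Expand: one sample per byte, low-order justified. */
    for (col = 0; col < ncols; col++) {
        uint8_t byte = packed[col / 2];
        s[col] = (col & 1) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
    }

    /* Difference from right to left, keeping only 4 bits. */
    for (col = ncols - 1; col >= 1; col--)
        s[col] = (uint8_t)((s[col] - s[col - 1]) & 0x0F);

    /* Repack the 4-bit differences two to a byte. */
    for (col = 0; col < ncols; col += 2) {
        uint8_t hi = s[col];
        uint8_t lo = (col + 1 < ncols) ? s[col + 1] : 0;
        packed[col / 2] = (uint8_t)((hi << 4) | lo);
    }

    free(s);
    return 0;
}
```

For example, the samples 1, 2, 3, 4 (bytes 12 34 hex) become the
differences 1, 1, 1, 1 (bytes 11 11 hex).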

If the  samples are  greater than  8  bits  deep,  expanding  the
samples into  16-bit words  instead of 8-bit bytes seems like the
best way to perform the subtraction on most computers.

Note that we have not lost any data up to this point, nor will we
lose any  data later  on.   It  might  at  first  seem  that  our
differencing might  turn 8-bit samples into 9-bit differences, 4-
bit samples  into 5-bit differences, and so on.  But it turns out
that we  can completely  ignore the  "overflow"  bits  caused  by
subtracting a  larger number  from a  smaller  number  and  still
reverse the  process  without  error.    Normal  twos  complement
arithmetic does just what we want.  Try an example by hand if you
need more convincing.
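
The claim is easy to check in code as well as by hand.  The
sketch below differences one row of 8-bit samples and then
reverses the process; the function names are our own.  The
explicit casts to an 8-bit unsigned type discard the "overflow"
bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each sample by its difference from the sample to its
   left, working from right to left.  Arithmetic is modulo 256. */
void difference_row(uint8_t *row, size_t ncols)
{
    size_t col;

    if (ncols < 2)
        return;
    for (col = ncols - 1; col >= 1; col--)
        row[col] = (uint8_t)(row[col] - row[col - 1]);
}

/* Reverse the process: a running sum from left to right,
   again modulo 256. */
void undifference_row(uint8_t *row, size_t ncols)
{
    size_t col;

    for (col = 1; col < ncols; col++)
        row[col] = (uint8_t)(row[col] + row[col - 1]);
}
```

For the row 10, 200, 5, 5, 255 the differences are 10, 190, 61,
0, 250.  The value 61 is 5 - 200 = -195 taken modulo 256, and
adding it back to 200 gives 261, which is 5 modulo 256.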

Up  to  this  point  we  have  implicitly  assumed  that  we  are
compressing  bilevel   or  grayscale   images.     An  additional
consideration arises in the case of color images.

If PlanarConfiguration  is 2,  there is no problem.  Differencing
proceeds the same way as it would for grayscale data.

If  PlanarConfiguration  is  1,  however,  things  get  a  little
trickier.   If  we  didn't  do  anything  special,  we  would  be
subtracting red  sample values  from green  sample values,  green
sample values  from blue  sample values,  and blue  sample values
from red sample values, which would not give the LZW coding stage
much redundancy  to work  with.   So we  will do  our  horizontal
differences with  an offset  of SamplesPerPixel  (3, in  the  RGB
case).  In other words, we will subtract red from red, green from
green, and  blue from blue.  The LZW coding stage is identical to
the SamplesPerPixel=1 case.  We require that BitsPerSample be the
same for all 3 samples.
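
Differencing with an offset of SamplesPerPixel might look like
the following sketch for 8-bit samples.  The function name is
hypothetical.  With samples_per_pixel equal to 1 it reduces to
the grayscale case.

```c
#include <stddef.h>
#include <stdint.h>

/* Horizontally difference one row of interleaved 8-bit samples
   (PlanarConfiguration = 1) in place.  Each sample is replaced
   by its difference from the same component of the pixel to its
   left, so red is subtracted from red, green from green, and
   blue from blue. */
void difference_row_chunky(uint8_t *row, size_t npixels,
                           size_t samples_per_pixel)
{
    size_t n = npixels * samples_per_pixel;
    size_t i;

    if (npixels < 2 || samples_per_pixel == 0)
        return;
    for (i = n - 1; i >= samples_per_pixel; i--)
        row[i] = (uint8_t)(row[i] - row[i - samples_per_pixel]);
}
```

The samples of the first pixel in each row are left unchanged,
so that a reader can reverse the process with running sums.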

Results and guidelines

LZW without  differencing works  well  for  1-bit  images,  4-bit
grayscale images, and synthetic color images.  But natural 24-bit
color images  and some 8-bit grayscale images do much better with
differencing.  For example, our 24-bit natural test images hardly
compressed at  all using  "plain" LZW:  the  average  compression
ratio was  1.04  to  1.    The  average  compression  ratio  with
horizontal differencing  was 1.40  to 1.  (A compression ratio of
1.40 to 1 means that if the uncompressed image is 1.40MB in size,
the compressed version is 1MB in size.)

Although  the   combination  of   LZW  coding   with   horizontal
differencing does  not result  in any  loss of  data, it  may  be
worthwhile in  some situations  to give  up some  information  by
removing as  much noise  as possible  from the  image data before
doing the  differencing, especially  with  8-bit  samples.    The
simplest way  to get  rid of noise is to mask off one or two low-
order bits  of each 8-bit sample.  On our 24-bit test images, LZW
with horizontal differencing yielded an average compression ratio
of 1.4 to 1.  When the low-order bit was masked from each sample,
the compression  ratio climbed to 1.8 to 1; the compression ratio
was 2.4  to 1  when masking  two bits,  and 3.4 to 1 when masking
three bits.   Of  course, the  more you  mask, the  more you risk
losing useful information along with the noise.  We encourage you
to experiment  to find  the best compromise for your device.  For
some applications it may be useful to let the user make the final
decision.
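
Masking off low-order bits is a one-line operation per sample.
The sketch below, with a hypothetical function name, clears the
nbits low-order bits of each 8-bit sample before differencing.
Unlike the differencing itself, this step discards information
and cannot be reversed.

```c
#include <stddef.h>
#include <stdint.h>

/* Clear the nbits low-order bits of each 8-bit sample. */
void mask_low_bits(uint8_t *samples, size_t count, unsigned nbits)
{
    uint8_t mask;
    size_t i;

    if (nbits > 8)
        nbits = 8;
    mask = (uint8_t)(0xFFu << nbits);   /* e.g. nbits = 2 gives FC hex */
    for (i = 0; i < count; i++)
        samples[i] &= mask;
}
```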

Interestingly, most  of our RGB images compressed slightly better
using PlanarConfiguration=1.   One  might think  that compressing
the  red,   green,  and   blue   difference   planes   separately
(PlanarConfiguration=2) might  give  better  compression  results
than  mixing   the  differences   together   before   compressing
(PlanarConfiguration=1), but this does not appear to be the case.
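
To make the comparison concrete, here is a minimal sketch of the
two arrangements, assuming 8-bit samples, SamplesPerPixel=3, and
differences taken modulo 256 between like components (red from
red, and so on).  The function names and pixel values are
hypothetical.

```python
def difference_interleaved(row, samples_per_pixel=3):
    """PlanarConfiguration=1: samples are stored RGBRGB..., so each
    sample is differenced against the one samples_per_pixel back."""
    out = bytearray(row)
    for i in range(samples_per_pixel, len(row)):
        out[i] = (row[i] - row[i - samples_per_pixel]) & 0xFF
    return bytes(out)


def split_planes(row, samples_per_pixel=3):
    """Separate an interleaved row into one plane per component."""
    return [row[c::samples_per_pixel] for c in range(samples_per_pixel)]


def difference_plane(plane):
    """PlanarConfiguration=2: each plane is differenced on its own."""
    out = bytearray(plane)
    for i in range(1, len(plane)):
        out[i] = (plane[i] - plane[i - 1]) & 0xFF
    return bytes(out)


# Three hypothetical RGB pixels.
row = bytes([10, 20, 30,  12, 21, 30,  15, 23, 31])

mixed = difference_interleaved(row)
planes = [difference_plane(p) for p in split_planes(row)]
```

Both arrangements produce the same difference values; only the
order in which the compressor sees them changes.  Any gain or loss
in compression therefore comes from how well LZW handles that
ordering, not from the differencing itself.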

Incidentally,  we  tried  taking  both  horizontal  and  vertical
differences,  but   the  extra   complexity  of   two-dimensional
differencing did  not appear  to pay  off for  most of  our  test
images.  About one third of the images compressed slightly better
with two-dimensional  differencing, about  one  third  compressed
slightly worse, and the rest were about the same.

Appendix J:  Palette Color

This appendix,  written after  the release of Revision 5.0 of the
TIFF specification,  is still  in draft  form.   Please send  any
comments to the Aldus Developers' Desk.

Revision  5.0   of  the   TIFF  specification   defined   a   new
PhotometricInterpretation value  called "palette color."  We have
been wondering  lately if this additional complexity is worth the
implementation expense.   If  not, let's  drop it before too many
people start creating palette color images.

The Proposal

There seems to be no compelling reason why palette (mapped) color
images should not be stored as "full color" (usually 24-bit) RGB
images, instead of as a separate palette color image type.
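
For a TIFF writer, the conversion amounts to a table lookup per
pixel.  The following is a minimal sketch, assuming the palette is
already available as a list of 8-bit (red, green, blue) triplets;
the actual layout of the ColorMap field is given in the main body
of the specification and is not reproduced here.  The names are
hypothetical.

```python
def expand_palette(indices, palette):
    """Turn palette indices into interleaved 8-bit RGB samples."""
    out = bytearray()
    for index in indices:
        out.extend(palette[index])
    return bytes(out)


# A hypothetical three-entry palette and a four-pixel row.
palette = [(0, 0, 0), (255, 0, 0), (255, 255, 255)]
indices = [0, 1, 1, 2]

rgb = expand_palette(indices, palette)   # 3 bytes per pixel
```

Before compression the expanded row is three times as large as an
8-bit index row.  The next section discusses why that difference
largely disappears once LZW is applied.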

Objections

The most  obvious objection is the amount of space required.  But
if you  care about how much space the image takes up on disk, you
should use  LZW compression,  which is  ideally  suited  to  most
palette color  images.   (LZW compresses  most paint-type palette
color images  5:1 or  more.)   And if you use LZW compression, it
turns out  that palette  color images stored as full color images
compress to  almost exactly the same size as palette color images
stored as  palette color  images.  That is, with LZW compression,
there is  no penalty  for storing  palette color  images as  full
color RGB  images.   The resulting  file may  be  a  few  percent
larger, but  it will not be three times as large.  See Appendix F
for more information on how LZW works.

Another objection is that an application may want to process the
image differently if it is "really" a palette color image.  But we
can easily add auxiliary information that helps a TIFF reader to
quickly categorize color images if it wants to.  See the "New
tags" section below.

Benefits

It may  be obvious,  but it  is probably  worth discussing why we
want  to   abolish   palette   color   images   as   a   distinct
classification.

The main  problem is  that palette color as a separate type makes
life more  hazardous and  confusing for  users.    The  confusion
factor is  aggravated because  users already  have to be somewhat
aware of  distinctions  between  bilevel,  grayscale,  and  color
images.   Having two  main types of color images is hard for many
users to  grasp, and  it is probably not possible to totally hide
this complexity  from the user in certain situations.  The hazard
level goes  up because some applications may accept palette color
but not  full color  images, or  full color but not palette color
images, or may accept 8-bit palette color images but not 4-bit or
3-bit palette color images.

The second  problem is  that writing and maintaining code to deal
with an  additional image  type is  somewhat expensive  for  TIFF
readers.  The cost of supporting palette color images will depend
on the  application, but  we believe  that, in  general, the cost
will be  substantial.   It seems  to make  more sense  to put the
burden on  TIFF writers to convert palette color images into full
color image  form than  to make TIFF readers handle an additional
color image  type, since there are more TIFF readers than writers
at this point.

New tags

Here are  some proposed  new tags that can help to classify color
images, and  make up  for not  having a  separate  palette  color
class.  They are not required for TIFF Class R, but are strongly
recommended for  color TIFF  images created by palette-type color
paint programs.

ColorImageType
Tag  = 318 (13E)
Type = SHORT
N    = 1

Gives TIFF  color image  readers a  better idea  of what  kind of
color image it is.  There will be borderline cases.

1 = Continuous tone, natural image.
2 =  Synthetic image, using a greatly restricted range of colors.
Such images  are produced  by most  color paint  programs.    See
ColorList for a list of colors used in this image.

Default is 1.
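
One way a writer might choose the value is to count distinct
colors.  This is only a sketch: the 256-color threshold is an
assumption borrowed from the ColorList description, the names are
hypothetical, and borderline images may deserve a different rule.
It assumes interleaved 8-bit RGB data.

```python
CONTINUOUS_TONE = 1   # ColorImageType value 1
SYNTHETIC = 2         # ColorImageType value 2


def classify_color_image(rgb, max_colors=256):
    """Return a proposed ColorImageType for interleaved RGB bytes.

    The image is called synthetic if it uses at most max_colors
    distinct colors (an assumed threshold, not part of the tag)."""
    colors = set()
    for i in range(0, len(rgb) - 2, 3):
        colors.add(bytes(rgb[i:i + 3]))
        if len(colors) > max_colors:
            return CONTINUOUS_TONE
    return SYNTHETIC


# A hypothetical two-color paint image: 50 red pixels, 50 white.
paint = bytes([255, 0, 0] * 50 + [255, 255, 255] * 50)

# A hypothetical image with 300 distinct colors.
many = bytes(v for i in range(300) for v in (i % 256, i // 256, 0))

paint_type = classify_color_image(paint)
many_type = classify_color_image(many)
```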

ColorList
Tag  = 319 (13F)
Type = BYTE or SHORT
N    = the  number of  colors that  are used in this image, times
SamplesPerPixel

A list  of colors that are used in this image.  Use of this field
is only  practical for  images containing  a  greatly  restricted
(usually  less   than  or   equal  to   256)  range   of  colors.
ColorImageType should be 2.  See ColorImageType.

The list  is organized  as an array of RGB triplets, with no pad.
The RGB  triplets are  not guaranteed  to be  in  any  particular
order.  Note that the red, green, and blue components can each be
stored as either a BYTE or a SHORT.  BYTE should be sufficient for
most applications.

No default.
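
Building the list can be sketched as follows, assuming interleaved
8-bit RGB data and Type = BYTE.  The function name is hypothetical,
and the first-seen ordering is merely a convenience, since the
triplets are not guaranteed to be in any particular order.

```python
def build_color_list(rgb, samples_per_pixel=3):
    """Return the distinct colors of an interleaved image as one
    flat byte string of RGB triplets with no padding."""
    seen = {}
    last_start = len(rgb) - samples_per_pixel
    for i in range(0, last_start + 1, samples_per_pixel):
        seen.setdefault(bytes(rgb[i:i + samples_per_pixel]), None)
    return b"".join(seen)


# Four hypothetical pixels: red, red, white, black.
rgb = bytes([255, 0, 0,  255, 0, 0,  255, 255, 255,  0, 0, 0])

color_list = build_color_list(rgb)
n = len(color_list)   # number of colors times SamplesPerPixel
```

Here three colors are in use, so N is 9 when SamplesPerPixel is 3.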