PBF (Portable Bitmap Format) Specification, Third Draft

By Thomas Boutell, boutell@netcom.com, 1/15/1995
  Permission granted to reproduce this specification in complete
  and unaltered form. Excerpts may be printed with the
  following notice: "excerpted from the PBF
  (Portable Bitmap Format) specification by
  Thomas Boutell." No notice is required in software
  that follows this specification; notice is only required
  when reproducing or excerpting from the specification itself.

The author wishes to acknowledge the contributions of the New
Graphics Format mailing list and the readers of comp.graphics.
(Mr. Boutell is solely responsible for errors of fact or design
in the PBF specification, however.)

This is the third draft of the PBF specification discussion
document, replacing the second draft. There are many
significant changes from the previous drafts.

This draft is intended solely to generate comments and
does not represent the final standard.

1. Rationale

The PBF format is intended to provide a portable,
legally unencumbered, simple, lossless, streaming-capable,
well-specified standard for bitmapped image files which gives
new features to the end user at minimal cost to the developer.

It has been asked why the PBF format is not simply an
extension of the GIF format. The short answer is that the GIF
format is embroiled in legal disputes, does not support
24-bit images and lacks an alpha channel mechanism.

It has been asked why the PBF format is not TIFF, or
a subset of TIFF. The answer is that TIFF does not support
a compression scheme that is not legally encumbered,
and that a subset of TIFF would simply frustrate
users making the reasonable assumption that a file
saved as TIFF from Software XYZ will load into a
program supporting our flavor of TIFF. Implementing
full TIFF would violate the simplicity constraint.

It has been asked why the PBF format is not IFF,
or a sub- or superset of IFF. The same concern applies
as with TIFF: users with software that purports to
generate IFF files will not be pleased when those
files do not load in programs supporting the new
specification. In addition, the IFF specification
has rarely been accurately implemented and there
is considerable disagreement among implementations.

It has been asked why PBF does not include
lossy compression. The answer is that JPEG already does
an excellent job of lossy compression, and there is
no reason to repeat that effort. Different tools,
different jobs.

2. Design Differences from Other Formats

PBF has been expressly designed not to be completely
dependent on a single compression technique. Although
inflate/deflate compression is mentioned in this
document, PBF would still exist without it.

PBF supports an alpha channel instead of the
transparency-index approach used in GIF. An alpha
channel is much more flexible than a transparency
index, but can be just as simple in palette-color
images; conversion from one format to the other
can be accomplished easily and without loss
of transparency.

3. Data Representation Note

Byte Order

All integers which are not 1 byte integers will be in
network byte order, which is to say the most significant
byte comes first, and the less significant bytes in
descending order of significance (simply MSB LSB
for two-byte integers, B3 B2 B1 B0 for 4-byte
integers). References to bit 7 refer to the
highest bit (128) of a byte; references to
bit 0 refer to the lowest bit (1) of a byte.
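
As an illustration only, the byte-order and bit-numbering rules above
can be sketched in Python. The helper names (read_u16, read_u32, bit)
are hypothetical and are not part of this specification.

```python
import struct

def read_u16(data: bytes, offset: int = 0) -> int:
    """Decode a 2-byte unsigned integer stored as MSB, LSB."""
    return struct.unpack_from(">H", data, offset)[0]

def read_u32(data: bytes, offset: int = 0) -> int:
    """Decode a 4-byte unsigned integer stored as B3 B2 B1 B0."""
    return struct.unpack_from(">I", data, offset)[0]

def bit(byte: int, n: int) -> int:
    """Return bit n of a byte; bit 7 is worth 128, bit 0 is worth 1."""
    return (byte >> n) & 1
```

For example, the two bytes 01 02 (hexadecimal) decode to 258.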

Color Values and Gamma Correction

All color values range from zero (black) to the
maximum value, which is the most intense. Color values
are based on a flat gamma response of 1.0, and
display hardware with other gamma values should
compensate accordingly. A display with a gamma
response of 2.0 will render midlevel grays
too darkly if this is not compensated for.
(This is not at all uncommon.)

Thus, if your display hardware has a gamma
value of 2.0, color values should be converted
to values between 0 and 1 and raised to
the (1/2.0) power for use on the actual display.
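
A minimal sketch of this compensation, assuming integer color values
and rounding to the nearest representable level. The function name
and the rounding choice are assumptions made for illustration.

```python
def gamma_correct(value: int, max_value: int, display_gamma: float) -> int:
    """Compensate a linear (gamma 1.0) color value for a display gamma."""
    normalized = value / max_value                  # scale to 0..1
    corrected = normalized ** (1.0 / display_gamma)
    return round(corrected * max_value)             # back to 0..max_value
```

With a display gamma of 2.0, an 8-bit value of 64 (one quarter
intensity) is sent to the display as 128.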

4. The Format

The Identification Header

The first four bytes always contain the following
ASCII characters:

.PBF

(The dot is included to avoid confusion with files
such as this one which discuss PBF as opposed to
being PBF files themselves.)

The Main Section

The remainder of the file consists of a series of
chunks, where each chunk consists of a 4-byte
chunk type consisting of UPPERCASE
ASCII letters and spaces (ascii 32), a 4-byte, UNSIGNED
length (not including itself or the chunk type), and the
data bytes appropriate to that chunk, if any. Note that this
provides for a chunk to be skipped even if the implementation
does not recognize that particular chunk type. The last
chunk should always be an EOF chunk.
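
The chunk layout just described might be read as in the following
sketch. It is not a reference implementation. It assumes that the
three-letter EOF type is padded to four bytes with a trailing space,
which this draft does not state; iter_chunks is a hypothetical name.

```python
import io
import struct

SIGNATURE = b".PBF"

def iter_chunks(stream):
    """Yield (chunk_type, data) pairs from a binary PBF stream."""
    if stream.read(4) != SIGNATURE:
        raise ValueError("missing .PBF identification header")
    while True:
        header = stream.read(8)
        if not header:
            return                      # clean end of stream
        if len(header) < 8:
            raise ValueError("truncated chunk header")
        chunk_type = header[:4].decode("ascii")
        (length,) = struct.unpack(">I", header[4:])  # network byte order
        data = stream.read(length)
        if len(data) != length:
            raise ValueError("truncated chunk data")
        yield chunk_type, data
        # ASSUMPTION: "EOF" is stored padded with a space.
        if chunk_type.rstrip() == "EOF":
            return
```

Because the length precedes the data, a reader can step over any
chunk without understanding its contents.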

Note also that the same chunk can appear more than
once if necessary, if so specified in the description
of the chunk. This is sometimes necessary in order
to implement streaming encoders.

The chunk-ordering mechanism present in the first two
drafts has been dropped. Instead, rules regarding chunk
order are stated in the description of each chunk.

Ancillary and Critical Chunks

Chunks which are not strictly necessary in order to
meaningfully display the contents of the file are known as
"ancillary" chunks, and their names must begin with
a capital "A" character.

Chunks which are critical to the successful display of the
file's contents begin with any other uppercase letter.

Critical chunks are necessary in order to properly
display the contents of the file. If an implementation
encounters a critical chunk type it does not know
how to handle, it must indicate this to the user and
not display the contents of the file. The image header chunk
(HEAD) is an example of a critical chunk.

A hypothetical vector-graphics chunk would also be a critical
chunk, since without rendering it the image would appear
to be blank, or would contain a background bitmap
with no other information.

Ancillary chunks carry supplementary information that enhances
the image in some fashion, but without which the image
can still be successfully displayed. Examples are the
comment and copyright chunks.
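
One way a reader might apply these rules is sketched below. The set
of known chunk types, the exception class and the function names are
hypothetical; "EOF " with a trailing space is an assumption about how
the three-letter type is padded.

```python
KNOWN_CHUNKS = {"HEAD", "PLTE", "IDAT", "ACPY", "ACMT", "EOF "}

class UnknownCriticalChunk(Exception):
    """Raised when a critical chunk type is not understood."""

def is_ancillary(chunk_type: str) -> bool:
    """Ancillary chunk types begin with a capital A."""
    return chunk_type.startswith("A")

def disposition(chunk_type: str, known=KNOWN_CHUNKS) -> str:
    """Decide whether to process, skip, or refuse a chunk."""
    if chunk_type in known:
        return "process"
    if is_ancillary(chunk_type):
        return "skip"        # safe to ignore; the image still displays
    # Unknown critical chunk: tell the user and do not display the file.
    raise UnknownCriticalChunk(chunk_type)
```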

Proprietary Chunks

If you want others outside your organization to understand
a chunk type that you invent, CONTACT THE AUTHOR
OF THE PBF SPECIFICATION (boutell@netcom.com) and
specify the format of the chunk's data and your
preferred chunk type. The author will assign a permanent,
unique chunk type. The chunk type will be publicly listed
in an appendix of extended chunk types which can be
optionally implemented. In the event that Mr. Boutell
is unable to maintain the specification, the task will
be passed on to a qualified volunteer.

If you do not require or desire that others outside your
organization understand the chunk type, you may
use a chunk name beginning with Q (for critical
chunks) or with AQ (for ancillary chunks).
Chunk types with these prefixes
will never be assigned in the public specification.
Please note that if you want to use these chunks for
information that is not essential to view the image,
and have any desire whatsoever that others not using your
internal viewer software be able to view the image,
you should use AQ rather than Q. Also note that
others may use the same proprietary prefixes,
so it would be advantageous to keep additional
identifying information at the beginning of
the chunk.
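
The suggestion to keep identifying information at the beginning of a
proprietary chunk could look like the sketch below. The tag value and
function names are invented for illustration; this draft does not
define any layout for such a tag.

```python
def is_proprietary(chunk_type: str) -> bool:
    """Q marks a private critical chunk, AQ a private ancillary one."""
    return chunk_type.startswith("Q") or chunk_type.startswith("AQ")

def owned_payload(chunk_type: str, data: bytes, vendor_tag: bytes):
    """Return the payload if the chunk carries our tag, else None."""
    if not is_proprietary(chunk_type):
        return None
    if not data.startswith(vendor_tag):
        return None          # some other organization's private chunk
    return data[len(vendor_tag):]
```

A writer using the hypothetical tag b"ACME1" would prepend it to the
chunk data, and its own reader would ignore private chunks lacking it.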

Standard Chunks

All PBF implementations must accept the following
chunk types in order to be considered
PBF-compliant. All implementations must understand
and successfully render the critical chunks below.
Standalone image viewers should also be capable of
displaying the ancillary chunks below, such as the copyright
notice, but this is not necessary for applications in which
many images may be displayed at once (e.g.,
WWW browsers).

Chunk Type    Description

HEAD          Bitmapped image header

              Width:            4 bytes
              Height:           4 bytes
              Bit depth:        1 byte
              Color type:       1 byte
              Compression type: 1 byte
              Interlace type:   1 byte

              Width and height are 4-byte integers.

              Bit depth is a single-byte integer. Valid values
              that software must support are 1, 2, 4, and 8.
              A value of 16 is also valid, but support for
              this is optional. Software that does not support
              a bit depth of 16 should acknowledge this if
              possible rather than indicating that the
              image is at fault. Bit depths of 16 should
              in any case never appear with color type 1.

              Color type is a single-byte integer. Valid values
              are 1, 2, 3 and 4. Color type determines the
              interpretation of the image data.

              Color Type  Valid Bit Depths  Interpretation
              1           1,2,4,8           Each pixel value is a palette
                                            index; a palette chunk will appear

              2           1,2,4,8,16        Each pixel value is a grayscale
                                            level, where the largest value is
                                            white, and zero is black

              3           8,16              Each pixel value is a three-value
                                            series: red (0 = black, max = red),
                                            green (0 = black, max = green),
                                            blue (0 = black, max = blue)

              4           8,16              Each pixel value is a four-value
                                            series: red (0 = black, max = red),
                                            green (0 = black, max = green),
                                            blue (0 = black, max = blue),
                                            alpha (0 = transparent,
                                            max = opaque)
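
              The field layout and the table above can be checked
              mechanically, as in the following sketch. It assumes
              that the HEAD data is exactly the 12 bytes listed;
              the names Head and parse_head are hypothetical.

```python
import struct
from typing import NamedTuple

class Head(NamedTuple):
    width: int
    height: int
    bit_depth: int
    color_type: int
    compression_type: int
    interlace_type: int

# Valid bit depths per color type, copied from the table above.
VALID_BIT_DEPTHS = {
    1: {1, 2, 4, 8},
    2: {1, 2, 4, 8, 16},
    3: {8, 16},
    4: {8, 16},
}

def parse_head(data: bytes) -> Head:
    """Decode HEAD chunk data and validate color type and bit depth."""
    if len(data) != 12:      # ASSUMPTION: no extra bytes are allowed
        raise ValueError("HEAD data must be 12 bytes")
    head = Head(*struct.unpack(">IIBBBB", data))
    allowed = VALID_BIT_DEPTHS.get(head.color_type)
    if allowed is None:
        raise ValueError("invalid color type %d" % head.color_type)
    if head.bit_depth not in allowed:
        raise ValueError("bit depth %d not valid for color type %d"
                         % (head.bit_depth, head.color_type))
    return head
```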

              Compression type indicates the compression scheme
              which will be used to compress the image data.

              This draft proposes use of the inflate/deflate compression
              scheme, an LZ77 derivative which is used in zip, gzip, pkzip
              and related programs, because extensive research has been done
              supporting its legality. Inflate and deflate code
              is available in the zip/unzip packages with a very
              permissive license (yes, permissive enough for
              commercial purposes, see those packages for details).

              At present, only compression type 0 (inflate/deflate
              compression) is defined. At present, all standard PBF
              images will be compressed using this scheme.

              Interlace Type

              At present, there are two legal values for
              interlace type: 0 (no interlace) or 1
              (line-wise interlace).

              With interlace type 0, rows are laid out
              continuously from top to bottom.

              With interlace type 1, rows are stored in the
              following order:

              Every eighth row, starting from row 0
              Every eighth row, starting from row 4
              Every fourth row, starting from row 2
              Every second row, starting from row 1

              The purpose of this feature is to allow images
              to "fade in" in a simple fashion that does
              minimal damage to compression efficiency,
              although the file size is slightly expanded
              on average.
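
              The row order for interlace type 1 can be generated
              as in this sketch (the function name is hypothetical):

```python
def interlaced_row_order(height: int) -> list:
    """Return row numbers in the order interlace type 1 stores them."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]   # (first row, step)
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order
```

              For an image 8 rows high, the rows are stored in the
              order 0, 4, 2, 6, 1, 3, 5, 7. Every row appears
              exactly once.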

              Other interlace types have been proposed, and will
              replace this scheme in the final proposal if the gain
              in visual quality is sufficient to outweigh any compression
              penalties.

PLTE          Palette

              This chunk must appear for color type 1, and
              may appear for color types 3 and 4. In the latter
              two cases, the palette chunk is optional, and
              provides a recommended set of 2 to 256 colors to
              which the truecolor image should be quantized if the
              display hardware cannot display truecolor
              directly.

              The number of palette entries varies from 2 to 256.
              For color type 1, the number of entries should not
              exceed the range that can be represented by the
              bit depth (for example, 2^4 = 16 for a bit depth of 4).
              Note that this does NOT mean that there have to
              be a full 16 entries. The length of the chunk is used
              to determine the number of entries.

              For color type 1, each palette entry consists of a
              four-byte series:

                     red (0 = black, 255 = red),
                     green (0 = black, 255 = green),
                     blue (0 = black, 255 = blue),
                     alpha (0 = transparent, 255 = opaque)

              Image creation programs are strongly encouraged
              to place colors which the artist or algorithm
              regards as important first in the palette, when
              such information is available, in order
              to allow display hardware with a limited supply of
              colors to make intelligent compromises.

              For color types 3 and 4, in which the palette is
              optional and only a suggested quantization,
              the fourth byte (alpha) is NOT present; each
              palette entry consists of a three-byte series:

                     red (0 = black, 255 = red),
                     green (0 = black, 255 = green),
                     blue (0 = black, 255 = blue)

              (Note that the palette chunk uses single-byte values
              for each channel even if the palette is a
              suggested quantization of a 16-bit image.)
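
              A sketch of reading the palette under these rules
              follows. Rejecting a palette for color type 2 is an
              assumption, since this draft does not mention that
              case; parse_palette is a hypothetical name.

```python
def parse_palette(data: bytes, color_type: int, bit_depth: int) -> list:
    """Split PLTE chunk data into per-entry tuples of channel bytes."""
    if color_type == 1:
        entry_size = 4       # red, green, blue, alpha
    elif color_type in (3, 4):
        entry_size = 3       # red, green, blue only
    else:
        raise ValueError("no palette defined for this color type")
    if len(data) % entry_size:
        raise ValueError("palette length is not a whole number of entries")
    count = len(data) // entry_size   # the chunk length gives the count
    if not 2 <= count <= 256:
        raise ValueError("palette must have 2 to 256 entries")
    if color_type == 1 and count > 2 ** bit_depth:
        raise ValueError("more entries than the bit depth can index")
    return [tuple(data[i:i + entry_size])
            for i in range(0, len(data), entry_size)]
```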

ACPY          Copyright notice. The notice will consist of
              ISO LATIN-1 text and will not be null-terminated.
              New lines should be denoted by a single
              line feed (10 decimal).

ACMT          Comment. The comment will consist of
              ISO LATIN-1 text and will not be null-terminated.
              New lines should be denoted by a single
              line feed (10 decimal).

IDAT          Image data.

              The image data will be compressed using the
              compression scheme indicated by the compression
              type field of the HEAD chunk.

              IMPORTANT: multiple image chunks can appear in
              sequence for the SAME image. Viewers must be able
              to interpret such chunks. (Simply speaking, the
              viewer knows it is not finished until it has read
              as many pixels as are indicated by the
              image dimensions in the HEAD chunk.) This rule
              exists to permit encoders to work in a fixed
              amount of memory by outputting multiple chunks.
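
              A viewer can work out in advance how many bytes of
              uncompressed data to expect, as in the sketch below.
              It assumes that rows are packed without padding, so
              that only the final byte of the image may be partly
              used; the names are hypothetical.

```python
# Values stored per pixel for each color type.
SAMPLES_PER_PIXEL = {1: 1, 2: 1, 3: 3, 4: 4}

def expected_data_bytes(width: int, height: int,
                        bit_depth: int, color_type: int) -> int:
    """Return the uncompressed size implied by the HEAD fields."""
    total_bits = width * height * SAMPLES_PER_PIXEL[color_type] * bit_depth
    return (total_bits + 7) // 8      # round up to a whole byte
```

              The viewer keeps feeding successive IDAT chunks to the
              decompressor until this many bytes have been produced.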

              The following text describes the uncompressed
              data stream which will be fed to the compressor
              or received from the decompressor.

              Pixels are always laid out left to right in
              each row, and rows are arranged from
              top to bottom, except as modified by
              the interlace type field of the HEAD chunk.

              Color types 1 and 2

              For color type 1, each pixel value is an index into the
              palette indicating which color in the palette should be
              displayed at that location. For color type 2 (grayscale),
              each pixel value is a grayscale level, where the maximum
              value representable by the bit depth is white.

              For 1-bit images, each horizontal line of pixels is represented
              by a stream of bits, in which bit 7 (128) is the
              leftmost pixel in the byte and bit 0 (1) is the
              rightmost. Consecutive lines may share a byte if the
              pixels in the line do not fit evenly into bytes.
              That is, if the last pixel of the line falls
              in bit 4 of a byte, the first pixel of the next
              line is stored in bit 3 of the same byte.

              For 2-bit images, the same scheme is followed, except that
              each pixel is represented by a 2-bit portion
              of a byte, with the leftmost bit being most
              significant. For instance, the first pixel
              of the line is represented by bits 7 (128) and
              6 (64) of the byte. Consecutive lines may share bytes.

              For 4-bit images, the same scheme is followed, except
              that each pixel is represented by a 4-bit portion
              of a byte, with the leftmost bit being most
              significant. For instance, the first pixel
              of the line is represented by bits 7 (128),
              6 (64), 5 (32) and 4 (16) of the byte.
              Consecutive lines may share bytes.

              For 8-bit images, each pixel is represented by a single
              byte. For 16-bit grayscale images (color type 2),
              each pixel is represented by a two-byte unsigned integer.
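
              The packing rules for color types 1 and 2 might be
              decoded as in the following sketch. Rows are returned
              in stored order, so an interlaced image would still
              need its rows rearranged. The function name is
              hypothetical.

```python
def unpack_samples(data: bytes, width: int, height: int,
                   bit_depth: int) -> list:
    """Unpack single-channel pixel values into a list of rows."""
    count = width * height
    needed = (count * bit_depth + 7) // 8
    if len(data) < needed:
        raise ValueError("not enough image data")
    if bit_depth == 16:
        # Two bytes per pixel, most significant byte first.
        values = [(data[i] << 8) | data[i + 1]
                  for i in range(0, 2 * count, 2)]
    else:
        per_byte = 8 // bit_depth
        mask = (1 << bit_depth) - 1
        values = []
        for index in range(count):
            byte = data[index // per_byte]
            # The leftmost pixel occupies the most significant bits.
            shift = 8 - bit_depth * (index % per_byte + 1)
            values.append((byte >> shift) & mask)
    # Rows are cut from one continuous stream, so lines share bytes.
    return [values[r * width:(r + 1) * width] for r in range(height)]
```

              For a 1-bit image 4 pixels wide and 2 rows high, the
              single byte 10100101 (binary) unpacks to the rows
              1 0 1 0 and 0 1 0 1.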

              IMPORTANT:

              For 8- and 16-bit grayscale images (color type 2, bit depth
              of 8 or 16), the values are next input to the CROSS filter
              (for non-interlaced images; see below) or to the SUB filter
              (for interlaced images; see below) in order to improve
              compression before being input to the compressor itself.
              This step is NOT employed for palette color images
              (color type 1).

              Color types 3 and 4

              For color type 3, each pixel is represented by
              a red value, a green value, and a blue value,
              each 8 or 16 bits wide according to the bit
              depth. For color type 4, an additional alpha
              (opacity) value of the same depth is added
              for each pixel.

              IMPORTANT:

              The values are next input to the CROSS filter
              (for non-interlaced images; see below) or to the SUB filter
              (for interlaced images; see below) in order to improve
              compression before being input to
              the compressor itself.

EOF           End of File

              The EOF chunk appears at the end of the PBF file.
              The chunk contains a four-byte checksum, calculated
              by adding together ALL preceding bytes in the file, not
              including the checksum itself. Bytes are added
              modulo 2^32 as unsigned integers to the 4-byte unsigned
              integer checksum (this is the natural outcome when
              unsigned bytes are added to a four-byte integer
              without regard to overflow). If the EOF checksum
              does not match the sum of the preceding bytes in
              the file, viewers may elect to attempt to display
              the contents of the file, but must warn the user
              that the checksum is incorrect.
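
              The checksum can be computed as in this sketch. It
              assumes that the checksum is stored in network byte
              order, per the Byte Order note, and that it occupies
              the last four bytes of the file. The function names
              are hypothetical.

```python
def pbf_checksum(preceding: bytes) -> int:
    """Sum all preceding bytes as unsigned values, modulo 2^32."""
    return sum(preceding) & 0xFFFFFFFF

def checksum_matches(file_bytes: bytes) -> bool:
    """Compare the stored checksum with the sum of the earlier bytes."""
    stored = int.from_bytes(file_bytes[-4:], "big")
    return stored == pbf_checksum(file_bytes[:-4])
```

              When checksum_matches returns False, a viewer may
              still attempt to display the image but must warn the
              user.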

Details of Specific Algorithms

Inflate and Deflate

See the zip/unzip package, which includes source code for
both purposes in the files inflate.c and deflate.c, with a
very permissive license. Documentation of the compression
scheme is also available; see the zip/unzip package for
references. (zip/unzip and pkzip are compatible but
not identical. pkzip is commercial software.)

The Cross Filter

The cross filter is used to improve compression on non-interlaced
truecolor images (color types 3 and 4) and 8- and 16-bit
grayscale images (color type 2).

Output the following value, using unsigned
modulo arithmetic and integers of the size
appropriate to the bit depth (8 or 16):

Pixel[x][y] - Pixel[x-1][y] - Pixel[x][y-1] + Pixel[x-1][y-1]

for each channel (red, green, blue, and sometimes alpha) of each pixel.
At the beginning of the image, the previous pixel and previous row
are considered to have had a value of zero for each channel.

To reverse the effect of the cross filter after decompression,
output the following value, again using unsigned modulo arithmetic:

CrossedValue + Pixel[x-1][y] + Pixel[x][y-1] - Pixel[x-1][y-1]

storing the result as the value of the previous pixel for
use in uncrossing subsequent pixels.
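
The cross filter and its inverse might be sketched as follows for a
single channel, stored as a list of rows so that Pixel[x][y] above is
rows[y][x] here. Treating every neighbor that falls outside the image
as zero, including the left neighbor at the start of later rows, is
an assumption; the function names are hypothetical.

```python
def cross_filter(rows: list, bit_depth: int) -> list:
    """Apply the cross filter to one channel of an image."""
    modulus = 1 << bit_depth
    out = []
    for y, row in enumerate(rows):
        above = rows[y - 1] if y > 0 else [0] * len(row)
        out_row = []
        for x, value in enumerate(row):
            left = row[x - 1] if x > 0 else 0
            up_left = above[x - 1] if x > 0 else 0
            out_row.append((value - left - above[x] + up_left) % modulus)
        out.append(out_row)
    return out

def cross_unfilter(filtered: list, bit_depth: int) -> list:
    """Reverse cross_filter, rebuilding pixels row by row."""
    modulus = 1 << bit_depth
    rows = []
    for y, f_row in enumerate(filtered):
        above = rows[y - 1] if y > 0 else [0] * len(f_row)
        row = []
        for x, crossed in enumerate(f_row):
            left = row[x - 1] if x > 0 else 0
            up_left = above[x - 1] if x > 0 else 0
            row.append((crossed + left + above[x] - up_left) % modulus)
        rows.append(row)
    return rows
```

For truecolor images the same functions would be applied to each
channel separately.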

The Sub Filter

The sub filter is used to improve compression on interlaced
truecolor images (color types 3 and 4) and 8- and 16-bit
grayscale images (color type 2).

For each pixel, output the difference between that pixel
and the previous pixel, modulo the range possible in
that bit depth. For instance, for a bit depth of 8,
if the previous pixel was 16 and the current pixel
is 64, store 48. If the previous pixel was 255 and
the current pixel is 20, store 21, since (20 - 255)
modulo 256 is 21; the decoder recovers 20 as
(255 + 21) modulo 256. Note that unsigned arithmetic
is used. IMPORTANT: At the start of each line,
consider the previous pixel value to be zero.
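
A sketch of the sub filter for one channel of one line follows; the
function names are hypothetical. For a bit depth of 8, a previous
pixel of 255 followed by a current pixel of 20 yields
(20 - 255) modulo 256 = 21.

```python
def sub_filter_row(row: list, bit_depth: int) -> list:
    """Replace each value with its difference from the previous one."""
    modulus = 1 << bit_depth
    previous = 0                 # reset at the start of each line
    out = []
    for value in row:
        out.append((value - previous) % modulus)
        previous = value
    return out

def sub_unfilter_row(filtered: list, bit_depth: int) -> list:
    """Reverse sub_filter_row by accumulating the differences."""
    modulus = 1 << bit_depth
    previous = 0
    out = []
    for diff in filtered:
        previous = (previous + diff) % modulus
        out.append(previous)
    return out
```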

The Alpha Channel

Standalone image viewers can ignore the alpha channel,
provided that they properly skip over it in order to
be in the right position to read the next pixel.

World Wide Web browsers and the like should regard any pixel
with an alpha channel value of zero as transparent (the pixel
should be given the background color of the browser), and
any pixel with the maximum alpha channel value for that
bit depth as opaque (not blending with the background at all).
Intermediate values should blend according to the percentage
of maximum specified.
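
One reading of "blend according to the percentage of maximum" is a
linear mix of the image pixel and the background, sketched below for
one channel. The linear formula, the rounding, and the assumption
that the background has the same bit depth are illustrative choices
rather than requirements of this draft.

```python
def blend(foreground: int, background: int, alpha: int,
          bit_depth: int) -> int:
    """Mix one channel of a pixel with the background color."""
    max_value = (1 << bit_depth) - 1
    weighted = foreground * alpha + background * (max_value - alpha)
    return (weighted + max_value // 2) // max_value   # round to nearest
```

An alpha of zero returns the background unchanged, and the maximum
alpha returns the image pixel unchanged.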

<URL:http://sunsite.unc.edu/boutell/index.html>