Tuesday, May 26, 2009

Yet another PreviewImage strategy

One big (huge, actually) problem with JPEG images is that they are limited to segment sizes of less than 64 kB. Since the EXIF information must go in one segment, this forces camera manufacturers to invent their own ways of storing larger data blocks. This is a real problem for the preview image, which many manufacturers write in their JPEGs, and which can easily push the EXIF data size to over 64 kB.
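To make the size limit concrete: each JPEG segment starts with a two-byte marker followed by a two-byte big-endian length that counts itself, so a payload can never exceed 65533 bytes. Here is a minimal Python sketch that walks the header segments of a file. The name iter_segments is mine, and the sketch assumes a well-formed file with no fill bytes or standalone markers before the scan data.

```python
import struct

# The length field is 16 bits and includes its own two bytes.
MAX_SEGMENT_PAYLOAD = 0xFFFF - 2  # 65533 bytes


def iter_segments(data: bytes):
    """Yield (marker, payload) for each segment before the scan data.

    Simplified sketch: assumes no fill bytes or standalone markers
    appear between SOI and SOS.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows, stop here
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length
```

Running this over a camera JPEG and printing the marker and payload length of each segment shows at a glance how close the EXIF APP1 segment (marker 0xE1) is to the limit.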

The problem is that every manufacturer is using a different technique to store the image.

Canon, Olympus, Konica/Minolta and Sony cameras write the PreviewImage after the JPEG EOI. This technique allows a contiguous preview to be stored, but trailers like this are typically lost if the image is edited. So this solution is not ideal.
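For the curious, finding a trailer means locating the real end of the image, which takes a bit more than searching for the bytes FF D9 (an embedded thumbnail has its own EOI). The following Python sketch skips header segments by their length and then steps through the compressed scan data. The function name find_trailer is my own, and how each manufacturer lays out the preview inside the trailer is not covered here.

```python
def find_trailer(data: bytes) -> bytes:
    """Return the bytes that follow the EOI marker of the main image."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xD9:  # EOI: anything after this is a trailer
            return data[pos + 2:]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        # Skip the segment using its length field, so an EOI inside an
        # embedded thumbnail is never mistaken for the real one.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + length
        if marker == 0xDA:  # SOS: entropy-coded scan data follows
            while pos + 1 < len(data):
                nxt = data[pos + 1]
                # Inside a scan, FF 00 is a stuffed byte and FF D0..D7 are
                # restart markers; any other FF xx ends the scan.
                if (data[pos] == 0xFF and nxt not in (0x00, 0xFF)
                        and not 0xD0 <= nxt <= 0xD7):
                    break
                pos += 1
    raise ValueError("no EOI marker found")
```

An editor that rewrites the file only up to the EOI drops everything this function returns, which is exactly why trailers get lost.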

Nikon, Pentax and Casio keep the PreviewImage small enough to fit inside the EXIF APP1 segment (less than about 30 kB), which makes the images so small that they are only useful as thumbnails.

Kodak writes the image to a special APP2 FPXR segment, which is actually part of the EXIF specification, but the format is a Microsoft-devised abomination that nobody in their right mind would ever think of using. Oh, except FujiFilm, who write the image in this segment, but don't bother to write the table of contents needed to read it using the standard technique.

I have just discovered that the new Samsung cameras recently started embedding preview images larger than 64 kB, and of course they created a new technique to do so. If they were smart, they would have developed a simple technique that could be used by others in the future, but of course they were stupid, and didn't think that far ahead. (Such is the normal path of dumb camera manufacturers when it comes to metadata.)

The new Samsung models simply split the preview and write it to separate APP2 segments with no header. If they had written a header (like "PREVIEW\0" for example), then the technique could be portable and useful. But they didn't. Without a header, the data cannot easily be distinguished from other random APP2 data, so this technique is not generally useful.

Why not just use APP2 with a simple "PREVIEW\0" header? If everyone did this, life would be much simpler.
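Just to show how little it would take, here is a rough Python sketch of the idea. To be clear, this is hypothetical: no camera writes this format, and the function names and the choice to simply concatenate the chunks in file order are my own assumptions.

```python
HEADER = b"PREVIEW\x00"  # hypothetical identifier, not used by any camera
# The 16-bit length counts itself (2 bytes) plus the header and the chunk.
MAX_CHUNK = 0xFFFF - 2 - len(HEADER)  # 65525 bytes of preview per segment


def build_preview_segments(preview: bytes) -> bytes:
    """Split a preview into consecutive APP2 segments, each with the header."""
    out = bytearray()
    for start in range(0, len(preview), MAX_CHUNK):
        chunk = preview[start:start + MAX_CHUNK]
        length = 2 + len(HEADER) + len(chunk)
        out += b"\xff\xe2" + length.to_bytes(2, "big") + HEADER + chunk
    return bytes(out)


def extract_preview(jpeg: bytes) -> bytes:
    """Rejoin the preview from APP2 segments that carry the header."""
    parts = []
    pos = 2  # skip the SOI marker
    while (pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF
           and jpeg[pos + 1] not in (0xDA, 0xD9)):  # stop at SOS or EOI
        marker = jpeg[pos + 1]
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        payload = jpeg[pos + 4:pos + 2 + length]
        # The header is what separates preview data from other APP2
        # content such as an ICC profile.
        if marker == 0xE2 and payload.startswith(HEADER):
            parts.append(payload[len(HEADER):])
        pos += 2 + length
    return b"".join(parts)
```

The whole point is the startswith check: with the header in place, a reader can pick out the preview chunks and ignore every other APP2 segment, and an editor that preserves unknown segments keeps the preview intact.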