Hemera Photo-Object

Many years ago I dabbled in a little Graphic Design. Working for a commercial printer in the Pre-Press area, I was very familiar with all things graphics, but never had a great talent for design, especially drawing. I often needed the random clip art for a design I was working on, so I purchased the Hemera, The Big Box of Art, probably from my local CompUSA if that dates me.

Hemera Big Box of Art

The cool thing about clip art from Hemera is it was not your usual JPG or TIFF format, it was in a special Photo-Object format. This format included the raster image, but also included a mask or alpha channel for the main object. They marketed this format as an alternative to the sometimes larger formats of the day. GIF files didn’t have the color depth and PNG was new enough, Hemera was probably hoping this format would be the next greatest thing to happen to clip art.

A Hemera Photo-Object has the extension HPI. Lets take a closer look at a file and see what is under the hood. I pulled this file from Disc 1 on Archive.org

The HPI file has a unique header which should make identification really easy. But what do we see starting at offset 32? A JFIF! Just after a 32 byte header the file has a standard JPG file hidden inside. Now a standard JPG file does not have the ability to support an alpha channel so there must be something else they have within to mask this file. Lets look for the EOF file marker for the JPG format.

Well, well, well. It appears the JPG file is then followed by a standard PNG! Sneaky. The entire HPI file is a 32 byte HPI header, a JPG, followed by a PNG. One could easily carve out each of the formats and save as separate files if needed. There is a script you can use to do this for you, written by Ed Halley. The original Hemera software won’t run on modern systems.

Hemera had a good run for about 10 years before selling off their assets in 2004 to another stock image company. At one point Hemera even purchased the rights to all of Corel’s Premium photo library which I covered in my article about the Kodak PhotoCD format.

Corel ArtShow

File extensions are the easiest way to quickly identify a file format, but they can be misleading. This is the reason in Digital Preservation format identification tools like DROID are important to look closer at the file structure to more accurately identify formats. The other complication is some extensions are used for more than one format. Extensions like .DOC or .ISO can be used with many formats.

The PRONOM registry which DROID uses will list extensions associated with each format signature, but for some, they only have an extension and no signature. It’s nice to have an official ID to go with a format but with no signature it only matches based on extension.

This caused a problem awhile back for me while working with some files with the extension CDX. Which according to PRONOM, there are 5 completely different formats which use the extension, and probably others.

My CDX was related to some indexing software called Cindex. At the time the only format with a signature was for the WARC summary file CDX. The other was for a CorelDraw Compressed format with no signature. Confusing right? When I would run format identification on my Cindex files, they would default to the CorelDraw Compressed format, identified by extension. It was easy enough to create a signature for the Cindex format as I had enough samples to know the patterns needed for correct identification. But I was curious about the CorelDraw format. Should be easy to find, right?

Wrong. Finding a sample of this format was very elusive. All I had to go by was the name given to the format by PRONOM and the extension. I scoured every Corel CD and image I could get my hands on. For months I looked and could never find a single CDX file. Each CorelDraw software I was able to run did not have any ability to save in the CDX format. I scoured clipart discs, other Corel software, like Designer, PrintHouse, Photo-Paint, nada, nothing. I started to wonder if the format even existed. That’s when I noticed in the filters included with CorelDraw a reference to the ability to import a CDX but not write to one.

[CDX]
Signature=CORELFILTER - A
FilterEntry=1
Description=CorelDRAW Compressed (CDX)
FilterFullName=CorelDRAW Import Filter
Version=Version 6.00
Company=Corel Corporation
Copyright=Copyright © 1988-1995 Corel Corporation
Extensions=*.CDX
CorelID=0x704
FilterCapability1=0x9000
FilterCapability2=0x0
NoOfCompressions=0

This led to me finding a reference on the old Corel FTP site for knowledge base number 4550.

It mentioned something called ArtShow, where version 5 supported the file format CDX. ArtShow was a gallery of winning designs released on a CD-ROM and book each year. The first one being ArtShow 91, then ArtShow 3, 4, 5, 6, and finally 7 was the last. Each one released used a different proprietary compressed format for storing all the designs, these formats exist nowhere else. The question remains, why didn’t they use other popular Corel formats like CDR, CMX, or CCX which were used on many other clip art titles.

It took some time but I was finally able to find copies of a few of the Artshow CD-ROM discs, especially numbers 5 & 6. Which had the CDX format and the second generation CPX formats.

Each format had a easy to recognize header making a PRONOM signature easy to create. PRONOM already had the PUID for the two formats CDX & CPX, so sending in the signature added to the registry and hopefully will help distinguish between all the CDX formats!