BE DEAD

April 24, 2026 by Thor

If you remember the older post about Cafe Beef, you’ll appreciate the file format we explore in this post, which uses the hex values “BE DEAD”. I guess they jinxed themselves, because the software didn’t survive a refresh in 2009 and died. At one point the software was well regarded, earning 4.5 Mice from Macworld Magazine in August 2002.

When a colleague reached out to me recently with a file they were not familiar with, I jumped in. I love a good challenge. The file had no extension, but was thought to have come from a Windows system. With a little digging I was able to identify the file as a Now Contact file. Now Contact has both Windows and Macintosh versions, but with no extension, my money was on the file coming from the Mac.

I started my search with the obvious, the first few bytes. Since I only had one file, I wasn’t sure if this would be helpful, but looking at the bytes, I figured it was significant.

% hexdump -C "CONTACT FILE" | head
00000000  be de ad 01 00 00 00 03  00 1d 5f a9 00 7d d1 8e  |.........._..}..|
00000010  00 00 0e d8 98 89 d7 4b  00 8f 31 fd 00 00 3b f0  |.......K..1...;.|
00000020  00 de 76 56 be de ad 00  e6 02 b4 af 63 64 62 68  |..vV........cdbh|
00000030  00 00 00 00 00 00 01 8e  90 8f 56 13 00 0d 09 b4  |..........V.....|
00000040  00 0d 0b e0 00 0d 0c 50  00 0d 0b e8 00 0d 0c 58  |.......P.......X|
00000050  00 00 00 00 ba f3 fa eb  00 00 27 12 00 00 00 00  |..........'.....|
00000060  e6 02 b4 76 00 00 00 00  00 00 00 00 00 00 00 00  |...v............|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

The first three bytes are “BE DE AD”, and “BE DEAD” seems to be done on purpose. A quick search on the web showed no results, no mention of this unique header. I even turned to AI, asking Grok if it knew the source of this byte sequence. It had no idea. I began digging through the file looking for clues to its software source. The ASCII text I could see indicated some sort of customer database, and the file name “CONTACT FILE” seemed to confirm it. I found some dates from 2002 and started looking at popular CRM and PIM software at the time. I then found a reference to a note the user left saying they opened the file on a different Power Tower Pro. I owned one of these clones back in college, so I immediately knew they were using a Macintosh! A quick search of popular contact management software from the early 2000s revealed a few suspects. I took a look at a product from Now Software, Now Up-to-Date & Contact version 3.9, and I found the header I was looking for! Had the file sent to me retained its extended attributes from the Mac, I would have found this software much quicker.

Now Software has been around since 1990 and was purchased at one point by PowerOn Software. Now Up-to-Date came around in 1992, but Now Contact wasn’t added until 1994. Version 1.0 of the software was standalone and was popular, but simple. When Now Software bundled it with the Now Up-to-Date software in 1995, they skipped version 2 to keep the version numbers in sync.

The Now Contact software has a few functions, including a word processor, but let’s stick to the contact manager for now. Let’s take a look at a sample file from version 1.

% hexdump -C "Sample Contact File" | head 
00000000  00 00 4c 47 00 00 00 03  a8 ee f6 c6 a8 ef 9b 95  |..LG............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  12 bc 00 2b 73 74 61 74  00 00 00 01 00 00 00 04  |...+stat........|
00000110  00 00 00 fc 73 74 61 74  00 00 00 02 00 00 01 00  |....stat........|
00000120  00 00 06 44 4b 6e 44 42  00 00 00 00 00 00 07 44  |...DKnDB.......D|
00000130  00 00 04 68 4b 6e 44 42  00 00 00 01 00 00 0b ac  |...hKnDB........|
00000140  00 00 06 12 4b 6e 44 42  00 00 00 02 00 00 13 ac  |....KnDB........|
00000150  00 00 00 70 4b 6e 44 42  00 00 00 03 00 00 15 ac  |...pKnDB........|
00000160  00 00 00 b6 4b 6e 44 42  00 00 00 05 00 00 16 64  |....KnDB.......d|

This file does not have the “BE DE AD” header, but something else. I do see a repeated pattern of the text “KnDB” which also happens to be the Type code used on the Macintosh.

% getfileinfo "Sample Contact File"
type: "KnDB"
creator: "NIC!"
attributes: avbstClinmedz
created: 10/23/1993 13:57:10
modified: 10/24/1993 11:18:58

Another sample

% hexdump -C NC1-s02 | head
00000000  00 00 36 20 00 00 00 03  e6 03 e7 1a e6 03 e7 41  |..6 ...........A|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  0c dc 00 26 73 74 61 74  00 00 00 01 00 00 00 04  |...&stat........|
00000110  00 00 00 fc 73 74 61 74  00 00 00 02 00 00 01 00  |....stat........|
00000120  00 00 06 44 4b 6e 44 42  00 00 00 00 00 00 07 44  |...DKnDB.......D|
00000130  00 00 04 68 4b 6e 44 42  00 00 00 01 00 00 0b ac  |...hKnDB........|
00000140  00 00 00 00 4b 6e 44 42  00 00 00 02 00 00 0b ac  |....KnDB........|
00000150  00 00 00 20 4b 6e 44 42  00 00 00 03 00 00 0b cc  |... KnDB........|
00000160  00 00 00 b6 4b 6e 44 42  00 00 00 05 00 00 0c 84  |....KnDB........|

These version 1 files don’t seem to have a static header, but they do have common byte sequences. I will need to make more samples to construct a proper signature.
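The byte sequences shared by the two samples above can be turned into a rough check. This is only a sketch based on those two files: the offsets (“stat” at 0x104 and 0x114, “KnDB” at 0x124) come straight from the hexdumps, but the assumption that they hold for every version 1 file is mine, and the function names are made up for illustration.

```python
# Offsets and tags observed in the two version 1 samples above.
# Assumption: other version 1 files share them, which still needs more samples.
V1_MARKERS = [
    (0x104, b"stat"),
    (0x114, b"stat"),
    (0x124, b"KnDB"),
]


def looks_like_now_contact_v1(data: bytes) -> bool:
    """Return True if every observed marker sits at its expected offset."""
    return all(data[off:off + len(tag)] == tag for off, tag in V1_MARKERS)


def check_file(path: str) -> bool:
    """Read just enough of the file to test the markers."""
    with open(path, "rb") as fh:
        return looks_like_now_contact_v1(fh.read(0x128))
```

A real signature would probably need fewer or different anchors once more samples are available.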

Now Contact skipped version 2 so the next version to be released was 3.0. What do these files look like?

% hexdump -C "Sample Contact File" | head
00000000  be de ad 01 00 00 00 03  00 00 a7 62 00 00 03 fc  |...........b....|
00000010  00 00 00 19 00 01 6e 9a  00 00 36 94 00 00 00 55  |......n...6....U|
00000020  00 01 42 ea be de ad 00  b4 25 c0 65 00 01 63 6b  |..B......%.e..ck|
00000030  00 00 00 87 00 00 00 6c  fc 63 8e a8 6f 62 6a 65  |.......l.c..obje|
00000040  00 00 00 81 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 01 00 00 00 80 00 00  00 36 63 6d 6e 74 00 00  |.........6cmnt..|
00000060  00 80 00 00 00 00 00 00  00 00 00 00 00 00 00 03  |................|
00000070  63 64 61 74 00 00 00 04  b4 25 c0 65 63 74 68 74  |cdat.....%.ectht|
00000080  00 00 00 00 63 73 65 6c  00 00 00 04 00 00 00 00  |....csel........|
00000090  be de ad 00 b4 25 c0 a4  44 4c 54 7a 00 00 00 ed  |.....%..DLTz....|

They have the same header as the file I received. Let’s try and open my file in Now Contact 3.9.

Oops, that didn’t work. There must be something in my file which tells the software it is from a newer version. After some digging in the file I can see some possible version text.

% hexdump -C "CONTACT FILE"
00000000  be de ad 01 00 00 00 03  00 1d 5f a9 00 7d d1 8e  |.........._..}..|
00000010  00 00 0e d8 98 89 d7 4b  00 8f 31 fd 00 00 3b f0  |.......K..1...;.|
00000020  00 de 76 56 be de ad 00  e6 02 b4 af 63 64 62 68  |..vV........cdbh|
00000030  00 00 00 00 00 00 01 8e  90 8f 56 13 00 0d 09 b4  |..........V.....|
00000040  00 0d 0b e0 00 0d 0c 50  00 0d 0b e8 00 0d 0c 58  |.......P.......X|
00000050  00 00 00 00 ba f3 fa eb  00 00 27 12 00 00 00 00  |..........'.....|
00000060  e6 02 b4 76 00 00 00 00  00 00 00 00 00 00 00 00  |...v............|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000003a0  00 50 00 21 00 f1 01 cf  00 00 00 82 00 00 00 34  |.P.!...........4|
000003b0  6b 65 79 73 00 00 00 82  00 00 00 00 00 00 00 00  |keys............|
000003c0  00 01 00 00 00 01 6b 65  79 68 00 00 00 15 76 34  |......keyh....v4|
000003d0  30 30 00 00 00 01 00 00  00 00 00 00 00 00 00 00  |00..............|
000003e0  00 00 00 00 be de ad 00  ba 42 cd 51 05 00 64 62  |.........B.Q..db|
000003f0  00 00 00 02 00 00 00 18  00 00 00 00 00 00 00 00  |................|
00000400  00 be de ad 00 b9 d0 15  aa 02 00 66 6c 00 00 00  |...........fl...|
00000410  14 00 00 00 2d b1 0b 27  34 76 34 30 30 00 00 00  |....-..'4v400...|

The file has some repeated text with v400. Sure enough, the file opens in version 4 with no problems. I am able to view all the contacts and even export them as a CSV. Looking at a sample file from a version 4 install confirms the version information.

% hexdump -C "Sample Contact File" | head
00000000  be de ad 01 00 00 00 03  00 07 5c 06 00 01 23 40  |..........\...#@|
00000010  00 00 00 10 00 17 9c 42  00 01 10 92 00 00 00 63  |.......B.......c|
00000020  00 10 37 3a be de ad 00  b7 39 82 7b 00 01 63 6b  |..7:.....9.{..ck|
00000030  00 00 00 80 00 00 5d 62  52 66 a1 a9 6f 62 6a 65  |......]bRf..obje|
00000040  00 00 00 86 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 06 00 00 00 81 00 00  00 34 6b 65 79 73 00 00  |.........4keys..|
00000060  00 81 00 00 00 00 00 00  00 00 00 00 00 00 00 01  |................|
00000070  6b 65 79 68 00 00 00 15  76 34 30 30 00 00 00 01  |keyh....v400....|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  00 82 00 00 00 b4 6e 6f  74 65 00 00 00 82 00 00  |......note......|
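The header and version text seen in these dumps could be checked with a few lines of Python. This is a minimal sketch, not a tested signature. It assumes the version tag always follows “keyh” and four more bytes, as it does in the two version 4 dumps above, and that the tag is a “v” followed by three digits. The function names are hypothetical.

```python
import re
from typing import List

MAGIC = b"\xbe\xde\xad"

# Assumption: the tag sits right after "keyh" plus a 4-byte field,
# as in "keyh 00 00 00 15 v400" in the dumps above.
VERSION_AFTER_KEYH = re.compile(rb"keyh.{4}(v\d{3})", re.DOTALL)


def has_bedead_header(data: bytes) -> bool:
    """True if the file starts with the BE DE AD bytes."""
    return data.startswith(MAGIC)


def find_version_tags(data: bytes) -> List[str]:
    """Return the distinct version tags found after a 'keyh' marker."""
    tags = {m.decode("ascii") for m in VERSION_AFTER_KEYH.findall(data)}
    return sorted(tags)
```

A file with the header but no tag might be a version 3 file, but that is a guess until more version 3 samples are checked.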

Now Software updated the software for a few years in the early 1990’s. There were Windows versions as well, and the format is the same except for one detail.

% hexdump -C NC452-Win-s01.NCT | tail
00004ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 08 00 32  |...............2|
00005000  00 32 01 90 01 90 00 00  00 08 00 4f 00 32 02 01  |.2.........O.2..|
00005010  03 11 00 00 01 00 00 00  01 18 00 00 00 18 00 00  |................|
00005020  00 32 00 00 00 00 00 00  00 00 00 1c 00 32 00 00  |.2...........2..|
00005030  64 65 52 65 00 00 00 0a  00 01 ff ff 00 00 00 0c  |deRe............|
00005040  00 00 00 00 00 00 4e fa  69 da 88 7b 00 00 50 54  |......N.i..{..PT|
00005050  77 69 6e 73                                       |wins|

It appears that in version 4 the final bytes indicate the platform, either “wins” or “macs”. This continued in version 5, which came out in 2005.

% hexdump -C NC501-s01.nct | head
00000000  be de ad 01 00 00 00 03  00 00 02 cf 00 00 1e 14  |................|
00000010  00 00 00 11 00 00 5c 93  00 00 43 7e 00 00 00 50  |......\...C~...P|
00000020  00 00 59 da 00 00 00 34  00 00 00 2c 00 00 05 5e  |..Y....4...,...^|
00000030  00 00 02 cc be de ad 00  e6 02 e9 89 01 00 64 62  |..............db|
00000040  00 00 00 02 00 00 00 18  00 00 00 00 be de ad 00  |................|
00000050  e6 02 e9 91 63 64 62 68  00 00 00 00 00 00 01 8e  |....cdbh........|
00000060  be 63 10 4d 00 7f 63 c8  00 7f 65 64 00 7f 66 d0  |.c.M..c...ed..f.|
00000070  00 7f 66 bc 00 00 00 00  00 01 00 00 e6 02 e9 8a  |..f.............|
00000080  00 00 27 11 00 00 00 00  e6 02 e9 91 00 00 00 00  |..'.............|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00005ad0  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00005ae0  01 00 00 00 00 00 00 00  00 1e 00 00 00 00 00 00  |................|
00005af0  00 00 00 1c 00 1e ff ff  00 00 00 00 00 00 00 00  |................|
00005b00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 08  |................|
00005b10  00 32 00 32 02 4f 03 11  00 00 59 da 57 7e a0 f3  |.2.2.O....Y.W~..|
00005b20  00 00 5b 28 6d 61 63 73                           |..[(macs|
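Reading the platform from the end of the file is easy to sketch. The two tags come from the dumps above. The mapping to platform names and the function names are my own, and I am assuming the tag is always the very last four bytes.

```python
import os
from typing import Optional

# Tags seen at the end of the version 4 and 5 samples above.
PLATFORM_TAGS = {b"wins": "Windows", b"macs": "Macintosh"}


def platform_from_trailer(data: bytes) -> Optional[str]:
    """Map the last four bytes to a platform name, or None if unrecognized."""
    return PLATFORM_TAGS.get(data[-4:])


def platform_of_file(path: str) -> Optional[str]:
    """Read only the last four bytes of a file and look up the tag."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() < 4:
            return None
        fh.seek(-4, os.SEEK_END)
        return platform_from_trailer(fh.read(4))
```

Version 3 files may not carry this tag at all, so a result of None does not rule out a Now Contact file.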

Also in the version 5 samples, we still see the v400 text, so it appears the format was not changed.

% hexdump -C /Volumes/File\ Formats/Now/NC531-s01.nct       
00000000  be de ad 01 00 00 00 03  00 00 04 90 00 00 1e 14  |................|
00000010  00 00 00 11 00 00 54 83  00 00 43 7e 00 00 00 50  |......T...C~...P|
00000020  00 00 53 5e 00 00 00 34  00 00 00 2c 00 00 05 5e  |..S^...4...,...^|
00000030  00 00 02 cc be de ad 00  e6 04 44 0e 01 00 64 62  |..........D...db|
00000040  00 00 00 02 00 00 00 18  00 00 00 00 be de ad 00  |................|
00000050  e6 04 44 5c 63 64 62 68  00 00 00 00 00 00 01 8e  |..D\cdbh........|
00000060  41 f7 ad 20 cc 8e b5 01  74 8f b5 01 e8 00 b6 02  |A.. ....t.......|
00000070  e4 00 b6 02 00 00 00 00  01 00 00 00 0e 44 04 e6  |.............D..|
00000080  11 27 00 00 00 00 00 00  5c 44 04 e6 00 00 00 00  |.'......\D......|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 be de  |................|
000001f0  ad 00 e6 04 44 0e 02 00  66 6c 00 00 00 03 00 00  |....D...fl......|
00000200  00 2d b1 0b 27 34 76 34  30 30 00 00 00 01 00 00  |.-..'4v400......|
00000210  00 00 00 00 00 00 00 00  00 00 00 be de ad 00 e6  |................|
00000220  04 44 0e 02 00 66 6c 00  00 00 05 00 00 00 2d b1  |.D...fl.......-.|
00000230  0b 27 34 76 34 30 30 00  00 00 01 00 00 00 00 00  |.'4v400.........|

Now Up-to-Date & Contact released version 5.3 around 2008 which finally provided support for Intel processors. It was the last version released before Now Software attempted a full re-write of the software in 2009 named Now X (code-named “NightHawk”). The software did not receive good reviews and by 2010 the company ceased operations. So far I have come up empty in getting a copy of this doomed version, but I will update this post if I am able to get my hands on a copy.

For now, you can take a look at some sample files on GitHub, where I will also add some PRONOM signatures soon.

iView

February 13, 2026 by Thor

It seems to be a common theme through the history of software that some titles get bought, sold, rebranded, integrated, and discontinued by a number of companies. I find it interesting to learn about a popular software title’s humble beginnings. Often when a piece of software gets bought, the file formats don’t change much, at least at first.

A little shareware program called iView was started by a company called Script Software in 1996. They later changed their name to Plum Amazing. iView then became iView Multimedia, then iView MediaPro, before it was bought by Microsoft, who changed the name to Expression Media. After a couple of years the software was bought by Phase One and then discontinued. Let’s take a look at the history.

iView, according to their website in 1997, is simply the easiest and fastest way to view and catalog pictures for the Mac. The software initially only worked on the Macintosh and the Catalog file it produced did not have an extension. But they did have a Type/Creator code. A catalog produced by version 2 of the iView software was IVWc/IVW2.

% hexdump -C iView2-s01 | head
00000000  00 00 00 05 30 32 35 69  47 4f 53 58 3a 4c 69 62  |....025iGOSX:Lib|
00000010  72 61 72 79 3a 41 70 70  6c 69 63 61 74 69 6f 6e  |rary:Application|
00000020  20 53 75 70 70 6f 72 74  3a 41 70 70 6c 65 3a 69  | Support:Apple:i|
00000030  43 68 61 74 20 49 63 6f  6e 73 3a 46 72 75 69 74  |Chat Icons:Fruit|
00000040  3a 47 72 65 65 6e 20 41  70 70 6c 65 2e 67 69 66  |:Green Apple.gif|
00000050  03 46 44 63 00 00 0f ef  03 46 44 63 08 93 65 58  |.FDc.....FDc..eX|
00000060  00 01 5c 50 00 01 5a c8  68 ff f7 40 08 93 65 4b  |..\P..Z.h..@..eK|
00000070  08 13 9a c0 ff d1 3a 80  00 a3 c8 a0 00 00 28 00  |......:.......(.|
00000080  00 05 48 64 00 00 a0 24  00 00 39 ec 00 00 00 0a  |..Hd...$..9.....|
00000090  08 93 65 64 44 00 00 24  3d 14 51 84 3d 9d 74 bc  |..edD..$=.Q.=.t.|

The iView format is a proprietary binary format used to store a catalog of multimedia formats with their metadata and thumbnail. The media viewer had support for quite a few popular formats. The file seems to have paths to each of the files it has cataloged, so some of these iView files can get pretty large.

In 2003 the iView software was ported to Windows, which brought a formal extension to the catalog format. This was also when the iView software made the switch from the classic MacOS to MacOSX, where extensions were encouraged. iView had two different editions, a standard shareware version and a Media Pro version, each with its own version numbers. iView MediaPro was not compatible with Macintosh 68K machines or systems earlier than 8.6. The last Media Pro version was 3.8.6. You can get most of the old software versions here.

% hexdump -C iViewPro302-s01.ivc | head
00000000  00 00 00 00 30 32 35 69  46 53 4d 21 00 00 00 2e  |....025iFSM!....|
00000010  66 6c 64 72 00 00 00 2e  00 00 00 00 00 00 00 06  |fldr............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  3c 72 6f 6f 74 3e 42 4c  |........<root>BL|
00000040  44 4f 00 00 00 0c 31 00  02 00 00 00 01 01 00 00  |DO....1.........|
00000050  00 00 55 53 46 33 00 00  00 02 01 03 43 4d 52 53  |..USF3......CMRS|
00000060  00 00 01 ed 01 00 00 02  0a 01 00 00 00 00 00 00  |................|
00000070  00 02 f2 01 00 00 00 00  00 00 00 00 a2 01 00 00  |................|
00000080  00 00 02 01 03 00 00 00  a1 01 00 00 00 00 00 00  |................|
00000090  00 00 48 00 00 00 00 00  00 00 00 00 03 01 00 00  |..H.............|

This time there is an extension, IVC, but with a familiar pattern at the beginning: the string 025i, hex values “30323569”, at offset 4. The iView files from previous versions have the same bytes, but only Media Pro version 2 & 3 files match an existing PRONOM identification.

% sf iViewPro302-s01.ivc 
filename : 'iViewPro302-s01.ivc'
filesize : 3757
modified : 2025-09-17T17:39:27-06:00
errors   : 
matches  :
  - ns      : 'pronom'
    id      : 'fmt/647'
    format  : 'Microsoft Expression Media'
    version : '2'
    mime    : 
    class   : 'Presentation'
    basis   : 'extension match ivc; byte match at [[4 4] [3737 16]]'

These are iView Media Pro files, so why are they identifying as Microsoft Expression Media files? That is because Microsoft bought iView Media Pro on June 27, 2006. Microsoft rebranded the software as Expression Media, not to be confused with Expression Studio. It was available for Windows and Macintosh, but not everyone was happy with the purchase. Version 1 of Expression Media was released the next year and was a free upgrade for iView Media Pro users. The format doesn’t appear to have changed much at all. In fact, an iView Media Pro 3 file with no content and an Expression Media 1 file are practically identical.

% hexdump -C Expression1-s01.ivc | head
00000000  00 00 00 00 30 32 35 69  46 53 4d 21 00 00 00 2e  |....025iFSM!....|
00000010  66 6c 64 72 00 00 00 2e  00 00 00 00 00 00 00 06  |fldr............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  3c 72 6f 6f 74 3e 42 4c  |........<root>BL|
00000040  44 4f 00 00 00 0c 31 00  02 00 00 00 01 01 00 00  |DO....1.........|
00000050  00 00 55 53 46 33 00 00  00 02 01 03 43 4d 52 53  |..USF3......CMRS|
00000060  00 00 01 ed 01 00 00 02  0a 01 00 00 00 00 00 00  |................|
00000070  00 02 f2 01 00 00 00 00  00 00 00 00 a2 01 00 00  |................|
00000080  00 00 02 01 03 00 00 00  a1 01 00 00 00 00 00 00  |................|
00000090  00 00 48 00 00 00 00 00  00 00 00 00 03 01 00 00  |..H.............|

The next year brought a version 2 of Expression Media, often found bundled with a Special Edition of Office 2008 for Mac, but also a standalone product for Windows. But the catalog format remained the same.

% hexdump -C Expression2-s01.ivc | head       
00000000  00 00 00 04 30 32 35 69  3a 43 3a 5c 44 4f 43 55  |....025i:C:\DOCU|
00000010  4d 45 7e 31 5c 41 4c 4c  55 53 45 7e 31 5c 44 4f  |ME~1\ALLUSE~1\DO|
00000020  43 55 4d 45 7e 31 5c 4d  59 50 49 43 54 7e 31 5c  |CUME~1\MYPICT~1\|
00000030  53 41 4d 50 4c 45 7e 31  5c 57 69 6e 74 65 72 2e  |SAMPLE~1\Winter.|
00000040  6a 70 67 00 00 00 00 00  00 00 00 00 00 00 00 00  |jpg.............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

Even though all of these versions have the same 4 bytes at the beginning, not all of them match the current PRONOM signature. fmt/647 is specifically for Expression Media version 2 files, but also identifies iView Media Pro 2 & 3 and Expression Media 1 files. It doesn’t identify earlier files because the signature is also looking for some bytes near the end of the file.

% hexdump -C iViewPro302-s01.ivc | tail       

00000e90  00 00 00 00 00 00 00 00  00 53 56 61 72 00 00 00  |.........SVar...|
00000ea0  04 00 00 01 f4 30 32 35  69 00 00 00 08           |.....025i....|

The same 4 bytes appear near the end of the file as well. There is also a string used in the signature at the end, “SVar”. I am not sure what the string is used for, but it is not in earlier versions.

% hexdump -C iView157-01 | tail 

00000420  00 00 00 00 00 00 00 00  00 00 00 00 30 32 35 69  |............025i|
00000430  00 00 00 08                                       |....|

And the even earlier versions are missing the “025i” at the end.

% hexdump -C iView2-s01 | tail

000062b0  2a ae ed d4 1a eb d4 04  c4 88 76 88 c4 d6 d4 04  |*.........v.....|
000062c0  c4 79 69 79 c4 d6 d4 04  c4 78 67 78 c4 ec d4 04  |.yiy.....xgx....|
000062d0  81 d4 f1 d4 00 ff                                 |......|
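The three endings shown above can be told apart with a small check. This is a sketch built only from these three tail dumps: it assumes “025i” sits eight bytes before the end when present, and that “SVar” falls within the last 32 bytes. The labels and function name are mine, not anything from PRONOM.

```python
def classify_iview_trailer(data: bytes) -> str:
    """Label a catalog by what appears at the end of the file."""
    # In the samples above, "025i" is followed by the bytes 00 00 00 08 at EOF.
    has_025i = data[-8:-4] == b"025i"
    # "SVar" shows up shortly before the trailing "025i" in the MediaPro 3 sample.
    has_svar = b"SVar" in data[-32:]

    if has_025i and has_svar:
        return "SVar+025i"      # like iViewPro302-s01.ivc
    if has_025i:
        return "025i only"      # like iView157-01
    return "no trailer tag"     # like iView2-s01
```

Only the first group matches the existing fmt/647 signature, which lines up with the siegfried result shown earlier.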

Microsoft Expression Media was short lived. Microsoft decided to sell off the software to Phase One in 2010. Phase One is the developer of Capture One, a professional photo editing program. It makes sense they would want a cataloging tool to go with their flagship product. Phase One retained the name Media Pro from the original iView Media Pro software.

Phase One took the software and did make modifications, starting with the extension used to store the catalogs. They also decided to adjust the format slightly, changing the “025i” bytes to “030i”.

% hexdump -C PhaseOneMediaProv1.mpcatalog | head 
00000000  00 00 00 05 30 33 30 69  4a 4d 61 63 31 30 37 3a  |....030iJMac107:|
00000010  4c 69 62 72 61 72 79 3a  41 70 70 6c 69 63 61 74  |Library:Applicat|
00000020  69 6f 6e 20 53 75 70 70  6f 72 74 3a 41 70 70 6c  |ion Support:Appl|
00000030  65 3a 69 43 68 61 74 20  49 63 6f 6e 73 3a 46 72  |e:iChat Icons:Fr|
00000040  75 69 74 3a 47 72 65 65  6e 20 41 70 70 6c 65 2e  |uit:Green Apple.|
00000050  67 69 66 00 00 00 00 00  00 00 00 00 00 00 00 00  |gif.............|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
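Since both families keep their tag at offset 4, a quick check can separate them. This is a minimal sketch based on the dumps in this post. The family labels are my own wording, and other tag values may exist that I have not seen.

```python
from typing import Optional

# Four-byte tags found at offset 4 in the samples shown in this post.
HEADER_TAGS = {
    b"025i": "iView / Expression Media",
    b"030i": "Phase One Media Pro",
}


def catalog_family(data: bytes) -> Optional[str]:
    """Return the catalog family for the tag at offset 4, or None."""
    return HEADER_TAGS.get(data[4:8])
```

The first four bytes differ between samples (00, 04, 05), so they are ignored here.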

The Phase One Media Pro software uses the extension MPCATALOG, but can also open the older IVC catalogs.

% sf PhaseOneMediaProv1.mpcatalog 

filename : 'PhaseOneMediaProv1.mpcatalog'
filesize : 21353
modified : 2025-09-16T20:37:07-06:00
errors   : 
matches  :
  - ns      : 'pronom'
    id      : 'fmt/648'
    format  : 'Media View Pro'
    version : 
    mime    : 
    class   : 'Presentation'
    basis   : 'extension match mpcatalog; byte match at [[4 4] [21329 16]]'

MPCATALOG files are identified in PRONOM using a signature similar to the one used for the IVC files, although the name of the format isn’t quite right. MediaPro is probably a better name.

So it seems the identification is already available in PRONOM for the later MediaPro files, both iView MediaPro and Expression Media, and a second identification for the PhaseOne catalog. So we will need to either adjust the identification to include the earlier iView versions and adjust the names or we can create a new signature for the older versions. It would be good to find out what version added the change to the format, but with all the different software versions, it might be hard to nail down.

Enjoy some samples.

Scrivener

March 21, 2025 by Thor

Word Processors are everywhere and have some of the most recognizable file formats. Some are very simple in that they just contain plain text, others are more complex. There are formats which allow for images and others which can handle different languages and writing directions.

A writing platform I recently learned about is called Scrivener. It was first released in 2007 by a company called Literature & Latte Ltd, and has a Macintosh and Windows version. The software is marketed toward writers, as there are some features that help with note taking, research and much more. It also allows for adding multimedia and even full webpages.

This is accomplished by a file format which uses a non-traditional method for storing all the data needed to render the format.

tree Scrivener3-s01.scriv
Scrivener3-s01.scriv
├── Files
│   ├── Data
│   │   ├── 921B4A08-54C0-4B69-94FD-428F56FDAB89
│   │   │   └── content.rtf
│   │   └── docs.checksum
│   ├── binder.autosave
│   ├── binder.backup
│   ├── search.indexes
│   ├── styles.xml
│   ├── version.txt
│   └── writing.history
├── Scrivener3-s01.scrivx
└── Settings
    ├── recents.txt
    ├── ui-common.xml
    └── ui.ini

Scrivener uses a folder structure to store all the data used in the format. The folder has an extension, .scriv. The format includes some rich text, backups, indexes, version history and more. One unique format within the folder is an XML file with the extension .scrivx. This makes the format proprietary, and it can only be rendered using the Scrivener software.

cat Scrivener3-s01.scrivx | head
<?xml version="1.0" encoding="UTF-8"?>
<ScrivenerProject Template="No" Version="2.0" Identifier="DF5DA7F0-27DB-4815-A050-B4D6F23CABA7" Creator="SCRWIN-3.1.5.1" Device="DESKTOP-JMM4K7M" Modified="2025-03-14 22:15:28 -0600" ModID="B4A944C3-FF79-49F6-A737-158BEB4E58BB">
    <Binder>
        <BinderItem UUID="17807D28-117A-409E-B12D-B34922B6CC6F" Type="DraftFolder" Created="2025-03-14 22:15:17 -0600" Modified="2025-03-14 22:15:17 -0600">
            <Title>Draft</Title>
            <MetaData>
                <IncludeInCompile>Yes</IncludeInCompile>
            </MetaData>
            <Children>
                <BinderItem UUID="921B4A08-54C0-4B69-94FD-428F56FDAB89" Type="Text" Created="2025-03-14 22:15:17 -0600" Modified="2025-03-14 22:15:23 -0600">

The XML has enough to identify these files apart from other XML files. The signature would be straightforward. Earlier versions of Scrivener sometimes have the SCRIVX file, but sometimes have a file with a .scrivproj extension instead. On a Macintosh this file is in a binary plist format, which is different from earlier Windows versions. It seems they may have unified them under version 2 or 3, where versions 1 & 2 for Windows use project version 1 and version 3 uses project version 2.

hexdump -C Scrivener1-s01.scriv/binder.scrivproj | head
00000000  62 70 6c 69 73 74 30 30  d4 00 01 00 02 00 03 00  |bplist00........|
00000010  04 00 05 00 1d 01 d8 01  d9 54 24 74 6f 70 58 24  |.........T$topX$|
00000020  6f 62 6a 65 63 74 73 58  24 76 65 72 73 69 6f 6e  |objectsX$version|
00000030  59 24 61 72 63 68 69 76  65 72 dc 00 06 00 07 00  |Y$archiver......|
00000040  08 00 09 00 0a 00 0b 00  0c 00 0d 00 0e 00 0f 00  |................|
00000050  10 00 11 00 12 00 13 00  14 00 15 00 16 00 17 00  |................|
00000060  18 00 19 00 1a 00 15 00  1b 00 1c 5a 4c 61 62 65  |...........ZLabe|
00000070  6c 54 69 74 6c 65 59 4c  61 62 65 6c 4c 69 73 74  |lTitleYLabelList|
00000080  5e 42 69 6e 64 65 72 43  6f 6e 74 65 6e 74 73 5f  |^BinderContents_|
00000090  10 0f 44 65 66 61 75 6c  74 4c 61 62 65 6c 54 61  |..DefaultLabelTa|

Since the developers of Scrivener decided to make the SCRIV format simply a folder with different content within, something special happens on the MacOS. The Scrivener software registers all the extensions it uses with the MacOS launch services. This process then changes the way the SCRIV folder is displayed in the MacOS Finder. It now appears as a single file and is given a file type. This is called a Document Package format.

By right-clicking on the “file” you can then browse the package contents. There is nothing in the folder itself or hidden in any attributes which causes this to happen; it is all controlled by what extensions have been registered with the launch services database. We can, however, ask the MacOS to give us some extended metadata details about the package, as long as the file is on an Apple filesystem like HFS or APFS.

mdls Scrivener3-s01.scriv 
_kMDItemDisplayNameWithExtensions      = "Scrivener3-s01.scriv"
kMDItemContentCreationDate             = 2025-03-15 04:15:17 +0000
kMDItemContentCreationDate_Ranking     = 2025-03-15 00:00:00 +0000
kMDItemContentModificationDate         = 2025-03-15 04:15:18 +0000
kMDItemContentModificationDate_Ranking = 2025-03-15 00:00:00 +0000
kMDItemContentType                     = "com.literatureandlatte.scrivener3.scriv"
kMDItemContentTypeTree                 = (
    "com.literatureandlatte.scrivener3.scriv",
    "public.directory",
    "public.item",
    "com.apple.package",
    "public.content",
    "public.composite-content"
)
kMDItemDateAdded                       = 2025-03-21 04:38:48 +0000
kMDItemDateAdded_Ranking               = 2025-03-21 00:00:00 +0000
kMDItemDisplayName                     = "Scrivener3-s01.scriv"
kMDItemDocumentIdentifier              = 0
kMDItemFSContentChangeDate             = 2025-03-15 04:15:18 +0000
kMDItemFSCreationDate                  = 2025-03-15 04:15:17 +0000
kMDItemFSCreatorCode                   = ""
kMDItemFSFinderFlags                   = 0
kMDItemFSHasCustomIcon                 = (null)
kMDItemFSInvisible                     = 0
kMDItemFSIsExtensionHidden             = 0
kMDItemFSIsStationery                  = (null)
kMDItemFSLabel                         = 0
kMDItemFSName                          = "Scrivener3-s01.scriv"
kMDItemFSNodeCount                     = 3
kMDItemFSOwnerGroupID                  = 20
kMDItemFSOwnerUserID                   = 501
kMDItemFSSize                          = 31155
kMDItemFSTypeCode                      = ""
kMDItemInterestingDate_Ranking         = 2025-03-15 00:00:00 +0000
kMDItemKind                            = "Scrivener Project"
kMDItemLogicalSize                     = 31155
kMDItemPhysicalSize                    = 69632

There are a lot of additional details available using the mdls command, including the content type of “com.apple.package“. This tool works with any file in MacOS and can be very useful for gathering the information you may need for preservation.

Until the tools we use for format identification can recognize package formats, tools like this may be needed to gather the necessary metadata for preservation. But in the meantime, identification of the package content is the best we can hope for. Creating a signature for the XML based SCRIVX format is the first step.

Stay tuned for more on the package format, as I will be bringing it up more in the Digital Preservation community.
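Checking for a package can be scripted by parsing the mdls output. The sketch below pulls the content type tree out of text like the listing above and looks for “com.apple.package”. The helper names are hypothetical, and `run_mdls` only works on a Mac where mdls and Spotlight metadata are available.

```python
import re
import subprocess
from typing import List


def parse_content_type_tree(mdls_output: str) -> List[str]:
    """Pull the quoted type names out of the kMDItemContentTypeTree = ( ... ) block."""
    match = re.search(r"kMDItemContentTypeTree\s*=\s*\((.*?)\)", mdls_output, re.DOTALL)
    if match is None:
        return []
    return re.findall(r'"([^"]+)"', match.group(1))


def is_package(mdls_output: str) -> bool:
    """True if the content type tree includes com.apple.package."""
    return "com.apple.package" in parse_content_type_tree(mdls_output)


def run_mdls(path: str) -> str:
    """Ask mdls for just the content type tree of one item (macOS only)."""
    result = subprocess.run(
        ["mdls", "-name", "kMDItemContentTypeTree", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a Mac, `is_package(run_mdls("Scrivener3-s01.scriv"))` would be the typical call. Whether it returns True depends on Scrivener being registered on that machine, for the reasons described above.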

HFE

September 27, 2024 by Thor

Last week I had the pleasure of attending the 20th annual iPres conference on Digital Preservation in Ghent, Belgium. I enjoyed hearing from many of my respected colleagues on many aspects of preservation, including one of my favorite topics, floppy disks. There were tutorials, lightning talks, and even a workshop, presented by Leontien Talboom, Elizabeth Kata, Chris Knowles, and myself. We titled the workshop “A Guide to Imaging Obscure Floppy Disk Formats“. The workshop was conceived from a mutual interest in imaging Wang 5.25in word processor disks, but expanded to include imaging of Amstrad 3in disks, 240K Brother Typewriter Disks, and Macintosh 400/800k disks.

I brought my hand soldered FluxEngine board and others brought their Greaseweazle board to show off how imaging obscure and uncommon disks can be done on a budget.

Image taken during workshop on a Mavica FD200 Floppy Disk Camera.

During the conference we talked a bit about the different type of hardware that can be used and the difference between a disk image and flux image. There seems to be quite the exhaustive list of different types of file formats, some specific to a platform and others more generic. I recently did a blog post on the formats used by the Applesauce software, which have some unique features.

There are many disk image types which should be researched and added to PRONOM and other format description sites, but today lets take a look at a generic format used by many tools.

The HxC Floppy Emulator file format which the extension HFE is a popular format used with floppy drive emulators. There is a lot of complexity with what is included in many of these image formats, some are simply a raw sector representation of the binary data on a disk, others contain the complete flux readings from a floppy disk. The HFE format contains a little more than a raw image, including a header, a track lookup table, and the bitstreams for each track all with the purpose of emulating the physical media. The HFE format contains only a single pass over the data, where other formats may contain multiple reading of each track to get more complete data which can be helpful for damaged or purposely copy-protected disks. You can read more on Ashley’s blog, Library of Congress format description.

When using the HxC Floppy Emulator software, you can open and save to many different formats, the main one being their native HFE format. It comes in 5 versions.

hexdump -C test01.hfe | head
00000000  48 58 43 50 49 43 46 45  00 53 02 00 e8 01 00 00  |HXCPICFE.S......|
00000010  07 01 01 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

Above is a hexdump of the main SDCard HxC Floppy Emulator file format. The format specification shows the 8 byte header “HXCPICFE”. This is a distinctive pattern and should be all we need to make a robust signature for the format, but we do need to take into account the other HFE “versions” and see if they might clash or need to be identified separately.

hexdump -C test02-a2.hfe | head 
00000000  48 58 43 50 49 43 46 45  00 53 02 00 d0 03 00 00  |HXCPICFE.S......|
00000010  07 01 01 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

The “A2” version of the format has the same header but some different bytes further into the file.

hexdump -C test03-rev2.hfe | head
00000000  48 58 43 50 49 43 46 45  01 53 02 00 00 00 00 00  |HXCPICFE.S......|
00000010  07 01 01 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

The “Rev 2” version also has the same header, but if you look at the 9th byte you can see the value changed from 00 to 01. According to the specification, this is the revision byte.

hexdump -C test04-rev3.hfe | head 
00000000  48 58 43 48 46 45 56 33  00 53 02 00 e8 01 00 00  |HXCHFEV3.S......|
00000010  07 01 01 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

With “Rev 3” we see a change in the header with “HXCHFEV3” which appears to be referred to as HFEv3.

hexdump -C test05-stream.hfe | head 
00000000  48 78 43 5f 53 74 72 65  61 6d 5f 49 6d 61 67 65  |HxC_Stream_Image|
00000010  00 00 00 00 00 00 00 00  00 18 00 00 00 02 00 00  |................|
00000020  00 1a 00 00 53 00 00 00  02 00 00 00 40 9c 00 00  |....S.......@...|
00000030  07 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

This last format seems to be a special HxC stream image.

It seems the best option is to make three signatures to identify the three main headers. Additional software can be used to further parse the disk image. If you would like to see some sample images, you can download a bunch here. You can also take a look at my GitHub repository to see additional samples and a proposed set of signatures.
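As a rough illustration of the three-signature idea, here is a minimal Python sketch that sorts a file into one of the three header families seen in the dumps above. The function name `identify_hfe` and the returned labels are made up for this example; the only facts it relies on are the three ASCII headers and the revision byte at offset 8 shown in the hexdumps.

```python
from typing import Optional, Tuple

def identify_hfe(header: bytes) -> Tuple[str, Optional[int]]:
    """Classify an HFE-family image from its first 16 bytes.

    Returns (family, revision). The revision is only reported for the
    HXCPICFE family, where offset 8 holds the revision byte.
    """
    if header.startswith(b"HXCPICFE"):
        revision = header[8] if len(header) > 8 else None
        return ("HXCPICFE", revision)
    if header.startswith(b"HXCHFEV3"):
        return ("HXCHFEV3", None)
    if header.startswith(b"HxC_Stream_Image"):
        return ("HxC_Stream_Image", None)
    return ("unknown", None)

# Usage: read just the start of a file and classify it.
# with open("test01.hfe", "rb") as f:
#     print(identify_hfe(f.read(16)))
```

A check like this only identifies the container; telling the “A2” variant apart from the main format would need fields further into the header, which is where additional parsing software comes in.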

A2R / MOOF / WOZ

August 16, 2024 by Thor 2 Comments

There seems to be a never-ending, growing list of disk image formats. Many have features which are specific to the media and format. If you have ever imaged an older Macintosh floppy you know they are special. Add in the copy-protection which many early Apple II floppies have, and you need special drives, hardware, and a special format to store the floppy data.

When imaging special media, especially unique media, it is best practice to image the floppies at the magnetic flux level.

Floppy disks contain magnetic fluctuations which are measured and recorded using specialized equipment. A popular method is using a Kryoflux board, floppy drive, and software. The software communicates with a custom controller board connected to a floppy drive through USB. If you are interested in the different controller boards, a good list has been compiled here.

A Kryoflux, FluxEngine, or Greaseweazle can all image specialized disks like a Macintosh 800k floppy, but the best controller board for them is an Applesauce setup. It is specifically designed for the task. With that task come a few specialty formats.

A file format which can store flux data is a bit different than a regular disk image format. The flux data contains all the low-level recordings which can then be interpreted into disk images much like the original floppy. In the case of an Applesauce flux image, it can contain all the small nuances of the original floppy, this includes recording any copy protection or other creative methods used by software vendors throughout the years. The format used for storing this flux data is the A2R format.

A2R is in its third iteration. Let’s take a look at the basics of the format.

hexdump -C Samplev3.a2r | head
00000000  41 32 52 33 ff 0a 0d 0a  49 4e 46 4f 25 00 00 00  |A2R3....INFO%...|
00000010  01 41 70 70 6c 65 73 61  75 63 65 20 76 31 2e 38  |.Applesauce v1.8|
00000020  38 2e 35 20 20 20 20 20  20 20 20 20 20 20 20 20  |8.5             |
00000030  20 02 01 01 00 52 57 43  50 e9 49 6e 01 01 24 f4  | ....RWCP.In..$.|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 43 01 00  |.............C..|
00000050  00 01 27 3a 25 00 91 d9  00 00 21 20 21 21 21 21  |..':%.....! !!!!|
00000060  1f 21 21 21 21 1f 24 5e  24 1f 21 21 20 21 24 5c  |.!!!!.$^$.!! !$\|
00000070  24 20 21 21 21 1f 24 5c  25 21 21 1f 21 21 23 5b  |$ !!!.$\%!!.!!#[|
00000080  25 20 21 21 21 1f 21 22  23 3f 41 3f 26 3e 43 3f  |% !!!.!"#?A?&>C?|
00000090  43 5f 41 27 3d 61 41 27  3d 61 3f 28 3e 61 3f 26  |C_A'=aA'=a?(>a?&|

hexdump -C Samplev2.a2r | head
00000000  41 32 52 32 ff 0a 0d 0a  49 4e 46 4f 24 00 00 00  |A2R2....INFO$...|
00000010  01 41 70 70 6c 65 73 61  75 63 65 20 76 31 2e 31  |.Applesauce v1.1|
00000020  2e 36 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |.6              |
00000030  20 02 01 01 53 54 52 4d  75 17 5d 01 00 01 e6 da  | ...STRMu.].....|
00000040  00 00 83 a9 12 00 12 1e  11 13 1e 13 1e 13 11 1f  |................|
00000050  21 1f 11 13 1c 14 1e 30  14 20 1e 14 1e 14 1c 14  |!......0. ......|
00000060  1c 13 11 20 21 1f 11 11  0f 13 1e 14 1c 14 2e 21  |... !..........!|
00000070  13 1e 13 1e 14 1e 11 11  20 21 1f 11 11 13 1e 1f  |........ !......|
00000080  13 20 30 21 11 11 0f 13  1e 13 11 30 1f 21 20 13  |. 0!.......0.! .|
00000090  11 30 1f 14 1e 30 14 1e  11 11 11 1e 13 11 1e 14  |.0...0..........|

The A2R format uses a chunk system to store the various pieces of the format. Earlier versions used a STRM Chunk to store all the raw flux data. Version 3 changed to a RWCP Chunk to store all the raw flux data. Applesauce uses a 2-pass imaging process, doing a rapid imaging to determine where on the media surface track data exists and then a second pass that captures longer durations for processing and error correction.
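The chunk layout can be read straight off the two dumps above: an 8 byte header (“A2R2” or “A2R3” followed by FF 0A 0D 0A), then chunks made of a 4 byte ASCII ID and a 4 byte little-endian size. The sketch below walks that structure in Python. It is only a reading of the hexdumps, not a full implementation of the A2R specification, and the name `iter_a2r_chunks` is hypothetical.

```python
import struct
from typing import Iterator, Tuple

A2R_MAGICS = (b"A2R2", b"A2R3")

def iter_a2r_chunks(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (chunk_id, payload) pairs from an A2R file held in memory."""
    if data[:4] not in A2R_MAGICS or data[4:8] != b"\xff\x0a\x0d\x0a":
        raise ValueError("not an A2R v2/v3 file")
    pos = 8  # chunks start right after the 8 byte header
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4].decode("ascii", errors="replace")
        (size,) = struct.unpack_from("<I", data, pos + 4)  # little-endian size
        # A truncated file simply yields a short payload for its last chunk.
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size

# Usage: list the chunk IDs and sizes of a sample.
# with open("Samplev3.a2r", "rb") as f:
#     for chunk_id, payload in iter_a2r_chunks(f.read()):
#         print(chunk_id, len(payload))
```

In the version 3 sample the INFO chunk is 0x25 bytes long, which puts the next chunk ID at offset 0x35, exactly where “RWCP” appears in the dump.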

Once the full raw flux data has been captured, that data can be interpreted as a disk image. The Applesauce software is able to make a regular disk image, a Disk Copy 4.2 file, which is well known and identified in PRONOM as fmt/625, but it can also create a couple of special disk image formats which allow for special nuances on an original disk.

The WOZ Disk Image format is an offshoot of the Applesauce project. Capturing highly accurate bit data is of no use if you don’t have a container to hold the data. The WOZ format was designed to be able to contain every possible Apple ][ disk structure and layout. It can be so accurate that even copy protected software can’t tell that it isn’t an original disk.

The WOZ format has become very popular in the Apple II community and is ideal for emulating all the old games and software titles popular in the early 1980’s. You may have guessed where the name comes from. The Internet Archive has a large collection of WOZ disks in their WOZ-a-Day collection. The file format of a WOZ disk image is also a chunk based format similar to the A2R format, and it has two versions. Let’s take a look.

hexdump -C "WOZ 1.0/Blazing Paddles (Baudville).woz" | head
00000000  57 4f 5a 31 ff 0a 0d 0a  f6 f5 92 d6 49 4e 46 4f  |WOZ1........INFO|
00000010  3c 00 00 00 01 01 00 01  01 41 70 70 6c 65 73 61  |<........Applesa|
00000020  75 63 65 20 76 30 2e 32  36 20 20 20 20 20 20 20  |uce v0.26       |
00000030  20 20 20 20 20 20 20 20  20 00 00 00 00 00 00 00  |         .......|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  54 4d 41 50 a0 00 00 00  00 00 ff 01 01 01 ff 02  |TMAP............|
00000060  02 02 ff 03 03 03 ff 04  04 04 ff 05 05 05 ff 06  |................|
00000070  06 06 ff 07 07 07 ff 08  08 08 ff 09 09 09 ff 0a  |................|
00000080  0a 0a ff 0b 0b 0b ff 0c  0c 0c ff 0d 0d 0d ff 0e  |................|
00000090  0e 0e ff 0f 0f 0f ff 10  10 10 ff 11 11 11 ff 12  |................|

hexdump -C "WOZ 2.0/Blazing Paddles (Baudville).woz" | head
00000000  57 4f 5a 32 ff 0a 0d 0a  21 da c2 c8 49 4e 46 4f  |WOZ2....!...INFO|
00000010  3c 00 00 00 02 01 00 01  01 41 70 70 6c 65 73 61  |<........Applesa|
00000020  75 63 65 20 76 31 2e 31  20 20 20 20 20 20 20 20  |uce v1.1        |
00000030  20 20 20 20 20 20 20 20  20 01 01 20 00 00 00 00  |         .. ....|
00000040  0d 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  54 4d 41 50 a0 00 00 00  00 00 ff 01 01 01 ff 02  |TMAP............|
00000060  02 02 ff 03 03 03 ff 04  04 04 ff 05 05 05 ff 06  |................|
00000070  06 06 ff 07 07 07 ff 08  08 08 ff 09 09 09 ff 0a  |................|
00000080  0a 0a ff 0b 0b 0b ff 0c  0c 0c ff 0d 0d 0d ff 0e  |................|
00000090  0e 0e ff 0f 0f 0f ff 10  10 10 ff 11 11 11 ff 12  |................|

Unlike a common disk image, a WOZ image contains more than the bits on the disk. It contains a mapping of all the tracks and the associated data, which is how it can even contain copy-protection usually only possible with a physical disk. The ‘TMAP’ chunk contains a track map and the ‘TRKS’ chunk contains all the data.
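To make the track map a little more concrete, here is a small Python sketch that groups the TMAP entries from the WOZ dumps above. It assumes, based on my reading of the WOZ specification for 5.25 inch disks, that each byte is one quarter-track position, that its value is an index into the TRKS chunk, and that FF means no track data at that position. The function name `summarize_tmap` is made up for the example.

```python
from typing import Dict, List

def summarize_tmap(tmap: bytes) -> Dict[int, List[float]]:
    """Group quarter-track positions by the TRKS entry they point to."""
    tracks: Dict[int, List[float]] = {}
    for position, index in enumerate(tmap):
        if index == 0xFF:  # assumed marker for an empty quarter track
            continue
        # Position 4 is track 1.00, position 5 is track 1.25, and so on.
        tracks.setdefault(index, []).append(position / 4)
    return tracks

# The first eleven TMAP bytes from the Blazing Paddles dumps (offset 0x58).
sample = bytes.fromhex("0000ff010101ff020202ff")
print(summarize_tmap(sample))
```

Under those assumptions, TRKS entry 1 covers positions 0.75, 1.0, and 1.25, so a drive head that sits slightly off track 1 still reads the same data, much like it would on a physical disk.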

What the WOZ is for the Apple II, MOOF was made for the Macintosh. You may wonder what is with the funny name, but there is a long history around “Clarus the Dogcow”. I’m sure this factoid will help you impress your friends or win at trivia night. Again, the purpose of the special format for Macintosh disks is to allow for emulating disks, even with copy protection. You can also find quite the collection of old Macintosh software in the MOOF format on the Internet Archive, even emulate your favorite game, such as Dark Castle, which I played for hours as a kid. Also a chunk based format, let’s take a look at the header.

hexdump -C "Dark Castle v1.0 - Disk 1.moof" | head
00000000  4d 4f 4f 46 ff 0a 0d 0a  b5 75 f9 4e 49 4e 46 4f  |MOOF.....u.NINFO|
00000010  3c 00 00 00 01 01 00 01  10 41 70 70 6c 65 73 61  |<........Applesa|
00000020  75 63 65 20 76 31 2e 37  33 20 20 20 20 20 20 20  |uce v1.73       |
00000030  20 20 20 20 20 20 20 20  20 00 13 00 00 00 00 00  |         .......|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  54 4d 41 50 a0 00 00 00  00 ff 01 ff 02 ff 03 ff  |TMAP............|
00000060  04 ff 05 ff 06 ff 07 ff  08 ff 09 ff 0a ff 0b ff  |................|
00000070  0c ff 0d ff 0e ff 0f ff  10 ff 11 ff 12 ff 13 ff  |................|
00000080  14 ff 15 ff 16 ff 17 ff  18 ff 19 ff 1a ff 1b ff  |................|
00000090  1c ff 1d ff 1e ff 1f ff  20 ff 21 ff 22 ff 23 ff  |........ .!.".#.|

All three formats created for imaging and emulating Apple and Macintosh software are well documented and open. They are also well suited for preservation as they can contain extensive metadata in the INFO chunk which gives provenance information on the source of the files. The Applesauce software even has a camera to photograph the disk itself for archiving. All of this makes these formats great for preservation and emulation. Take a look at my proposal for a signature on my Github.

PROmotion

May 3, 2024 by Thor 1 Comment

The 1990’s was an amazing time for multimedia. Compared to what is possible today, the graphics were simpler, but there were many software titles leading the charge in animation. Macromedia Director, along with Flash, dominated the interactive multimedia market for quite some time, eventually being picked up by Adobe and discontinued in 2013. Quite a few multimedia discs out there were built using Director.

Competing with Director, another company had a strong product. Motion Works International was an early pioneer in the multimedia CD-ROM scene. Rumor has it, Motion Works was started by a 12 year old. Motion Works had been making software for use with the highly successful HyperCard software since 1988. In 1992 they released the successor to their ADDmotion software, a path based animation tool called PROmotion.

PROmotion was used with many multimedia titles, some in cooperation with the Corel Home series. In addition to commercial titles, PROmotion was a great tool for the creation of animation clips and other marketing material. I came across some stand-alone marketing files for old scriptwriting software called ScriptWare. When I unarchived the HQX file and installed the demo, I was presented with a set of files with the .MW extension.

ls -l@
total 10232
-rw-r--r--@ 1 tyler  staff  1392 May  1 23:17 Read me first!
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	 452 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 begin_here.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	158901 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 characters.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	387029 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 cinovation.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	189509 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 cut paste.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	608405 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 formats.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	289698 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 modify formats.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	486730 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 notes.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	319250 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 overview.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	376854 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 scene shuffle.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	359746 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 script elements.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	279052 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 sw_menu.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	421836 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 title page.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	236614 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 transitions.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	471462 
-rw-r--r--@ 1 tyler  staff     0 May  1 23:17 try it.MW
	com.apple.FinderInfo	  32 
	com.apple.ResourceFork	622312 

getfileinfo sw_menu.MW 
file: "sw_menu.MW"
type: "APPL"
creator: "AMvw"

Looking at the files in the directory with their extended attributes I can see all the .MW files have no data fork (0 bytes), only a resource fork. This is common for any application on MacOS systems prior to MacOS X. At first the MW extension made me think of MacWrite, but launching one of these MW files brought up an interactive menu. The type is APPL, which is Application.

What I thought would be a demo of the application Scriptware was actually interactive animations demonstrating the software. By dumping the resource fork of one of the MW files I found some information which helped me know what software created these interactive demos.

derez Scriptware\ Demo\ folder/sw_menu.MW

data 'vers' (1) {
	$"0103 8000 0000 0531 2E30 2E33 2941 4D20"            /* ..?....1.0.3)AM  */
	$"5669 6577 6572 2031 2E30 2E33 0DA9 2031"            /* Viewer 1.0.3.? 1 */
	$"3939 3320 4D6F 7469 6F6E 2057 6F72 6B73"            /* 993 Motion Works */
	$"2049 6E74 6C2E"                                     /*  Intl. */
};

data 'vers' (2) {
	$"0103 8000 0000 0531 2E30 2E33 1E50 6C61"            /* ..?....1.0.3.Pla */
	$"7962 6163 6B20 6279 204D 6F74 696F 6E20"            /* yback by Motion  */
	$"576F 726B 7320 496E 746C 2E"                        /* Works Intl. */
};

data 'STR#' (1250, "ADDmotion HC strings") {
	$"000A 1641 4444 6D6F 7469 6F6E 5F65 7870"            /* ...ADDmotion_exp */
	$"6F72 745F 6672 616D 650E 4144 446D 6F74"            /* ort_frame.ADDmot */
	$"696F 6E5F 696E 666F 1141 4444 6D6F 7469"            /* ion_info.ADDmoti */
	$"6F6E 5F73 7573 7065 6E64 1041 4444 6D6F"            /* on_suspend.ADDmo */
	$"7469 6F6E 5F72 6573 756D 650E 4144 446D"            /* tion_resume.ADDm */
	$"6F74 696F 6E5F 7175 6974 0E41 4444 6D6F"            /* otion_quit.ADDmo */
	$"7469 6F6E 5F70 6C61 790E 4144 446D 6F74"            /* tion_play.ADDmot */
	$"696F 6E5F 7374 6F70 0F41 4444 6D6F 7469"            /* ion_stop.ADDmoti */
	$"6F6E 5F70 6175 7365 0000"                           /* on_pause.. */
};

That makes sense: MW stood for “Motion Works”. ADDmotion was another software title developed by Motion Works; most will remember it as an add-on for HyperCard for adding animation to stacks. These MW files are created by exporting a PROmotion project as a stand-alone animation, which includes the “AM Viewer” built in. A regular PROmotion file, however, did not include a viewer and requires the software in order to open and run.

-rwx------@ 1 tyler  staff      0 Apr 25 15:51 Example Animation
	com.apple.FinderInfo	   32 
	com.apple.ResourceFork	495272

The PROmotion file format also is Resource Fork only, making them difficult to manage outside of a Macintosh.

getfileinfo Example\ Animation
file: "Example Animation"
type: "ADDm"
creator: "ADDm"

The files do have a Type/Creator code of “ADDm”, but with no data fork, identification through standard means is not possible. They also do not have the “vers” string to help identify them within the Resource Fork. Since standard methods of identification are impossible, I hope in the future there will be more tools available to read the Type/Creator codes while on the Mac, or in a disk image, or within a container and return back the Software which created the file and the file type.

The products from Motion Works were significantly cheaper than animation tools such as Director, but were still pretty powerful for their day. I was surprised when I found the company didn’t last much longer than 1998 before disappearing. There are probably many stories like PROmotion, coming onto the scene with new and exciting features previously thought impossible, only to die out as other tools dominate the market.

If you are interested in looking at the files yourself, here is a link to some original files, and the same files encoded in MacBinary.

Writing Center

March 8, 2024 by Thor Leave a comment

In honor of #Marchintosh, I threatened in an earlier post to discuss The Writing Center, one of the many writing programs marketed by the Learning Company for the Mac. This one was developed by Datapak Software, Inc and I think they wanted to watch the world burn.

This format was different enough from the Student Writing Center and the “Ultimate Writing & Creativity Center” to need its own post. Moreover, I am pretty sure the developers of this software were actively trying to frustrate anyone trying to document the format. Let me explain.

In the early Macintosh world, extensions were very rarely used. Current systems use extensions to link the file to an application which can open the file. On the Mac, the system would use special attributes called Type / Creator codes. These codes were registered with Apple so they would be unique to a specific software and type of file. The codes used the FourCC system and unfortunately Apple never released a full list of codes used. Some folks over the years have tried to document as many as they can. Many used simple understandable codes; for example, a Microsoft Word document has a Type / Creator of W6BN / MSWD. The creator code of MSWD is very readable, and the type code W6BN is unique to a document from version 6 of Microsoft Word.

This Sample Report file from The Writing Center, when investigated with the ResEdit tool, shows interesting Type / Creator codes. Let’s look at the hexadecimal values for the codes: the first four bytes are the Type code and the second set of 4 bytes are the Creator code.

xattr -p com.apple.FinderInfo "Sample Report" 
0000   0A 57 50 31 0A 1A 57 50 01 00 00 00 00 00 00 00    .WP1..WP........

getfileinfo "Sample Report" 
file: "Sample Report"
type: "\nWP1"
creator: "\n\^ZWP"
attributes: avbstclInmedz
created: 10/13/1990 00:10:54
modified: 07/25/1991 11:58:20

The first thing to know is the encoding for all Type / Creator codes is MacRoman, so if we look up the hexadecimal code for “0A” we learn it is the Line Feed character. Why in the world would you use the line feed character? The developers must have had a sense of humor, or are psychopaths, and I’m leaning toward the latter. Trying to put this character into any sort of spreadsheet or text based document with other codes throws everything off! When I try to use a spreadsheet with a group of codes and then use a script to look them up on the command line I get crazy formatting. Not to mention the second character in the creator code is “1A”, which is the substitute character.
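One way to keep codes like these from wrecking a spreadsheet is to escape anything non-printable before writing it out. The Python sketch below takes the 32 byte FinderInfo value shown above, decodes each byte as MacRoman (Python’s `mac_roman` codec), and replaces control characters with a `\xNN` escape. The function name `finder_codes` and the escape style are arbitrary choices for this example, not an established convention.

```python
from typing import Tuple

def finder_codes(finder_info: bytes) -> Tuple[str, str]:
    """Return (type, creator) from a FinderInfo blob, escaped for plain text."""
    def escape(code: bytes) -> str:
        parts = []
        for byte in code:
            ch = bytes([byte]).decode("mac_roman")
            # Keep printable characters, escape everything else by byte value.
            parts.append(ch if ch.isprintable() else f"\\x{byte:02x}")
        return "".join(parts)
    # The first four bytes are the Type code, the next four the Creator code.
    return escape(finder_info[0:4]), escape(finder_info[4:8])

# FinderInfo bytes from the "Sample Report" xattr output above.
info = bytes.fromhex("0A5750310A1A5750" + "01" + "00" * 23)
print(finder_codes(info))
```

For the sample file this yields `\x0aWP1` and `\x0a\x1aWP` as plain visible text, which survives a CSV or spreadsheet cell without breaking the row.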

This is just one example of crazy characters being used in Type / Creator codes. Stay tuned for more on these in future discussions.

Even though the Type / Creator codes are very useful in identification of this format, oftentimes the Finder attribute is lost. This can happen if the file is moved off an HFS disk, usually across a network or through the internet. Then all we have is the binary data fork and a file with no extension. So finding a signature to identify this format is useful.

hexdump -C "Sample Report" | head
00000000  00 12 cf fc 00 00 05 78  00 00 00 00 01 18 01 eb  |.......x........|
00000010  ff ff ff c4 ff ff ff c4  00 00 02 82 00 00 02 28  |...............(|
00000020  00 00 00 00 00 00 00 00  00 00 05 76 00 00 00 30  |...........v...0|
00000030  00 00 02 70 00 aa 00 00  05 76 00 00 00 30 00 00  |...p.....v...0..|
00000040  02 70 00 aa 00 00 00 00  00 00 00 00 00 00 00 00  |.p..............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 12  |................|
00000070  d1 2c 00 00 05 3f 00 00  00 00 01 00 06 47 65 6e  |.,...?.......Gen|
00000080  65 76 61 00 00 00 00 00  00 00 00 00 00 00 00 00  |eva.............|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 0c  |................|

hexdump -C WC-s01 | head        
00000000  03 df cd 9c 00 00 00 09  00 00 00 00 02 c3 02 64  |...............d|
00000010  00 00 00 00 00 00 00 00  00 00 00 59 00 00 02 64  |...........Y...d|
00000020  00 00 00 00 00 00 00 00  00 00 00 07 00 00 00 00  |................|
00000030  00 00 00 00 00 79 00 00  00 07 00 00 00 00 00 00  |.....y..........|
00000040  00 00 00 79 00 00 00 00  00 00 00 00 00 00 00 00  |...y............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 df  |................|
00000070  cd 78 00 00 00 00 00 00  00 00 01 00 06 47 65 6e  |.x...........Gen|
00000080  65 76 61 00 00 00 00 00  00 00 00 00 00 00 00 00  |eva.............|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 0c  |................|

Looking at the hexadecimal values of the header of a couple samples doesn’t initially look promising. The first few bytes are very different, meaning there are no magic bytes at the beginning of the file. In fact the only thing the same is the mention of the Geneva font used in the document. Let’s look further into the files.

hexdump -C "Sample Report"       
00000000  00 12 cf fc 00 00 05 78  00 00 00 00 01 18 01 eb  |.......x........|
...
000000b0  00 00 00 00 00 00 00 02  84 28 ff ff 00 00 00 00  |.........(......|
000000c0  00 17 4e 26 00 12 d2 fc  00 00 00 00 00 12 d0 88  |..N&............|

hexdump -C WC-s01        
00000000  03 df cd 9c 00 00 00 09  00 00 00 00 02 c3 02 64  |...............d|
...
000000b0  00 00 00 00 00 00 00 02  84 28 ff ff 00 00 00 00  |.........(......|
000000c0  03 e3 a5 70 03 df cd 8c  00 00 00 00 03 df cd 64  |...p...........d|

hexdump -C Stationery 
00000000  00 12 d2 e8 00 00 00 02  00 00 00 00 01 17 01 ec  |................|
...
000000b0  00 00 00 00 00 00 00 02  84 20 ff ff 00 00 00 00  |......... ......|
000000c0  00 17 56 f8 00 12 cd f8  00 00 00 00 00 12 ce 40  |..V............@|

The only bytes I could find near the beginning that seemed semi-consistent are the highlighted bytes above. I did however notice some consistent bytes at the end of each of the files.

hexdump -C "Sample Report" | tail                                                      
00007250  e5 00 02 e5 00 02 e5 00  02 e5 00 02 e5 00 02 e5  |................|
00007260  00 02 e5 00 02 e5 00 02  e5 00 02 e5 00 ff 00 07  |................|
00007270  00 00 00 05 04 31 2e 30  30 00 09 00 00 00 05 04  |.....1.00.......|
00007280  31 2e 30 30 00 08 00 00  00 05 04 31 2e 30 30 00  |1.00.......1.00.|
00007290  0a 00 00 00 05 04 31 2e  30 30 00 0b 00 00 00 02  |......1.00......|
000072a0  00 00 00 0c 00 00 00 10  00 00 00 00 00 00 00 00  |................|
000072b0  00 00 00 01 00 00 00 01  00 11 00 00 00 08 00 2b  |...............+|
000072c0  00 03 01 52 01 fd 00 13  00 00 00 02 00 00 7f ff  |...R............|
000072d0  00 00 00 00 00 00 72 dc  7f ff ff ff              |......r.....|

hexdump -C WC-s01 | tail                                                              
000003c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000003d0  01 00 00 80 0c 00 08 00  05 00 00 00 00 01 d2 03  |................|
000003e0  ee dc 3e 00 00 00 00 00  07 00 00 00 01 00 00 09  |..>.............|
000003f0  00 00 00 01 00 00 08 00  00 00 01 00 00 0a 00 00  |................|
00000400  00 01 00 00 0b 00 00 00  02 00 00 00 0c 00 00 00  |................|
00000410  10 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000420  01 00 11 00 00 00 08 00  2b 00 c7 02 fd 03 3a 00  |........+.....:.|
00000430  13 00 00 00 02 00 00 7f  ff 00 00 00 00 00 00 04  |................|
00000440  45 7f ff ff ff                                    |E....|

hexdump -C Stationery | tail
000039a0  00 02 e3 00 02 e3 00 02  e3 00 02 e3 00 02 e3 00  |................|
000039b0  02 e3 00 02 e3 00 02 e3  00 02 e3 00 02 e3 00 ff  |................|
000039c0  00 07 00 00 00 05 04 31  2e 30 30 00 09 00 00 00  |.......1.00.....|
000039d0  05 04 31 2e 30 30 00 08  00 00 00 05 04 31 2e 30  |..1.00.......1.0|
000039e0  30 00 0a 00 00 00 05 04  31 2e 30 30 00 0b 00 00  |0.......1.00....|
000039f0  00 02 00 00 00 0c 00 00  00 10 00 00 00 00 00 00  |................|
00003a00  00 00 00 00 00 01 00 00  00 01 00 11 00 00 00 08  |................|
00003a10  00 2b 00 03 01 51 01 fe  00 13 00 00 00 02 00 00  |.+...Q..........|
00003a20  7f ff 00 00 00 00 00 00  3a 2e 7f ff ff ff        |........:.....|

The four bytes at the end of each file by themselves would not be a good signature as there are many formats which end with a few “FF” sequences. But maybe combined with bytes near the beginning, a signature might be found. I added a couple samples to my Github page if you would like to take a look. In order to retain the extended attributes, I encoded the files as MacBinary.

lsar -L "Sample Report.bin"
Sample Report.bin: MacBinary
Sample Report: 
  Name:                    Sample Report
  Size:                    29.4 KB (29,404 bytes)
  Compressed size:         29.4 KB (29,440 bytes)
  Last modified:           Thursday, July 25, 1991 at 12:58:20 PM
  Created:                 Saturday, October 13, 1990 at 1:10:54 AM
  Mac OS type code:        ?WP1 (0x0a575031)
  Mac OS creator code:     ??WP (0x0a1a5750)
  Mac OS Finder flags:     0x0100
  Index in file:           0
  Length of embedded data: 29404
  Start of embedded data:  128
  Original archive entry:  Is an embedded MacBinary file: Yes
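Pulling the observations about the data fork together, a combined check might look like the Python sketch below. Several things here are assumptions rather than established facts about the format: the highlighting in the dumps did not survive, so the code guesses that the semi-consistent bytes are the ones at offset 0xB8 (84, then 28 or 20, then FF FF); and it relies on a pattern visible in all three tail dumps, where the four bytes before the final 7F FF FF FF match the file length (0x72DC, 0x445, and 0x3A2E). Three samples are not much to go on, and the name `looks_like_writing_center` is hypothetical.

```python
def looks_like_writing_center(data: bytes) -> bool:
    """Heuristic test for a Writing Center data fork, based on three samples."""
    if len(data) < 0xC0:
        return False
    # Assumed "semi-consistent" bytes near the start of the file.
    header_ok = (
        data[0xB8] == 0x84
        and data[0xB9] in (0x20, 0x28)
        and data[0xBA:0xBC] == b"\xff\xff"
    )
    # Trailer seen at the end of every sample.
    trailer_ok = data.endswith(b"\x7f\xff\xff\xff")
    # Observed: big-endian file length stored just before the trailer.
    length_ok = int.from_bytes(data[-8:-4], "big") == len(data)
    return header_ok and trailer_ok and length_ok

# Usage:
# with open("Sample Report", "rb") as f:
#     print(looks_like_writing_center(f.read()))
```

Requiring all three conditions together is deliberately strict, since none of them would be convincing alone; more samples would show whether the byte at 0xB9 takes other values.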

Compact Pro

February 16, 2024 by Thor 2 Comments

In the Classic Macintosh world back in the day it was important to use compression tools to keep files small and also allow you to send Macintosh files through the internet. Floppy disks could only hold a small amount of data so utilizing compression was a way to use the space effectively. I have already made posts on BINHEX and DiskDoubler which were also used for similar purposes. The most popular compression software for Macintosh was StuffIt, which used .SIT and .SEA extensions. One of the other often used tools was called Compact Pro.

Compact Pro, originally known as Compactor, was developed by Bill Goodman in the early 1990’s and was quite popular. It was generally faster at compressing and decompressing files on the Macintosh. By 1995 the last version was released and by 2002 the software was officially discontinued.

Also, Macintosh files often contain a Resource Fork to go along with the data. A Compact Pro archive could contain both forks along with the creation and modification dates and the Finder Type/Creator codes. An archive could then be transferred through the internet or on a non-Macintosh file system without losing these key bits of information.

You can see from the image below, the compression of a PICT file retained the resource fork and finder data with an impressive 60% savings in size.

Compact Pro could also segment an archive into multiple parts. This was advantageous when needing to copy a larger file on to a set of floppy disks, or for transferring smaller files through the internet to be combined later. Segments would be extracted by opening the final segment.

The other nifty feature of Compact Pro is that it could create a Self-Extracting Archive. Archiving as an SEA would compress the file into an archive, but contained within an application which could extract the archive without the use of the full Compact Pro application. This was used mainly on distributed Macintosh file system disks, as the application could only be run on a Mac OS system.

Let’s look at the actual Compact Pro file format.

hexdump -C CompactProTest.cpt | head
00000000  01 01 6f 07 00 00 00 cb  80 35 04 56 00 60 50 50  |..o......5.V.`PP|
00000010  00 50 50 00 60 05 60 50  00 00 00 00 00 00 00 00  |.PP.`.`P........|
00000020  00 00 60 00 00 00 00 00  00 00 00 00 00 00 00 00  |..`.............|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 30  |...............0|
00000040  00 00 04 60 00 05 00 06  00 55 40 00 00 00 00 00  |...`.....U@.....|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 60 00 00 00  |............`...|
00000070  00 00 00 00 00 40 00 00  00 00 00 00 00 00 00 00  |.....@..........|
00000080  00 00 00 00 00 00 00 00  05 08 00 01 20 00 00 00  |............ ...|
00000090  00 20 01 10 88 c1 04 f6  05 41 3e 47 56 e4 09 5f  |. .......A>GV.._|

hexdump -C CP-s01.cpt | head    
00000000  01 01 90 69 00 00 10 55  80 46 78 67 77 67 78 67  |...i...U.Fxgwgxg|
00000010  86 88 09 89 9a 70 8b 90  ba 97 0a a7 90 87 a6 bb  |.....p..........|
00000020  90 8a a0 90 ab b7 aa a0  a0 80 a8 a0 98 89 00 9a  |................|
00000030  99 80 98 99 69 a9 60 0a  79 ab 86 0a b7 98 a7 90  |....i.`.y.......|
00000040  98 a0 97 7a 90 00 09 00  07 77 80 00 aa 9b 00 ba  |...z.....w......|
00000050  99 a0 90 00 08 08 a0 8a  08 a0 00 00 b9 b0 09 7a  |...............z|
00000060  08 0a aa 90 0a aa 00 00  98 60 90 b9 9b 9a 9a 57  |.........`.....W|
00000070  a8 88 bb aa aa 00 00 77  89 7a 09 b9 89 79 9b 78  |.......w.z...y.x|
00000080  86 80 8a 96 65 55 56 66  65 17 00 02 24 35 46 47  |....eUVfe...$5FG|
00000090  57 67 67 78 88 8a 70 80  80 90 00 a0 90 a0 00 00  |Wggx..p.........|

The file format is not recognized by PRONOM, and as you can see from the headers above, identification is not easy as there are no magic bytes. The Unarchiver, however, identifies them as Compact Pro.

lsar CP-s01.cpt 
CP-s01.cpt: Compact Pro
CP.PICT

The only bytes which seem to be consistent are the first two, but “01 01” is not a signature which is unique to Compact Pro. From what I can tell, The Unarchiver uses a more complicated calculation involving the file size and the CRC for identification.

hexdump -C CP-s01.sea | head
00000000  01 01 8a 89 00 00 10 55  80 46 78 67 77 67 78 67  |.......U.Fxgwgxg|
00000010  86 88 09 89 9a 70 8b 90  ba 97 0a a7 90 87 a6 bb  |.....p..........|
00000020  90 8a a0 90 ab b7 aa a0  a0 80 a8 a0 98 89 00 9a  |................|
00000030  99 80 98 99 69 a9 60 0a  79 ab 86 0a b7 98 a7 90  |....i.`.y.......|
00000040  98 a0 97 7a 90 00 09 00  07 77 80 00 aa 9b 00 ba  |...z.....w......|
00000050  99 a0 90 00 08 08 a0 8a  08 a0 00 00 b9 b0 09 7a  |...............z|
00000060  08 0a aa 90 0a aa 00 00  98 60 90 b9 9b 9a 9a 57  |.........`.....W|
00000070  a8 88 bb aa aa 00 00 77  89 7a 09 b9 89 79 9b 78  |.......w.z...y.x|
00000080  86 80 8a 96 65 55 56 66  65 17 00 02 24 35 46 47  |....eUVfe...$5FG|
00000090  57 67 67 78 88 8a 70 80  80 90 00 a0 90 a0 00 00  |Wggx..p.........|

The self-extracting archive has the same basic structure. I have also noticed that on all the archive samples I have, the byte at offset 8 is always “80”. This could be significant.

Another thing to note, when looking at a segmented archive, the first two bytes are in sequence, 0101 for the first, 0102 for the second and so on.
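The observations above can be turned into a rough pre-filter. The sketch below is only a heuristic and is not The Unarchiver's actual logic. It assumes a layout I have seen described for Compact Pro but cannot confirm from any official specification: a marker byte of 01, a volume (segment) number, two bytes I do not interpret here, and then a four-byte big-endian offset to the archive directory. The function name and the choice of checks are my own.

```python
import struct


def looks_like_compact_pro(header: bytes, file_size=None) -> bool:
    """Weak heuristic for a Compact Pro archive; expect false positives."""
    if len(header) < 8:
        return False

    marker, volume = header[0], header[1]
    # Observed in the samples: first byte 01, second byte counts segments from 1.
    if marker != 0x01 or volume == 0:
        return False

    # ASSUMPTION: bytes 4-7 hold a big-endian offset to the archive directory.
    (dir_offset,) = struct.unpack(">I", header[4:8])
    if dir_offset < 8:
        return False

    # Optional sanity check. It may not hold for multi-segment archives,
    # so it only runs when the caller supplies a file size.
    if file_size is not None and dir_offset >= file_size:
        return False
    return True
```

Fed the first eight bytes of CompactProTest.cpt from the dump above (01 01 6f 07 00 00 00 cb), the function accepts the header, while the “BE DE AD” header from the Now Contact file is rejected on the first byte. A check this loose would still need something stronger, such as the CRC, before it could be called identification.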

Compact Pro could use some further investigation. You can find quite a few samples on sites such as: https://websites.umich.edu/~archive/mac

For now, it would be good to add the CPT extension to PRONOM with the name CompactPro Archive.

Apple Mail

October 27, 2023 by Thor

There really is no “Macintosh Format”, but there sure are a lot of formats you only find on the MacOS. From Resource Forks and iWork formats to unique sound formats, MacOS has them all! The majority of cross-platform software vendors have done a much better job in recent years of making their file formats the same across platforms, but Apple loves to make things unique, just for their platform.

Take EMLX for example. There seems to be a trend of adding “X” to the end of an older format to breathe new life into it. The EML format, or Electronic Mail, has existed for a few decades now, but in 2005 Apple updated their Apple Mail application to use a new format, EMLX.

As far as I know, Apple hasn’t released any documentation on the EMLX format, but many folks out there have asked the question and have been able to “reverse engineer” the format. Let’s take a look.

An EMLX file consists of three parts:

bytecount on first line;
email content in MIME format (headers, body, attachments);
Apple property list (plist) with metadata.

The bytecount is a variable number which consists of the total bytes starting from the start of the MIME format, including HTML, to the start of the XML property list. Let’s look at a simple EMLX.

The byte count is on line 1 with the MIME email (EML) taking up the 556 bytes, then the XML plist at the end. You may ask, what is a plist? Well, it is another Apple (originally NeXTSTEP) invention which is embedded throughout the MacOS operating system. A Plist is usually an XML file with keys but can also be in a binary format. The Plist can contain properties of the email within Apple Mail like special color flags, tagged as junk, date received and last reviewed.

If you do happen across an EMLX file or group of them, there are a few tools you can use to convert them to a plain old EML. There are Python libraries and many other tools to do the job.
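The three-part structure described above is simple enough to split by hand. Here is a minimal sketch using only the Python standard library. It assumes the byte count on the first line counts exactly the bytes of the MIME message that follows, and that the trailing plist is XML or binary as plistlib understands it. The function name `split_emlx` is my own and not from any existing library.

```python
import plistlib
from email import message_from_bytes


def split_emlx(data: bytes):
    """Split raw EMLX bytes into (email message, plist metadata dict)."""
    # Part 1: the byte count sits alone on the first line.
    first_line, _, rest = data.partition(b"\n")
    count = int(first_line.decode("ascii").strip())

    # Part 2: the MIME message is exactly `count` bytes long.
    message = message_from_bytes(rest[:count])

    # Part 3: whatever follows is the Apple property list.
    plist_bytes = rest[count:].strip()
    metadata = plistlib.loads(plist_bytes) if plist_bytes else {}
    return message, metadata
```

Writing the message bytes back out on their own gives you a plain EML, and the returned dictionary holds whatever keys Apple Mail stored for that message.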

But first we need to be sure of identification beyond the extension. Adding this file format to PRONOM would help in identification for preservation purposes. If run through PRONOM today we get:

filename : '9.emlx'
filesize : 18582
modified : 2023-10-26T22:16:25-06:00
errors   : 
matches  :
  - ns      : 'pronom'
    id      : 'fmt/950'
    format  : 'MIME Email'
    version : '1.0'
    mime    : 'message/rfc822'
    class   : 'Text (Structured)'
    basis   : 'byte match at [[31 17] [599 4] [339 6] [426 6] [90 14]]'
    warning : 'extension mismatch'

Because the format has an EML plain text format within its structure, it is assumed to be an EML file. While technically accurate, identifying it as a unique EMLX format would be beneficial in a preservation system so you can properly assign risk and choose the right tool to parse or migrate.

In looking at the three parts of an EMLX format, we know the EML file is not a good way to show the difference as they are the same structure. The byte count on the first line is variable, so there is no static byte sequence to use for identification. That leaves the Plist section at the end to distinguish the difference.

The PRONOM entry for a Plist looks for the typical XML strings present in most XML files, but then uses the root element “<plist version="1.0">” for identification. We could combine the existing EML signature and the Plist signature to identify an EMLX, or just take the existing EML signature and add a small byte sequence for the closing </plist> tag near the EOF. Either way the new signature would need priority over EML, and both approaches would essentially accomplish the same thing.

Take a look at the latter idea on my GitHub page and tell me which makes the most sense.
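To illustrate the second idea outside of PRONOM, here is a rough Python sketch that looks for the closing </plist> tag near the end of the file. It is not the PRONOM signature itself. The digits-only check on the first line and the 64-byte tail window are my own assumptions, and a real signature would also carry the existing EML byte sequences.

```python
def looks_like_emlx(data: bytes, tail_window: int = 64) -> bool:
    """Heuristic: numeric first line plus a closing plist tag near EOF."""
    first_line, newline, _ = data.partition(b"\n")
    # ASSUMPTION: the first line holds only the byte count (digits, maybe padding).
    if not newline or not first_line.strip().isdigit():
        return False
    # The XML plist should close within the last few bytes of the file.
    return b"</plist>" in data[-tail_window:]
```

A plain EML fails the first check because it starts with a header line rather than a number, which is the priority-over-EML behavior the signature would need.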

No bad deed….

October 13, 2023 by Thor

I had access to my first Macintosh computer around 1987. My father brought it home and I spent hours on it playing games and occasionally writing reports for school. The Macintosh Plus computer had one floppy drive and no hard drive. I remember playing the game Orbiter which had two floppy disks and right in the middle of game play it would pause and ask me to insert disk 2, then quickly ask for disk 1 again. The struggle was real. I spent years using many different Macintosh computers and now own more than I wish to admit. I’m preserving them!

The wild world of digital preservation has been a little lacking on the Macintosh side of things, as I have come to realize. There is still not a great way to manage Resource Forks in many preservation systems, and the identification tools are mainly focused on the data bytestreams and not the system-specific attributes the Macintosh often used.

The PRONOM registry has either referenced early Macintosh-specific formats or missed them entirely, so I have been slowly working on a few to close that gap.

Interestingly enough, many Microsoft programs initially made their GUI debuts on the early Macintosh before making their way to Windows. Excel is one I am working on, as Version 1 is not identifiable in PRONOM; it was Macintosh-only at the time.

Another is PowerPoint; I recently submitted two new signatures to PRONOM.

fmt/1747: Microsoft PowerPoint Presentation v2.x. Full entry added.
fmt/1748: Microsoft PowerPoint Presentation v3.x. Full entry added.
fmt/1866: Microsoft Powerpoint for Macintosh v.2. Full entry added.
fmt/1867: Microsoft Powerpoint for Macintosh v.3. Full entry added.

PowerPoint was initially released in 1987 on the Macintosh platform. It was developed by a company called Forethought. Version 1.0 on the Macintosh was released under this name, until the company was bought by Microsoft only three months after the release. The history of PowerPoint can be found on the website of Robert Gaskins, one of the original developers, and in the book he wrote. The information available from Microsoft is only for the OLE format, covering versions 4.0 until 2003.

So, let’s take a look at the original PowerPoint file format, before OLE.

   Type/Creator      RF      DF  Date         Filename
f  SLDS/PPNT         0       932 Oct 10 19:10 PowerPoint-v1

Luckily the early PowerPoint files did not have a Resource Fork. The Data Fork, if you haven’t noticed, has an interesting set of hex values at the beginning of the file: 0BADDEED is the first 4 bytes. Now let’s look at a PowerPoint version 2 file from Windows.

The file format is the same, but because of the weird world of endianness, the first few bytes are in reverse order, EDDEAD0B.
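The byte-order difference is easy to see in code. The sketch below reads the first four bytes as a 32-bit integer in both byte orders and reports which one yields 0x0BADDEED. The function name and the returned labels are only for illustration, and the platform labels simply reflect the two samples described here.

```python
import struct

MAGIC = 0x0BADDEED


def powerpoint_byte_order(header: bytes):
    """Return the byte order in which the first 4 bytes spell 0x0BADDEED."""
    if len(header) < 4:
        return None
    if struct.unpack(">I", header[:4])[0] == MAGIC:
        return "big-endian (Macintosh sample)"
    if struct.unpack("<I", header[:4])[0] == MAGIC:
        return "little-endian (Windows sample)"
    return None
```

Passing in the bytes 0B AD DE ED gives the big-endian answer, and ED DE AD 0B gives the little-endian one. It is the same number either way, just stored in opposite order.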

Obviously we need to discuss this magic number and the meaning behind “Bad Deed”. This question was asked previously by the digital preservation community. I have a previous blog post about the use of words for the magic number CAFEBEEF as it was used with Java class files and Express Publisher in the 1990’s. BADDEED looks like another clever use of hex values that form words. But was there a story behind the words? Joe Carrano asked if this string might be hexspeak. I wanted to know more, so I asked someone who might know.

Robert Gaskins was kind enough to chat with me for a bit about the early days of PowerPoint.

I had a theory on the possible meaning behind BADDEED, so I asked him what the feeling was like between Apple and Microsoft at the time. I had heard for years that PowerPoint was originally created for the Macintosh, but Robert informed me:

In fact, PowerPoint was designed first for Microsoft Windows, and its first spec shows that: “All the screen shots, menus, and dialogs were set up to look like Microsoft Windows, not like Macintosh.” (Gaskins, Sweating Bullets, p. 92) You can see that spec here.

A year later, we concluded that we would be forced to ship on Mac first, although we still thought that Windows was the big opportunity and thought that Mac was risky. “We just didn’t think we could successfully ship a product for Windows, yet, though we planned to later.” (Gaskins, Sweating Bullets, p. 105) The considerations are summarized in my June 1986 product marketing document.

Of course, we turned out to have been right all along. PowerPoint on Mac was much loved, but sales remained poor because Mac sales were so poor. It was only after we shipped on Windows that PowerPoint gained the dominant market share which has characterized it ever since, and Windows PPT outsold Mac PPT very quickly. (Gaskins, Sweating Bullets, p. 403)

So my original thought was that there were some bad feelings around the Apple and Microsoft battle, which has been the sentiment for quite some time. When I asked if any of that influenced the use of BADDEED, I was told:

So, far from being disgruntled by expanding PowerPoint to Windows, that had been our goal all along, and its achievement was the most important success we had.

I judge that you are fully aware of all that, and that your question is more, “was there any bad deed signified by the Mac hex value chosen?” No, it was just the poverty of choice when you only have six letters.

So there you have it. The hex value 0x0BADDEED was simply chosen from the limited set of words that hexadecimal can spell. I guess I should never let the truth get in the way of a good story.

I continued to have a wonderful conversation with Robert and also asked him for some details on the rest of the PowerPoint file format. I was hoping there might be some documentation out there explaining the early format before Microsoft took over. Robert said:

I don’t know of any such documentation apart from the official Microsoft support files available online. I don’t have any such information. I know that Dennis Austin deposited some of our working files at the Computer History Museum (not online):

https://archive.computerhistory.org/resources/access/text/finding-aids/102733943-Austin/102733943-Austin.pdf

and it’s likely that some information is there–if nothing else, it claims to contain a source code listing for PPT 1.0 which would contain the code to read the file format.

So there might be some information at the Computer History Museum worth looking into.

As far as I could tell from the available online information, there are a few differences between Version 1.0 and Version 2.0, the biggest being that 1.0 did not have an option to print in color, among a few other minor things. Here is a screenshot of a page from the Microsoft PowerPoint 2.0 documentation on archive.org.

I suppose that with the signature additions for the Macintosh and Windows versions 2.0 and 3.0 of the PowerPoint file format in PRONOM, most needs should be covered. Currently my PowerPoint 1.0 files identify as 2.0 files, so I may need to have them adjust the PUID to include both versions 1.0 and 2.0, as they are so similar. If I am able to find a difference or get my hands on the original source code, I may find a better solution.