Writing Center

In honor of #Marchintosh, I threatened in an earlier post to discuss The Writing Center, one of the many writing programs marketed by the Learning Company for the Mac. This one was developed by Datapak Software, Inc and I think they wanted to watch the world burn.

This format was different enough from the Student Writing Center and the “Ultimate Writing & Creativity Center” to need its own post. Moreover, I am pretty sure the developers of this software were actively trying to frustrate anyone trying to document the format. Let me explain.

In the early Macintosh world, very rarely were extensions used. Current systems use extensions to link the file to an application which can open the file. On the Mac, the system would use special attributes called Type / Creator codes. These codes were registered with Apple so they would be unique to a specific software and type of file. The codes used the FourCC system and unfortunately Apple never released a full list of codes used. Some folks over the years have tried to document as many as they can. Many used simple understandable codes, for example, A Microsoft Word document has a Type / Creator of W6BN / MSWD. The creator code of MSWD is very readable, and the type code W6BN is unique to a document from version 6 of Microsoft Word.

This Sample Report file from The Writing Center, when investigated with the ResEdit tool show interesting Type / Creator codes. If we look at the hexadecimals values for the codes. The first four bytes are the Type code and the second set of 4 bytes are the Creator code.

xattr -p com.apple.FinderInfo "Sample Report" 
0000   0A 57 50 31 0A 1A 57 50 01 00 00 00 00 00 00 00    .WP1..WP........

getfileinfo "Sample Report" 
file: "Sample Report"
type: "\nWP1"
creator: "\n\^ZWP"
attributes: avbstclInmedz
created: 10/13/1990 00:10:54
modified: 07/25/1991 11:58:20

The first thing to know is the encoding for all Type / Creator codes is MacRoman, so if we look up the hexadecimal code for “0A” we learn it is the character for a new Line Feed, why in the world would you use the line feed character? The developers must have had a sense of humor, or are psychopaths, and I’m leaning toward the latter. Trying to put this character into any sort of spreadsheet or text based document with other codes throws everything off! When I try and use a spreadsheet with a group of codes and then use a script to look them up on the command line I get crazy formatting. Not to mentioned the second character in the creator code is “1A” which is a substitute character.

This is just one example of crazy characters being used in Type / Creator codes. Stay tuned for more on these in future discussions.

Even though the Type / Creator codes are very useful in identification of this format, often times the Finder attribute is lost. This can happen if the file is moved off an HFS disk, usually a network or through the internet. Then all we have is the binary data fork and a file with no extension. So finding a signature to identify this format is useful.

hexdump -C "Sample Report" | head
00000000  00 12 cf fc 00 00 05 78  00 00 00 00 01 18 01 eb  |.......x........|
00000010  ff ff ff c4 ff ff ff c4  00 00 02 82 00 00 02 28  |...............(|
00000020  00 00 00 00 00 00 00 00  00 00 05 76 00 00 00 30  |...........v...0|
00000030  00 00 02 70 00 aa 00 00  05 76 00 00 00 30 00 00  |...p.....v...0..|
00000040  02 70 00 aa 00 00 00 00  00 00 00 00 00 00 00 00  |.p..............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 12  |................|
00000070  d1 2c 00 00 05 3f 00 00  00 00 01 00 06 47 65 6e  |.,...?.......Gen|
00000080  65 76 61 00 00 00 00 00  00 00 00 00 00 00 00 00  |eva.............|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 0c  |................|

hexdump -C WC-s01 | head        
00000000  03 df cd 9c 00 00 00 09  00 00 00 00 02 c3 02 64  |...............d|
00000010  00 00 00 00 00 00 00 00  00 00 00 59 00 00 02 64  |...........Y...d|
00000020  00 00 00 00 00 00 00 00  00 00 00 07 00 00 00 00  |................|
00000030  00 00 00 00 00 79 00 00  00 07 00 00 00 00 00 00  |.....y..........|
00000040  00 00 00 79 00 00 00 00  00 00 00 00 00 00 00 00  |...y............|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 df  |................|
00000070  cd 78 00 00 00 00 00 00  00 00 01 00 06 47 65 6e  |.x...........Gen|
00000080  65 76 61 00 00 00 00 00  00 00 00 00 00 00 00 00  |eva.............|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 0c  |................|

Looking at the hexadecimal values of the header of a couple samples doesn’t initially look promising, the first few bytes are very different meaning there is no magic bytes at the beginning of the file. In fact the only thing the same is the mention of the Geneva font used in the document. Looking further into the files.

hexdump -C "Sample Report"       
00000000  00 12 cf fc 00 00 05 78  00 00 00 00 01 18 01 eb  |.......x........|
...
000000b0  00 00 00 00 00 00 00 02  84 28 ff ff 00 00 00 00  |.........(......|
000000c0  00 17 4e 26 00 12 d2 fc  00 00 00 00 00 12 d0 88  |..N&............|

hexdump -C WC-s01        
00000000  03 df cd 9c 00 00 00 09  00 00 00 00 02 c3 02 64  |...............d|
...
000000b0  00 00 00 00 00 00 00 02  84 28 ff ff 00 00 00 00  |.........(......|
000000c0  03 e3 a5 70 03 df cd 8c  00 00 00 00 03 df cd 64  |...p...........d|

hexdump -C Stationery 
00000000  00 12 d2 e8 00 00 00 02  00 00 00 00 01 17 01 ec  |................|
...
000000b0  00 00 00 00 00 00 00 02  84 20 ff ff 00 00 00 00  |......... ......|
000000c0  00 17 56 f8 00 12 cd f8  00 00 00 00 00 12 ce 40  |..V............@|

The only bytes I could find near the beginning that seemed semi consistent is the highlighted bytes above. I did however notice some consistent bytes at the end of each of the files.

hexdump -C "Sample Report" | tail                                                      
00007250  e5 00 02 e5 00 02 e5 00  02 e5 00 02 e5 00 02 e5  |................|
00007260  00 02 e5 00 02 e5 00 02  e5 00 02 e5 00 ff 00 07  |................|
00007270  00 00 00 05 04 31 2e 30  30 00 09 00 00 00 05 04  |.....1.00.......|
00007280  31 2e 30 30 00 08 00 00  00 05 04 31 2e 30 30 00  |1.00.......1.00.|
00007290  0a 00 00 00 05 04 31 2e  30 30 00 0b 00 00 00 02  |......1.00......|
000072a0  00 00 00 0c 00 00 00 10  00 00 00 00 00 00 00 00  |................|
000072b0  00 00 00 01 00 00 00 01  00 11 00 00 00 08 00 2b  |...............+|
000072c0  00 03 01 52 01 fd 00 13  00 00 00 02 00 00 7f ff  |...R............|
000072d0  00 00 00 00 00 00 72 dc  7f ff ff ff              |......r.....|

hexdump -C WC-s01 | tail                                                              
000003c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000003d0  01 00 00 80 0c 00 08 00  05 00 00 00 00 01 d2 03  |................|
000003e0  ee dc 3e 00 00 00 00 00  07 00 00 00 01 00 00 09  |..>.............|
000003f0  00 00 00 01 00 00 08 00  00 00 01 00 00 0a 00 00  |................|
00000400  00 01 00 00 0b 00 00 00  02 00 00 00 0c 00 00 00  |................|
00000410  10 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000420  01 00 11 00 00 00 08 00  2b 00 c7 02 fd 03 3a 00  |........+.....:.|
00000430  13 00 00 00 02 00 00 7f  ff 00 00 00 00 00 00 04  |................|
00000440  45 7f ff ff ff                                    |E....|

hexdump -C Stationery | tail
000039a0  00 02 e3 00 02 e3 00 02  e3 00 02 e3 00 02 e3 00  |................|
000039b0  02 e3 00 02 e3 00 02 e3  00 02 e3 00 02 e3 00 ff  |................|
000039c0  00 07 00 00 00 05 04 31  2e 30 30 00 09 00 00 00  |.......1.00.....|
000039d0  05 04 31 2e 30 30 00 08  00 00 00 05 04 31 2e 30  |..1.00.......1.0|
000039e0  30 00 0a 00 00 00 05 04  31 2e 30 30 00 0b 00 00  |0.......1.00....|
000039f0  00 02 00 00 00 0c 00 00  00 10 00 00 00 00 00 00  |................|
00003a00  00 00 00 00 00 01 00 00  00 01 00 11 00 00 00 08  |................|
00003a10  00 2b 00 03 01 51 01 fe  00 13 00 00 00 02 00 00  |.+...Q..........|
00003a20  7f ff 00 00 00 00 00 00  3a 2e 7f ff ff ff        |........:.....|

The four bytes at the end of each file by themselves would not be a good signature as there are many formats which end with a few “FF” sequences. But maybe combined with bytes near the beginning, a signature might be found. I added a couple samples to my Github page if you would like to take a look. In order to retain the extended attributes, I encoded the files as MacBinary.

lsar -L "Sample Report.bin"
Sample Report.bin: MacBinary
Sample Report: 
  Name:                    Sample Report
  Size:                    29.4 KB (29,404 bytes)
  Compressed size:         29.4 KB (29,440 bytes)
  Last modified:           Thursday, July 25, 1991 at 12:58:20 PM
  Created:                 Saturday, October 13, 1990 at 1:10:54 AM
  Mac OS type code:        ?WP1 (0x0a575031)
  Mac OS creator code:     ??WP (0x0a1a5750)
  Mac OS Finder flags:     0x0100
  Index in file:           0
  Length of embedded data: 29404
  Start of embedded data:  128
  Original archive entry:  Is an embedded MacBinary file: Yes