Almost everyone has heard of Microsoft Office, the suite of applications used by millions every day. Fewer people know about Microsoft Works, a lower-cost alternative that was quite popular as a home office suite. One tool that often came with the Works suite was a digital image tool called Picture It!
Picture It! was a photo editing tool first released by Microsoft in 1996, geared toward making photo editing easy and affordable.
Picture It! used a wizard-style interface which walked you through acquiring an image and adding to it. One of the key features of the software was the ability to “stack” objects like layers. Because of this feature, a new file format was used to save this information to disk. Meet the Microsoft Image (Picture) Extension format, commonly known as the MIX file format. It is very similar to the FlashPix image format, which was meant to solve many image delivery issues but never really took hold, despite being created by Kodak, HP, and others. In fact, many of the MIX files I found on Microsoft disks are actually FlashPix files.
The MIX extension was also used by another Microsoft program, PhotoDraw. This causes confusion: the two formats are similar, but PhotoDraw has some added features which may not be compatible with Picture It! Both formats are based on the Microsoft Compound Object (OLE) container and have a similar structure. Let’s take a look at a MIX file from Picture It! version 1.
7z l PictureIt1-s02.mix
--
Path = PictureIt1-s02.mix
Type = Compound
Physical Size = 48128
Extension = compound
Cluster Size = 512
Sector Size = 64

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
                    .....          328          384  [5]Data Object 000001
                    .....          396          448  [5]Transform 000004
                    .....          872          896  [5]Operation 000001
                    .....          320          320  [1]CompObj
                    .....          292          320  [5]Global Info
                    .....          872          896  [5]Operation 000002
                    .....          144          192  [5]Operation 000003
                    .....          684          704  [5]Transform 000008
                    .....         1028         1088  [5]Transform 000009
                    .....          328          384  [5]Data Object 000009
                    .....          324          384  [5]Data Object 000005
2023-12-27 11:04:39 D....                            Data Object Store 000001
                    .....          328          384  [5]Data Object 000010
                    .....        20932        20992  [5]SummaryInformation
                    .....          200          256  [5]Microsoft Embedding Info
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0001
                    .....         1400         1408  Data Object Store 000001/[5]Image Contents
                    .....          230          256  Data Object Store 000001/[1]CompObj
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0000
                    .....           28           64  Data Object Store 000001/Resolution 0000/Subimage 0000 Data
                    .....           80          128  Data Object Store 000001/Resolution 0000/Subimage 0000 Header
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0003
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0002
                    .....           28           64  Data Object Store 000001/Resolution 0002/Subimage 0000 Data
                    .....          208          256  Data Object Store 000001/Resolution 0002/Subimage 0000 Header
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0005
2023-12-27 11:04:39 D....                            Data Object Store 000001/Resolution 0004
                    .....           28           64  Data Object Store 000001/Resolution 0004/Subimage 0000 Data
                    .....         1792         1792  Data Object Store 000001/Resolution 0004/Subimage 0000 Header
                    .....          124          128  Data Object Store 000001/[5]SummaryInformation
                    .....           28           64  Data Object Store 000001/Resolution 0005/Subimage 0000 Data
                    .....         6976         7168  Data Object Store 000001/Resolution 0005/Subimage 0000 Header
                    .....           28           64  Data Object Store 000001/Resolution 0003/Subimage 0000 Data
                    .....          544          576  Data Object Store 000001/Resolution 0003/Subimage 0000 Header
                    .....           28           64  Data Object Store 000001/Resolution 0001/Subimage 0000 Data
                    .....          128          128  Data Object Store 000001/Resolution 0001/Subimage 0000 Header
------------------- ----- ------------ ------------  ------------------------
2023-12-27 11:04:39              38698        39872  29 files, 7 folders
This is a simple MIX file with one line of text, but it contains a lot of content inside the OLE container. If I try to use the PRONOM registry to identify the file, I get:
sf PictureIt1-s02.mix
---
siegfried   : 1.11.0
scandate    : 2023-12-27T11:06:32-07:00
signature   : default.sig
created     : 2023-12-17T15:54:41+01:00
identifiers :
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V116.xml; container-signature-20231127.xml'
---
filename : 'PictureIt1-s02.mix'
filesize : 48128
modified : 2023-12-27T11:04:40-07:00
errors   :
matches  :
  - ns      : 'pronom'
    id      : 'fmt/111'
    format  : 'OLE2 Compound Document Format'
    version :
    mime    :
    class   : 'Text (Structured)'
    basis   : 'byte match at 0, 30'
    warning :
Hmm, we know it is an OLE compound document, but it should identify as a Picture It! file as PRONOM has defined a PUID for the format. fmt/936 has been defined as “Microsoft Picture It! Image File 1”. So I am not sure why this file from version 1 is not identifying correctly. Let’s take a look. The PRONOM container signature for fmt/936 is looking for this:
<ContainerSignature Id="17015" ContainerType="OLE2">
  <Description>Microsoft Picture It! Image File</Description>
  <Files>
    <File>
      <Path>CompObj</Path>
      <BinarySignatures>
        <InternalSignatureCollection>
          <InternalSignature ID="17015">
            <ByteSequence Reference="BOFoffset">
              <SubSequence Position="1" SubSeqMinOffset="32" SubSeqMaxOffset="32">
                <Sequence>'Microsoft Picture It! version 1 Picture'</Sequence>
              </SubSequence>
            </ByteSequence>
          </InternalSignature>
        </InternalSignatureCollection>
      </BinarySignatures>
    </File>
  </Files>
</ContainerSignature>
The container signature looks inside the OLE container for the “CompObj” file (which seems to be required), then looks for the string “Microsoft Picture It! version 1 Picture” starting at byte offset 32 (0x20). That is pretty specific. The CompObj stream in the sample file I am using as an example contains the following bytes:
hexdump -C PictureIt1-s02/\[1\]CompObj
00000000  01 00 fe ff 03 0a 00 00  ff ff ff ff 00 68 61 56  |.............haV|
00000010  54 c1 ce 11 85 53 00 aa  00 a1 f9 5b 1e 00 00 00  |T....S.....[....|
00000020  4d 69 63 72 6f 73 6f 66  74 20 50 69 63 74 75 72  |Microsoft Pictur|
00000030  65 20 49 74 21 20 50 69  63 74 75 72 65 00 27 00  |e It! Picture.'.|
00000040  00 00 7b 35 36 36 31 36  38 30 30 2d 43 31 35 34  |..{56616800-C154|
00000050  2d 31 31 43 45 2d 38 35  35 33 2d 30 30 41 41 30  |-11CE-8553-00AA0|
00000060  30 41 31 46 39 35 42 7d  00 13 00 00 00 50 69 63  |0A1F95B}.....Pic|
00000070  74 75 72 65 49 74 21 2e  50 69 63 74 75 72 65 00  |tureIt!.Picture.|
Ok, so this sample has a similar string but is missing the “version 1” text. It seems the samples used to create the PRONOM signature included “version 1” in the header of CompObj. Maybe when Microsoft learned they would be making a version 2, they decided a version number should be included going forward. Let’s take a look at a file from version 2 to compare:
hexdump -C PictureIt2-s01/\[1\]CompObj
00000000  01 00 fe ff 03 0a 00 00  ff ff ff ff 50 28 72 2d  |............P(r-|
00000010  4b 8c d0 11 a9 6f 00 a0  c9 05 41 0d 28 00 00 00  |K....o....A.(...|
00000020  4d 69 63 72 6f 73 6f 66  74 20 50 69 63 74 75 72  |Microsoft Pictur|
00000030  65 20 49 74 21 20 76 65  72 73 69 6f 6e 20 32 20  |e It! version 2 |
00000040  50 69 63 74 75 72 65 00  27 00 00 00 7b 32 44 37  |Picture.'...{2D7|
00000050  32 32 38 35 30 2d 38 43  34 42 2d 31 31 44 30 2d  |22850-8C4B-11D0-|
00000060  41 39 36 46 2d 30 30 41  30 43 39 30 35 34 31 30  |A96F-00A0C905410|
00000070  44 7d 00 f4 39 b2 71 50  00 00 00 4d 00 69 00 63  |D}..9.qP...M.i.c|
Ok, so it looks like they did update the version string for version 2. This file also does not identify correctly. A quick look at the Wikipedia page for Microsoft Picture It! tells us they continued to release the software until version 10. Is there a different string for each version?
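The mismatch for both samples can be reproduced outside of Siegfried with a small Python sketch. It rebuilds the start of each CompObj stream from the two hexdumps above and applies the same test as the fmt/936 signature: the expected string must begin exactly at offset 32. The layout used here (a 4-byte little-endian length at offset 28, followed by a null-terminated string) is read off the hexdumps rather than taken from a specification, and the names `make_sample`, `user_type`, and `matches_fmt_936` are made up for this illustration.

```python
import struct

# String the fmt/936 container signature expects at offset 32 of CompObj.
FMT_936_STRING = b"Microsoft Picture It! version 1 Picture"

# First 12 bytes, identical in both CompObj hexdumps.
PREFIX = bytes.fromhex("0100feff030a0000ffffffff")


def make_sample(id_hex: str, name: bytes) -> bytes:
    """Rebuild the start of a CompObj stream (layout assumed from the dumps)."""
    text = name + b"\x00"
    return PREFIX + bytes.fromhex(id_hex) + struct.pack("<I", len(text)) + text


def user_type(compobj: bytes) -> str:
    """Read the string at offset 32 using the 4-byte length stored at offset 28."""
    (length,) = struct.unpack_from("<I", compobj, 28)
    return compobj[32:32 + length].rstrip(b"\x00").decode("ascii")


def matches_fmt_936(compobj: bytes) -> bool:
    """Mimic the signature: the expected string must start exactly at offset 32."""
    return compobj.startswith(FMT_936_STRING, 32)


# Bytes 12-27 and the name string, transcribed from the two hexdumps.
V1_SAMPLE = make_sample("0068615654c1ce11855300aa00a1f95b",
                        b"Microsoft Picture It! Picture")
V2_SAMPLE = make_sample("5028722d4b8cd011a96f00a0c905410d",
                        b"Microsoft Picture It! version 2 Picture")

for label, sample in (("version 1", V1_SAMPLE), ("version 2", V2_SAMPLE)):
    print(label, repr(user_type(sample)), matches_fmt_936(sample))
```

Neither sample passes the check: the first lacks the “version 1” text and the second carries “version 2” instead.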
Diving into this and gathering many samples has brought a lot of variants to the surface. Let’s see if we can list all the CompObj header variants.
Version 1 samples:
Picture It! Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Microsoft Picture It! Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Microsoft Picture It! version 1 Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Picture It! Collage'{56616800-C154-11CE-8553-00AA00A1F95B}

Version 2 samples:
Microsoft Picture It! version 2 Picture'{2D722850-8C4B-11D0-A96F-00A0C905410D}

Version 3 samples:
Microsoft Picture It! version 3 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}

Version 4 samples:
Microsoft Picture It! version 4 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}

PhotoDraw version 1 samples:
Microsoft PhotoDraw version 1 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}

PhotoDraw version 2 samples:
Microsoft PhotoDraw version 2 Picture'{18B8D021-B4FD-11D0-A97E-00A0C905410D}

FlashPix samples:
FlashPix Object({56616000-C154-11CE-8553-00AA00A1F95B}
FlashPix Object({56616800-C154-11CE-8553-00AA00A1F95B}
Picture It! FlashPix'{56616700-C154-11CE-8553-00AA00A1F95B}
LPI FlashPix'{56616700-c154-11ce-8553-00aa00a1f95b}
FlashPix_Object'{56616700-C154-11CE-8553-00AA00A1F95B}
'{56616700-C154-11CE-8553-00AA00A1F95B}
Picture It!'{56616700-c154-11ce-8553-00aa00a1f95b}
Flashpix Toolkit Application'{56616700-c154-11ce-0000-000000000000}
Ok, there is a lot to discuss here. First of all, it seems MIX was only used in Picture It! until version 5 (2001), after which the software switched to a new format, PNG Plus, to store the layered stacks. More on that in a future post! Some later versions do still seem to be able to open the older MIX format. Version 4 of the MIX format seems to be the last, as the 2001 software had only version 4 files on it. It is probably safe to say only the four versions are needed for identification.
You may notice the additional unique identifier I included with each string. This is called a Class ID in the OLE format, which A LOT of formats use. Each “format” has a unique ID associated with it to help distinguish it from other formats. This ID could possibly be a better basis for identification. It does overlap with the PhotoDraw format, but the FlashPix format seems to have its own ID. Across all the variations in the version 1 strings, the ID remains the same. For versions 3 and 4 the ID is the same, which could mean they are interchangeable. It is also the same as PhotoDraw version 1, just to complicate things.
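As a hedged illustration of where this ID also shows up: in both CompObj hexdumps earlier in this post, the 16 bytes at offset 12 decode, when read as a little-endian GUID, to the same value as the braced string that follows the name. The Python sketch below does that decoding with the standard `uuid` module. The function name `compobj_clsid` is made up for this example, and treating offset 12 as the ID location is an observation from these two samples, not a claim about every CompObj stream.

```python
import uuid


def compobj_clsid(compobj: bytes) -> str:
    """Decode bytes 12-27 of a CompObj stream as a little-endian GUID."""
    return "{" + str(uuid.UUID(bytes_le=compobj[12:28])).upper() + "}"


# First 28 bytes of each CompObj stream, transcribed from the hexdumps.
V1_HEADER = bytes.fromhex(
    "0100feff030a0000ffffffff0068615654c1ce11855300aa00a1f95b")
V2_HEADER = bytes.fromhex(
    "0100feff030a0000ffffffff5028722d4b8cd011a96f00a0c905410d")

print(compobj_clsid(V1_HEADER))  # {56616800-C154-11CE-8553-00AA00A1F95B}
print(compobj_clsid(V2_HEADER))  # {2D722850-8C4B-11D0-A96F-00A0C905410D}
```

Both results agree with the braced strings in the version 1 and version 2 entries of the variant list.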
So it seems in order to get proper identification of these similar formats we need to:
- Clean up version 1 identification for fmt/936
- Add signatures for versions 2, 3, and 4
- Add a version 2 signature for the PhotoDraw format
- Add some additional signature variations for the FlashPix format.
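In effect, the steps above widen the set of CompObj strings that are accepted for each format. The real fix belongs in the PRONOM container signature file, but the idea can be sketched as a simple lookup in Python. The strings are the variants collected in this post (FlashPix is only partly covered, since it needs more research); the labels and the names `KNOWN_USER_TYPES` and `classify` are made up for this illustration and are not PRONOM format names.

```python
# CompObj name strings collected in this post, mapped to an illustrative label.
KNOWN_USER_TYPES = {
    "Picture It! Picture": "Picture It! 1",
    "Microsoft Picture It! Picture": "Picture It! 1",
    "Microsoft Picture It! version 1 Picture": "Picture It! 1",
    "Picture It! Collage": "Picture It! 1",
    "Microsoft Picture It! version 2 Picture": "Picture It! 2",
    "Microsoft Picture It! version 3 Picture": "Picture It! 3",
    "Microsoft Picture It! version 4 Picture": "Picture It! 4",
    "Microsoft PhotoDraw version 1 Picture": "PhotoDraw 1",
    "Microsoft PhotoDraw version 2 Picture": "PhotoDraw 2",
    "FlashPix Object": "FlashPix",
    "Picture It! FlashPix": "FlashPix",
    "LPI FlashPix": "FlashPix",
    "FlashPix_Object": "FlashPix",
}


def classify(user_type: str) -> str:
    """Return the label for an exact string match, or 'unidentified'."""
    return KNOWN_USER_TYPES.get(user_type, "unidentified")


for name in ("Microsoft Picture It! Picture",
             "Microsoft Picture It! version 2 Picture",
             "Some Other Application"):
    print(name, "->", classify(name))
```

Exact matching keeps the version 1 variants from colliding with the shorter FlashPix string “Picture It!”, which a prefix match would confuse.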
The Class IDs could be used to distinguish different versions and formats, but many of the IDs are identical, which could mean they are the same format. For now, we can just add the additional string variations, which should identify everything. The FlashPix format needs more research, as there are so many variations and it is so close to the MIX format. Take a look at my GitHub submission; maybe you have some additional variations to add?