Working in preservation and archiving for the last few years has caused me to change a habit most people use everyday. The double-click. I am usually opening a file in a hex editor or control clicking on a file to open it in a different software application than is default. Maybe it’s just me, but having control over opening a file is essential. The thought of double-clicking on a file and the uncertainty of what is actually happening scares me a little.
Of course opening an application executable requires a double-click or a right-click/open process and from there you can open the file of your choosing. Executables are run-able files because they have the required pieces for the operating system and cpu to interpret and well; run. We need executables in order to make sense of the files we preserve. Without something to interpret our the data in our files they are just a bunch of one’s & zero’s.
Take a PDF for example. By itself, it is hard to make sense of the file. You need Acrobat Reader, or any number of other executable software programs to open and render the PDF.
But what if you could take a file and wrap it in an executable so it is all self contained, the file format and an executable in one file! No separate software needed! On the surface this seems like a great idea, which is why a few software companies had this as an option. An early competitor of PDF, Common Ground had the option to embed the DP file into a self contained viewer. Many archive software tools have the ability to make “self-extracting” executables as well. One obvious downside is being unable to execute on a different platform or a later operating system. But at the time they were very convenient.
One software in particular added the option to export a few different formats into a special wrapper making them viewable on any Windows machine.
New Soft Technology Corporation Presto! PageManager is document management software which can view many different file types. The software helps manage document and photo scanning and keep everything organized. The software often came bundled with home consumer scanners, such as the UMAX Astra scanner I bought years ago. With the Windows version of the software you can take one or more photos and “wrap” them into a Presto! Wrapper.
Once exported to a Presto! Wrapper the files within have a portable viewer wrapped up with them. One double-click and Presto!, you can view, rotate, export, and print your images. The wrapper has a your typical .EXE extension and identifies as such.
sf Presto6-s02.EXE
---
siegfried : 1.11.0
scandate : 2024-01-09T23:39:36-07:00
signature : default.sig
created : 2023-12-17T15:54:41+01:00
identifiers :
- name : 'pronom'
details : 'DROID_SignatureFile_V116.xml; container-signature-20231127.xml'
---
filename : 'Presto6-s02.EXE'
filesize : 818301
modified : 2024-01-07T23:48:01-07:00
errors :
matches :
- ns : 'pronom'
id : 'fmt/899'
format : 'Windows Portable Executable'
version : '32 bit'
mime : 'application/vnd.microsoft.portable-executable'
class :
basis : 'extension match exe; byte match at [[0 2] [232 94]]'
hexdump -C Presto6-s02.EXE | head
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............|
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 e8 00 00 00 |................|
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 |........!..L.!Th|
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f |is program canno|
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 |t be run in DOS |
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 |mode....$.......|
00000080 99 72 8f bf dd 13 e1 ec dd 13 e1 ec dd 13 e1 ec |.r..............|
00000090 5e 0f ef ec dc 13 e1 ec b2 0c eb ec d6 13 e1 ec |^...............|
The preservation of executables is, in my opinion, complicated. Running a 32 bit executable on a computer today might not even work. Then we have to get into the license of using the software and wether the license allows us to use it freely in perpetuity. So as much as this is an executable, knowing it is also a wrapper for regular images is important to know as an option for preservation. The files wrapped inside can be exported and preserved as a solution. So what makes this executable unique. Let’s look a little closer.
It is indeed a wrapper, the header looks like any other EXE file, but a little further into the file we can see some specifics to the viewer. In all my samples I can see the string “NewsSoft Viewer“. That might be enough to distinguish it from other executables. See some samples here.
I guess part of the question is wether identifying specific software executables is needed in preservation. Arn’t they all executables and should be treated similar? This isn’t the first type of executables I have seen like this. awhile back I came across another home software which allowed you to make a slideshow, complete with audio and wrap it into an executable to put on a disk so playback was easy for the user and nothing additional was needed. The software is called Family Album Creator, use at your own risk.
Usually in the software world file formats are fairly efficient, the structure is meant to provide a way to store the data of the software being used. There isn’t much need to add additional unnecessary additions. This isn’t always true, but in the early days, disk space was expensive so compression and efficiency ruled. There also wasn’t much need to hide anything or complicate things. That is unless it is intended. This makes me think of two things, Polyglots and Steganography.
Steganography is the art of embedding data within an image. With digital images you can hide another image within the main image by using the most and least significant bits. Fun use of technology, but not something you normally would find in your regular desktop software.
Imagine my surprise when I was researching the Picture It! software and the MIX file format only to discover Microsoft decided to make their own polyglot of sorts for their PNG Plus format which replaced the MIX format, then both obsolete when Digital Image was discontinued in 2007. The PNG Plus format was the native format for the Microsoft Picture It! and Digital Image software often found with the Microsoft Works or Digital Imaging suite of software.
Save Menu from Digital Image Pro
According to the help within Digital Image:
The PNG Plus format uses the standard PNG extension but provides saving of layers and pages within the PNG format. Since the PNG format cannot do this natively, how did Microsoft accomplish this? Well, by throwing an OLE container into the middle of the file of course!
PNG Plus files are your regular PNG format and will identify as such. But they are just a low resolution thumbnail of the full image. Let’s take a look:
exiftool PictureIt7-s02.png
ExifTool Version Number : 12.70
File Name : PictureIt7-s02.png
File Size : 26 kB
File Modification Date/Time : 2023:12:26 22:01:58-07:00
File Access Date/Time : 2024:01:01 12:31:07-07:00
File Inode Change Date/Time : 2023:12:26 22:01:58-07:00
File Permissions : -rwx------
File Type : PNG
File Type Extension : png
MIME Type : image/png
Image Width : 500
Image Height : 333
Bit Depth : 8
Color Type : RGB with Alpha
Compression : Deflate/Inflate
Filter : Adaptive
Interlace : Noninterlaced
SRGB Rendering : Perceptual
Gamma : 2.2
White Point X : 0.3127
White Point Y : 0.329
Red X : 0.64
Red Y : 0.33
Green X : 0.3
Green Y : 0.6
Blue X : 0.15
Blue Y : 0.06
Warning : [minor] Text/EXIF chunk(s) found after PNG IDAT (may be ignored by some readers)
Title : PictureIt7-s02
Image Size : 500x333
Megapixels : 0.167
Looks like there is some additional data after the IDAT chunk.
What what do we have here? Near the end of the file before the IEND chunk is an OLE file with the very recognizable hex values of “D0CF11E0“. Let’s strip out the OLE file and take a look.
Path = PictureIt7-s02-ole
Type = Compound
WARNINGS:
There are data after the end of archive
Physical Size = 8704
Tail Size = 7764
Extension = compound
Cluster Size = 512
Sector Size = 64
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2023-12-26 22:01:58 D.... DataStore
2023-12-26 22:01:58 D.... Text
..... 2560 2560 Text/CONTENTS
..... 86 128 Text/[1]CompObj
..... 96 128 DataStore/3
..... 4 64 DataStore/1
..... 121 128 DataStore/0
..... 57 64 DataStore/2
..... 98 128 DataStore/5
..... 4 64 DataStore/4
..... 1254 1280 DataStore/7
..... 4 64 DataStore/6
..... 4 64 DataStore/8
------------------- ----- ------------ ------------ ------------------------
2023-12-26 22:01:58 4288 4672 11 files, 2 folders
Interesting, I don’t think I have come across a standard format with a container embedded within. I have come across many OLE and ZIP containers which contain other common formats within, but this format is definitely unique. Others have added features in the IDAT chunk, such as a web shell. I am sure there are others out there. The CompObj file found within the Text directory is very similar to the Microsoft Works and Publisher format. Although trying to open the file in Publisher doesn’t work!
PRONOM uses binary and container signatures to identify file formats. Even though this file format contains a valid OLE container, because it is within a regular binary file format, I don’t believe a container signature would work. The difficulty will be to clearly identify this new format without falsely identifying a regular PNG instead. The OLE file format header is not in a consistent location to use a specific offset. Making the string a variable location can causes some undo processing, so lets look to see if there is anything else we can use to make a positive ID.
The PNG file format is based on chunks, you have to have IHDR, then an IDAT and the IEND chunk. If we take a look at a regular PNG file using a libpng tool pngcheck, we see this:
pngcheck -cvt rgb-8.png
File: rgb-8.png (759 bytes)
chunk IHDR at offset 0x0000c, length 13
256 x 256 image, 24-bit RGB, non-interlaced
chunk tEXt at offset 0x00025, length 44, keyword: Copyright
? 2013,2015 John Cunningham Bowler
chunk iTXt at offset 0x0005d, length 116, keyword: Licensing
compressed, language tag = en
no translated keyword, 101 bytes of UTF-8 text
chunk IDAT at offset 0x000dd, length 518
zlib: deflated, 32K window, maximum compression
chunk IEND at offset 0x002ef, length 0
No errors detected in rgb-8.png (5 chunks, 99.6% compression).
The required chunk are there, but a couple extra, the tEXt and iTXt, which are textual metadata you can add. Now lets look at a PNG Plus file:
pngcheck -cvt PictureIt7-s02.png
File: PictureIt7-s02.png (26066 bytes)
chunk IHDR at offset 0x0000c, length 13
500 x 333 image, 32-bit RGB+alpha, non-interlaced
chunk sRGB at offset 0x00025, length 1
rendering intent = perceptual
chunk gAMA at offset 0x00032, length 4: 0.45455
chunk cHRM at offset 0x00042, length 32
White x = 0.3127 y = 0.329, Red x = 0.64 y = 0.33
Green x = 0.3 y = 0.6, Blue x = 0.15 y = 0.06
chunk IDAT at offset 0x0006e, length 9460
zlib: deflated, 32K window, fast compression
chunk cmOD at offset 0x0256e, length 0
Microsoft Picture It private, ancillary, unsafe-to-copy chunk
chunk cpIp at offset 0x0257a, length 16384
Microsoft Picture It private, ancillary, safe-to-copy chunk
chunk iTXt at offset 0x06586, length 24, keyword: Title
uncompressed, no language tag
no translated keyword, 15 bytes of UTF-8 text
chunk tEXt at offset 0x065aa, length 20, keyword: Title
PictureIt7-s02
chunk IEND at offset 0x065ca, length 0
No errors detected in PictureIt7-s02.png (10 chunks, 96.1% compression).
It looks like we have the required chunks and some textual chunks but also a couple chunks which pngcheck describes as private and identify’s them as Microsoft Picture It chunks. The cpIp chunk is the one which contains the OLE container. This is the chunk we need to identify in a signature. The problem is the offset for the cpIp chunk is not the same each time. Here is one from Digital Image 10 Pro.
chunk cpIp at offset 0x737a7, length 245760
Microsoft Picture It private, ancillary, safe-to-copy chunk
Significantly further in the file that the other example. These samples currently identify as PNG 1.2 files. PRONOM fmt/13 so we can use the signature and add to it, but it currently doesn’t look for IDAT only the iTXt chunk, which is probably not optimal. For PNG Plus, lets get the header which includes IHDR, IDAT, then the cpIp chunk then an end of file sequence for IEND. Take a look at my signature and samples, I am curious how many PNG Plus files are out there hidden to the world.
Turns out there is another PNG flavor which has been enhanced to allow for layers and pages. Adobe Fireworks uses a PNG format as their native format. They also use private chunks, but not within an OLE container. They use additional chunks, but before the IDAT chunk:
It’s hard to know which each of the chunks are for and if they are all required for the Fireworks PNG format. From the book on PNG.
In addition to supporting PNG as an output format, Fireworks actually uses PNG as its native file format for day-to-day intermediate saves. This is possible thanks to PNG’s extensible “chunk-based” design, which allows programs to incorporate application-specific data in a well-defined way. Macromedia has embraced this capability, defining at least four custom chunk types that hold various things pertinent to the editor. Unfortunately, one of them (pRVW) violates the PNG naming rules by claiming to be an officially registered, public chunk type, but this was an oversight and should be fixed in version 2.0.
Most everyone has heard of Microsoft Office, the suite of applications used by millions everyday. Less people know about Microsoft Works, which was a lower cost alternative, but was quite popular as a home office suite of applications. One tool which often came with the Works suite was a digital image tool called Picture It!
Picture It! was a photo editing tool first released by Microsoft in 1996 geared to making photo editing easy and affordable.
Picture It! used a wizard type interface which walked you through acquiring an image and adding to it. One of the key features of the software was the ability to “stack” objects like layers. Because of this feature a new file format was used to save this information to disk. Meet the Microsoft Image (Picture) Extension format, commonly known as the MIX file format. It is very similar to the FlashPix image format, which was supposed to be an image file format to solve many delivery issues, but didn’t seem to gain hold despite being created by Kodak, HP, and others. In fact many of the MIX files I found on Microsoft disks are actually FlashPix files.
The MIX extension was also used by another Microsoft program, PhotoDraw, which causes confusion as they were similar, but PhotoDraw has some added features which may not be compatible with Picture It!. Both formats are based on the Microsoft Compound Object (OLE) container, and have a similar structure. Let’s take a look at a MIX file from Picture It! version 1.
7z l PictureIt1-s02.mix
--
Path = PictureIt1-s02.mix
Type = Compound
Physical Size = 48128
Extension = compound
Cluster Size = 512
Sector Size = 64
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
..... 328 384 [5]Data Object 000001
..... 396 448 [5]Transform 000004
..... 872 896 [5]Operation 000001
..... 320 320 [1]CompObj
..... 292 320 [5]Global Info
..... 872 896 [5]Operation 000002
..... 144 192 [5]Operation 000003
..... 684 704 [5]Transform 000008
..... 1028 1088 [5]Transform 000009
..... 328 384 [5]Data Object 000009
..... 324 384 [5]Data Object 000005
2023-12-27 11:04:39 D.... Data Object Store 000001
..... 328 384 [5]Data Object 000010
..... 20932 20992 [5]SummaryInformation
..... 200 256 [5]Microsoft Embedding Info
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0001
..... 1400 1408 Data Object Store 000001/[5]Image Contents
..... 230 256 Data Object Store 000001/[1]CompObj
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0000
..... 28 64 Data Object Store 000001/Resolution 0000/Subimage 0000 Data
..... 80 128 Data Object Store 000001/Resolution 0000/Subimage 0000 Header
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0003
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0002
..... 28 64 Data Object Store 000001/Resolution 0002/Subimage 0000 Data
..... 208 256 Data Object Store 000001/Resolution 0002/Subimage 0000 Header
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0005
2023-12-27 11:04:39 D.... Data Object Store 000001/Resolution 0004
..... 28 64 Data Object Store 000001/Resolution 0004/Subimage 0000 Data
..... 1792 1792 Data Object Store 000001/Resolution 0004/Subimage 0000 Header
..... 124 128 Data Object Store 000001/[5]SummaryInformation
..... 28 64 Data Object Store 000001/Resolution 0005/Subimage 0000 Data
..... 6976 7168 Data Object Store 000001/Resolution 0005/Subimage 0000 Header
..... 28 64 Data Object Store 000001/Resolution 0003/Subimage 0000 Data
..... 544 576 Data Object Store 000001/Resolution 0003/Subimage 0000 Header
..... 28 64 Data Object Store 000001/Resolution 0001/Subimage 0000 Data
..... 128 128 Data Object Store 000001/Resolution 0001/Subimage 0000 Header
------------------- ----- ------------ ------------ ------------------------
2023-12-27 11:04:39 38698 39872 29 files, 7 folders
This is a simple MIX file with one line of text, but contains a lot of content inside the OLE container. If I try and use the PRONOM registry to identify the file, I get:
sf PictureIt1-s02.mix
---
siegfried : 1.11.0
scandate : 2023-12-27T11:06:32-07:00
signature : default.sig
created : 2023-12-17T15:54:41+01:00
identifiers :
- name : 'pronom'
details : 'DROID_SignatureFile_V116.xml; container-signature-20231127.xml'
---
filename : 'PictureIt1-s02.mix'
filesize : 48128
modified : 2023-12-27T11:04:40-07:00
errors :
matches :
- ns : 'pronom'
id : 'fmt/111'
format : 'OLE2 Compound Document Format'
version :
mime :
class : 'Text (Structured)'
basis : 'byte match at 0, 30'
warning :
Hmm, we know it is an OLE compound document, but it should identify as a Picture It! file as PRONOM has defined a PUID for the format. fmt/936 has been defined as “Microsoft Picture It! Image File 1”. So I am not sure why this file from version 1 is not identifying correctly. Let’s take a look. The PRONOM container signature for fmt/936 is looking for this:
The container signature is looking into the OLE container for the “CompObj” file (which seems to be required), then looks for the string “Microsoft Picture It! version 1 Picture” starting at the 32nd byte. That is pretty specific. The sample file I am using as an example has the following string of bytes.
Ok, so this sample has a similar string but is missing the “version 1” text. It seems the samples used to created the PRONOM signature was working off samples which included the version 1 in the header of CompObj. Maybe when Microsoft learned they would be making a version 2, they decided a version number should be included going forward. Let’s take a look a file from version 2 to compare:
Ok, so it looks like they did update the version string for version 2. This file also does not identify correctly. A quick look at the wikipedia page for Microsoft Picture It! tells us they continued to release the software until version 10. Is there a different string for each version?
Diving into this and gathering many samples has brought a lot of variants to surface. Let’s see if we can list all the CompObj header variants.
Version 1 samples:
Picture It! Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Microsoft Picture It! Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Microsoft Picture It! version 1 Picture'{56616800-C154-11CE-8553-00AA00A1F95B}
Picture It! Collage'{56616800-C154-11CE-8553-00AA00A1F95B}
Version 2 samples:
Microsoft Picture It! version 2 Picture'{2D722850-8C4B-11D0-A96F-00A0C905410D}
Version 3 samples:
Microsoft Picture It! version 3 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}
Version 4 samples:
Microsoft Picture It! version 4 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}
PhotoDraw version 1 samples:
Microsoft PhotoDraw version 1 Picture'{18B8D020-B4FD-11D0-A97E-00A0C905410D}
PhotoDraw version 2 samples:
Microsoft PhotoDraw version 2 Picture'{18B8D021-B4FD-11D0-A97E-00A0C905410D}
FlashPix samples:
FlashPix Object({56616000-C154-11CE-8553-00AA00A1F95B}
FlashPix Object({56616800-C154-11CE-8553-00AA00A1F95B}
Picture It! FlashPix'{56616700-C154-11CE-8553-00AA00A1F95B}
LPI FlashPix'{56616700-c154-11ce-8553-00aa00a1f95b}
FlashPix_Object'{56616700-C154-11CE-8553-00AA00A1F95B}
'{56616700-C154-11CE-8553-00AA00A1F95B}
Picture It!'{56616700-c154-11ce-8553-00aa00a1f95b}
Flashpix Toolkit Application'{56616700-c154-11ce-0000-000000000000}
Ok, there is a lot to discuss here. First of all, it seems MIX was only used in Picture It! until version 5 (2001), then the Picture It! software used a new format, PNG Plus to store the layered stacks. More on that in a future post! Although some later versions seems to be able to open the older MIX format. Version 4 of the MIX format seems to be the last as the 2001 software had only version 4 files on it. Probably safe to say only the 4 versions are needed for identification.
You may notice the additional unique identifier I included in each format. This is called a Class ID for the OLE format, which A LOT of formats use. Each “format” has a unique ID associated with it to help distinguish it from other formats. This Unique ID could possibly be a better solution for identification. It does cross over with the PhotoDraw format, but the FlashPix format seems to have a unique ID. With all the variations in the version 1 strings, the ID remains the same. For version 3 and 4 the ID is the same, which could mean they are interchangeable. It is also the same as PhotoDraw version 1. Not to complicate things.
So it seems in order to get proper identification of these similar formats we need to:
Clean up version 1 identification for fmt/936
Add a signature for 2, 3, and 4
Add a version 2 signature for the PhotoDraw format
Add some additional signature variations for the FlashPix format.
The Class ID’s could be used to distinguish different versions and formats, but many of the ID’s are identical, this could mean they are the same format. But for now we can just add the additional variation strings and it should identify everything for now. The FlashPix format needs more research as there is so many different variations and it’s so close to the MIX format. Take a look at my GitHub submission, maybe you have some additional variations to add?
One of the important parts about Digital Preservation is to gather significant properties of the digital files we hope to preserve. This can allow us to base our risk assessments off of more data than just an extension. For example, a TIFF file is a mighty good preservation format. Well documented and adopted by the preservation community, and with hundreds if not thousands of software tools to render and make use of the format. But if a TIFF file uses compression like LZW, or if it happens to have multiple pages, those are good things to know about. Most formats might have a stable set of properties, but sometimes can have properties which adds more risk to the format becoming difficult to render or migrate.
A DNG or Digital Negative developed by Adobe was supposed to solve the issues with proprietary RAW digital camera formats. Rendering a PhaseOne IIQ file often times requires the full CaptureOne software which can be expensive. Adobe spends quite a bit of resources in adding support to its Camera RAW toolkit and adding the ability to take majority of these RAW formats and move them into a DNG. There is also more and more camera manufacturers who image directly to a DNG as their native RAW format. This is the case for Apple’s ProRAW format which uses the DNG specification.
Another manufacturer is the Insta360 camera’s. Their 360 camera’s can use two lenses to capture 180 degrees from each and then stitch into a 360 photo or video. They can capture compressed images and videos, but also in RAW. Because of the two lenses and sensors, their DNG’s can get quite large. For this reason I recently asked PRONOM to adjust their signatures to allow for a bigger offset of DNG information in the larger RAW images.
exiftool IMG_20230913_141939_00_039.dng
ExifTool Version Number : 12.70
File Name : IMG_20230913_141939_00_039.dng
File Size : 143 MB
File Type : DNG
File Type Extension : dng
MIME Type : image/x-adobe-dng
Exif Byte Order : Little-endian (Intel, II)
Subfile Type : Full-resolution image
Image Width : 5984
Image Height : 11968
Bits Per Sample : 16
Compression : Uncompressed
Photometric Interpretation : Color Filter Array
Make : Arashi Vision
Camera Model Name : Insta360 X3
DNG files are actually based on the TIFF format, TIFF/EP to be precise, which means there is some good history behind the format and understanding of its structure. DNG does add many new tags and new features, so there is much more going on. Here is a TIFFInfo view of a DNG. Lots of new tags…..
tiffinfo IMG_20230913_141939_00_039.dng
TIFFReadDirectory: Warning, Unknown field with tag 33421 (0x828d) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 33422 (0x828e) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 50937 (0xc6f9) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 50938 (0xc6fa) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 50940 (0xc6fc) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 51009 (0xc741) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 51107 (0xc7a3) encountered.
=== TIFF directory 0 ===
TIFF Directory at offset 0x889946c (143234156)
Subfile Type: (0 = 0x0)
Image Width: 5984 Image Length: 11968
Bits/Sample: 16
Sample Format: unsigned integer
Compression Scheme: None
Photometric Interpretation: 32803 (0x8023)
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 11968
Planar Configuration: single image plane
Make: Arashi Vision
Model: Insta360 X3
Software: v1.0.69_build1
DateTime: 2023:09:13 14:19:40
Tag 33421: 2,2
Tag 33422: 1,2,0,1
EXIFIFDOffset: 0x8
GPSIFDOffset: 0x3e6
DNGVersion: 1,3,0,0
DNGBackwardVersion: 1,3,0,0
UniqueCameraModel: Insta360 X3
An IFD (Image File Directory) is the building block of a TIFF file. A TIFF file can have multiple IFD’s within a single file. But an IFD can also be a thumbnail, metadata or GPS info. For a DNG, they use the IFD structure as well, but often, the first IFD is a lower resolution of the full image.
It can get confusing, especially for tools we use to extract metadata and significant properties from a DNG for preservation. Within Rosetta, the preservation system I use at work, there is no dedicated DNG extractor, so we use JHOVE, as it is the tool we use for our TIFF images. This presents a problem as the process only extracts properties for the first IFD assuming it is the main IFD, but in many cases it reports back the image is much smaller in pixel dimensions than it actually is. More work is needed to improve extracting correct significant properties for DNG and other RAW image formats.
Adobe released a new version of DNG this year. In June, DNG version 1.7.0.0 was finalized. The new version brought a few new features, two of which are including JPEG XL compression and a new HDR colorimetric value. In order to add JPEG XL compression DNG version 1.7 is required. Here is how one looks in exiftool, created with Adobe DNG Converter 16.1.
exiftool _MG_9375_1.dng
ExifTool Version Number : 12.70
File Name : _MG_9375_1.dng
File Size : 5.4 MB
File Type : DNG
File Type Extension : dng
MIME Type : image/x-adobe-dng
Exif Byte Order : Little-endian (Intel, II)
Make : Canon
Camera Model Name : Canon EOS DIGITAL REBEL XT
Preview Image Start : 91884
Orientation : Rotate 270 CW
Rows Per Strip : 171
Preview Image Length : 10305
Software : Adobe DNG Converter 16.1 (Macintosh)
Modify Date : 2023:12:18 11:45:06
Artist : unknown
Image Width : 3516
Image Height : 2328
Bits Per Sample : 16
Compression : JPEG XL
DNG Version : 1.7.1.0
DNG Backward Version : 1.7.1.0
I had recently submitted a new signature for DNG 1.7 to PRONOM, but I found this new DNG version falls outside the signature I created. I had made the assumption all DNG’s report their version based on the last two values of 0.0, so I created the signature to look for 1.7.0.0. This is wrong now that I can see an example of version 1.7.1.0.
In order to fix the issue, I would need to change all the DNG signatures to remove the last two bytes so:
12C601000400000001070000 would change to 12C60100040000000107
This would allow for identification if some DNG files have a point version.
The pace at which manufacturers are producing camera’s with new features is much faster than the Digital Preservation community can keep up with. As new technologies get released, we play catch up trying to identify new formats and variations to existing ones. I guess that is job security?
When it comes to Digital Preservation, the easiest types of file formats to preserve are often single self contained formats with lots of documentation. There are plenty of formats which break this norm, but a file format like a simple TIFF file is well understood and can stand on its own. The hardest file formats to preserve, I have found, are the complex under documented formats which often show up when you don’t expect them. There is a file format type which indeed makes things difficult. The project format.
There are many software tools out there which generate a “Project”, this is often proprietary and can only be used by the software which created it. Project files are also interdependent, meaning they require other files in known locations in order to be used. This interdependence is often links to images, audio, video, fonts, and other multimedia. The file format itself is just a reference to all the project settings and the paths to the files included in the project. This makes things very difficult to preserve and maintain the complex structure required. Any renaming, removing, or moving the files out of their original order can render the project useless. Many project formats are human readable in XML, or other human readable text, but others are not. I have made a recent attempt to document more Project formats on the File Format Wiki, including many Label and Optical disc project formats, along with updates to Adobe InDesign, QuarkXPress and other desktop publishing project formats. There is still plenty of work needed in other Video and Audio project formats.
Apple computers over the years has created some very powerful software for content creators to use, especially in Video editing. iMovie was used by many home movie editors and iDVD to burn those movies to DVD to share with family and friends, but Apple also sold a professional Video Editing suite which included Final Cut Pro.
Final Cut Pro started life as a Macromedia software tool called KeyGrip which never was released and later bought by Apple. Final Cut Pro was well used and loved by video editors and was given a major upgrade in 2011 to Final Cut Pro X, which was full re-written to be 64-bit. This change included a change to the Project file format. So for version 1 through version 7, Final Cut Pro used a project format with the extension .FCP. Lets take a closer look at the this project format.
From the header we can see a remnant of the original KeyGrip software, but later in the file we find some references to files in the Mac HFS path format which includes a colon instead of a slash. These are the paths to the each of the MOV files used in the Project. This file is from the tutorial disk of Final Cut Pro version 1.2, so lets take a look at the last version released, version 7.
Almost identical to the first version, which is helpful for identification, but if we need to identify based on version, it might prove a little more difficult. It appears all the samples I have and have seen reference to all begin with the same 5 hex values, A24B657947, 0xA2 KeyG. It’s hard to know what other hex values might have something to do with versions of the file format. More samples could tell us, but from what I have the 20 bytes starting from offset 12 seems to be consistent among the different version samples. But for now the 5 bytes at the beginning of the file should suffice for identification.
When Final Cut Pro went through a complete re-write in 2011, the FCP format was abandoned. Not only made obsolete, but completely unsupported. The new Final Cut Pro X software was not able to support this now obsolete format. The new format followed the pattern of many other Apple formats of using a folder identified through an extension as a single file. Called a bundle format, Final Cut Pro X used the extension, .FCPBUNDLE. This bundle could include the media assets along with project settings/thumbnails and clips. Because of this “bundle” format, identification would have to be done at the individual file level inside the bundle. This would include formats with extensions such as .flexolibrary and .fcpevent, which appear to be SQLite databases. This complex format makes preservation of this type of object difficult with current methods and practices.
Luckily Apple didn’t leave Final Cut Pro users completely unable to migrate their content. Final Cut Pro could export the project as an XML file. This format is called Final Cut Pro XML Interchange Format and was well documented. The format was not made to bridge the gap from Final Cut Pro to Final Cut Pro X, but rather make the project file more useful outside of Final Cut Pro. Final Cut Pro X actually can’t open these files either, which is why a third party developer came in and developed 7toX (SendtoX) to allow for projects to be converted to a newer XML format.
Lets take a look at the basic Final Cut Pro XML Interchange Format which has a standard XML extension:
A different Doctype/root and structure but should be easy to identify.
The preservation of projects files, according to some, is not necessary since they are not the finalized product. Preserving the finalized output would be preferable as it can be managed easier and represent the final render of a project. But identification of the Final Cut Pro project and all the assets gives the option to access a collection more accurately. I was able to create a signature for the FCP, XML, and FCPXML formats. Take a look on my GitHub for the signatures and some test files.
I often find myself at a thrift store looking through the well used Compact Discs. Often see the same ones over and over, but occasionally finding a gem. While looking through a set of discs, a few caught my eye. When I pulled one out to look at the cover I noticed it was not your typical CD. Opening the cover I was greeted with a 3.5 floppy inside the jewel case. That was a fun surprise.
The 3.5 inch floppy disk appears to be made specifically for the Yamaha Disklavier piano’s. The disk had the appearance of your typical double density floppy. Unfortunately, when I inserted the disk I was greeted with the error, “no mountable file systems”. I was however able to use ddrescue and make a disk image. Here is what the disk header looks like:
The header actually has 512 bytes of the E5 hex values. Not a FAT12 file system for sure. A little farther into the disk image I can see what appears to be file listing.
This gave me a few clues. A few google searches I came across a project on Hackaday about “Hacking Yamaha Disklavier Floppies“. I was eager to test out the python software and see if could export the files on the disk image. To my disappointment, the python script could not read my disk image. So I reached out to the author. After sharing a couple disk images with him, he was able to enable support for this different type of disklavier disk.
python3 disklav.py -t Yamaha_RazzleDazzle.img
Loading file...OK
Format: PianoSoft DOM-30
Disk: PPC 1919
Title: RAZZLE DAZZLE
--------------------------------------------------------------------
Track 01 - Song Without Words
Track 02 - Shortenin' Bread Boogie
Track 03 - Toy Bugle
Track 04 - Sweet Tooth
Track 05 - The Mysterious Violin
Track 06 - Quiet Moment
Track 07 - Razzle Dazzle
Track 08 - Let's Have A Party!
When I used the extract option with the software I was rewarded with eight files with the .FIL extention.
sf Yamaha_RazzleDazzle-track01.fil
---
siegfried : 1.10.1
scandate : 2023-12-04T22:38:04-07:00
signature : default.sig
created : 2023-12-04T22:37:35-07:00
identifiers :
- name : 'pronom'
details : 'DROID_SignatureFile_V116.xml; container-signature-20231127.xml'
---
filename : 'Yamaha_RazzleDazzle-track01.fil'
filesize : 6409
modified : 2023-12-04T22:28:34-07:00
errors :
matches :
- ns : 'pronom'
id : 'UNKNOWN'
format :
version :
mime :
class :
basis :
warning : 'no match'
No surprise the file was not known to PRONOM. I doubt these files have made a big appearance in many archives.
The header of a .FIL file has an ascii string “COM-ESEQ”. A little investigation shed a little light on the format. Turns out there is a proprietary format Yamaha developed called “E-SEQ”. The E-SEQ format is compatible with all Disklaviers, Clavinova digital pianos, and a few other Yamaha products. I was curious if the format was something similar to a MIDI file, which was commonly used with early keyboard systems, but I was unable to find anything to suggest they are similar in any way. Yamaha does mention there are tools out there to convert an E-SEQ file to a SMF (Standard Midi File) which was used on other systems.
There is another tool called PPFBU which can be used to extract a disk image from a Disklavier floppy and the E-SEQ files. Along with a companion tool called MID2PianoCD claims to be able to convert a E-SEQ to a WAV or MP3, although I haven’t had much luck.
Another set of tools are available here, they allow for copying of a disk and converting back and forth from E-SEQ and Midi. A text file in DVUtils from the link has the following background about the disk format and the FIL files:
DISKLAVIER FILES AND DISCS
Yamaha Disklavier discs are always on Double Density (2DD) media, High Density (HD)discs, which are more common nowadays, will not work. Furthermore, they are formatted to 720 Kbytes not the default of 1.2 Mbytes. The original discs are copy protected. This has been achieved by placing invalid data on the first sector. As DOS and Windows always refer to this sector to check out a floppy, they will report that the discs are bad. The Yamaha machinery ignores the first sector so it reads them normally.
The music files on a Disklavier disc have the extension .FIL . They are frequently identified with titles like PIANO001.FIL but sometimes they have names similar to DOS like MUSIC1.FIL. In addition to the music files, there is an index file on the disc. This contains a list of the active music files on the disc, their titles, and pointers to their position on the disc. The index file is always called PIANODIR.FIL and always has a size of 6 Kbytes. In order to set up a Disklavier disc to function on a Disklavier, you must first copy the music files onto it in Disklavier format (ESEQ) and then run the ESEQ EXPLORER program to build the index file.
Although there are many Disklavier Piano’s still out there and quite expensive if you want to pick one up for yourself, the websites dedicated to the format as slowly disappearing. One archived website has plenty of sample files to download and write to a floppy for use if you happen to have a Disklavier.
You can check out the signature I put together for the E-SEQ on my Github. Might be good to explore the disk image format more and add that as well to PRONOM for identification as well.
There are certain file formats which seem to be fairly mainstream and come up frequently from a variety of sources. Then you find one from a specialized niche industry. I recently came across a file with the HUS extension and it led me down a path of a family of formats I didn’t know existed. The world of embroidery formats has been around for quite some time and is quite confusing. There are many manufacturers of embroidery machines and over the years have merged with each other or upgraded to include new features all requiring changes to the file formats used by the systems.
Embroidery stitching designs are a unique file formats. They are images, probably vector, but with stitching information included in the file which determines the color of thread used and pattern used.
I took a data set of many different embroidery files from here and ran it through Siegfried.
That is pretty bleak. Not a single file identified except a few using an OLE container and a couple files which may be plain bitmaps. I have started to document where each of these formats come from on the File Format Wiki, but there are so many formats to research. The triple XXX format research has its other challenges…..
Today I wanted to focus on one brand, Husqvarna, a Swedish company with a long history.
Export Formats in Premier+ Software
Husqvarna has made sewing machines or quite some time. In the late 1970’s machines started to enter the market which were “computerized”. It was in the early 1990’s when the embroidery machines could take a memory card full of stitching patterns, then later could take a floppy disk or USB drive full of custom made designs. Let’s take a look at a few of the formats, starting with the extension .HUS.
First a sample with a 1994 last modified date:
hexdump -C Bow.hus | head
00000000 5b af c8 00 1a 09 00 00 01 00 00 00 98 01 67 01 |[.............g.|
00000010 6f fe 9b fe 2c 00 00 00 47 00 00 00 0e 07 00 00 |o...,...G.......|
00000020 00 00 00 00 00 00 00 00 00 00 07 00 00 10 38 84 |..............8.|
00000030 81 af f8 d9 6e bc 5e c1 b4 1d fc b2 00 00 00 0c |....n.^.........|
00000040 93 03 49 98 00 e5 f0 07 41 77 75 c6 49 c6 ff e2 |..I.....Awu.I...|
00000050 80 aa aa 08 00 28 82 a8 b2 49 27 5e dd ae ba ee |.....(...I'^....|
00000060 db 78 05 bc 4b ef 00 37 f0 da db b5 d6 ee 93 9e |.x..K..7........|
00000070 55 15 01 00 44 00 05 51 01 3e 16 cf db ce 2c bc |U...D..Q.>....,.|
00000080 ad db 48 97 cb e5 f2 fe fc 63 79 e7 a3 a1 46 80 |..H......cy...F.|
00000090 5c 37 f0 10 41 43 a1 40 21 c7 a5 d2 6e 82 0a 20 |\7..AC.@!...n.. |
There is a great python library which can read in many of these formats and can give us some confirmation of the format headers. It is called pyembroidery and has readers for HUS and VP3.
One other format associated with the Husqvarna Viking Designer 1 model which used a floppy disk is the SHV stitch file. This is a special format which could only be used on the embroidery machine if it was properly formatted. This would put into a structure that would look like this:
The SHV file is the design file with the MHV and PHV are used for display on the embroidery machine. This is what we find when we look at the SHV content:
The MHV and PHV files have the same header so identification of just the SHV design file will be difficult.
I have only scratched the surface with these embroidery formats. The embroidery market was quite large with many different manufacturers. The user base seems to have been quite large, but its reach may be limited to personal collections. I have attempted a signature for the Husqvarna formats, you can find them in my GitHub repository. More to come hopefully!
I wonder sometimes what goes through a software/hardware developers mind when deciding a format to use for a new device. There are so many options our there for audio formats to choose from. I am sure there are pros and cons to using one technology over another but it seems a few decide to go ahead and make their own. I am sure there is some commercial advantage to developing a proprietary audio format, but with all the established choices it seems unnecessary.
Sony developed their own audio compression formats, which I explored in an earlier blog post. I came across a small goofy looking RCA voice recorder, model VR6320.
Many of these RCA VR series recorders can record in a WAV or a VOC file format. The WAV files are pretty run of the mill, but the VOC format is unique to RCA recorders.
The VOC format is not to be confused with another audio format with the same extension. The Creative Voice Format is a bit more well known. It was used with the Creative’s sound cards (Sound Blaster family) many folks had in their Windows computers in the 1990’s. But the RCA file format is different, and because of the same extension needs its own identification so they are not confused with each other.
sf REC00001.VOC
---
siegfried : 1.10.1
scandate : 2023-11-19T23:33:47-07:00
signature : default.sig
created : 2023-05-12T09:10:13Z
identifiers :
- name : 'pronom'
details : 'DROID_SignatureFile_V112.xml; container-signature-20230510.xml'
---
filename : 'REC00001.VOC'
filesize : 47231
modified : 2015-01-09T20:51:10-07:00
errors :
matches :
- ns : 'pronom'
id : 'UNKNOWN'
format :
version :
mime :
class :
basis :
warning : 'no match; possibilities based on extension are fmt/1736'
The RCA VOC file format seems to be undocumented, there isn’t much available. You can always download a copy of the RCA Digital Voice Manager software, which may or may not run on your current system, and convert the VOC files to WAV or you can use a piece of software coded in 2008 called “devoc“. The developer used to have an online website you could upload the VOC to and it would convert it automatically, but is not longer available. The code can also be found here.
Let’s take a look at the header of a couple of the files I have:
Most of samples I have show “VCP162_VOC_File” in the header, but I have one sample with “RP5120_VOC_File“. I have heard of others, one being “V432_Voice_File“. There could be more variations. One could assume the header is somehow associated with the model number of the device, but that doesn’t appear to be the case. Although there is a device with the model number “RP 5120“. It might be that the older RP series get one header and the newer VR Series get VCP? I will need more samples to confirm, if you have any send them my way. Also, according to the manuals, there is a SP and LP mode to manage the bitrate of the file to squeeze more minutes on the built in memory of these devices. This doesn’t appear to affect identification, but might be good to differentiate in the future.
For now you can take a look at the signature on my GitHub page.
All I had to go on was it was an Adobe format and the acronym “ACD”. One of the first results that came up in a google search was a post in the Adobe forums with someone asking what to do with some old ACD and ACI files they found on a disc, circa 2000, labeled “Adobe Capture”. The only thing I remember about Adobe Capture was some scanning tools related to Adobe Acrobat, but I didn’t remember coming across any ACD files related to Acrobat.
Initially it wasn’t easy to find more information on this format. Eventually I was able to narrow it down to stand-alone software adobe released called “Adobe Acrobat Capture”. Originally released in 1995 it was eventually discontinued in 2010. The software was marketed under the ePaper name and connected to Acrobat through the creation of a PDF from scanned images. The software was compatible with many scanner models and would process the scanned images, run Optical Character recognition, and export to a searchable PDF. These tools are built into Adobe Acrobat today.
One of the reasons the software was being so elusive is the fact it was sold with a high price tag and required the use of a hardware key, or dongle, in order to process scans. The hardware key also managed the type of license you purchased which may limit the number of pages you are allowed to scan within a certain period of time. So the software is very difficult to run today, if you do happen to find a copy out there in Internet land.
In order to document these file formats for preservation purposes I needed to find some samples. I was excited to find a demonstration CD on the Internet Archive, but unfortunately it contained no examples of the ACD file format.
A little sleuthing on the Wayback Machine helped me find a few user guides and brochures. I was also able to find there was three versions of Adobe Acrobat Capture. In a Product Brochure, you can see a screenshot of the software with a document open with the ACD extension.
If you are OCD like me you might have noticed the window in this screenshot is typical of the older Windows 3.1 or Windows NT system. So this was indeed an older product released by Adobe.
The Adobe Acrobat Capture 3.0 Demonstration CD-ROM from the Internet Archive luckily has a UserGuide PDF on the disc and was able to help me understand the ACD format a little more.
Looks like the ACD format is an intermediate format used by the software to manage the process between scanning and export to PDF. ACD was also defined as an “Acrobat Capture Document” which makes sense. They were also mentioned as being “multipage files in Acrobat Capture Document (ACD)”. The UserGuide also mentioned an ACP format which it referenced as “one-page files are in Acrobat Capture Page (ACP) format.” So more research is needed.
Lets start with Adobe Acrobat Capture 2.0 as I managed to get a few samples from an installer I found. Here is a hexdump of an ACD file and its corresponding ACI file.
The ACD file is unique, PRONOM and even TrID was unaware of the format. But to the keen observer, the ACI format is very recognizable. You may have seen this header before:
Lets take a closer look at an ACI file to see if they are a true TIFF image or if there is any customization to the format.
Looks like a true TIFF image with no special tags or unique properties. They are 1-bit TIFF’s compressed with CCITT RLE. Not sure there would be any need to create a special signature for these ACI files.
Looking closer at the ACD file format, we can see they reference ACI files, so probably safe to assume the ACD file doesn’t contain the full raster data for each image:
From the limited sample set I have access, all the ACD files begin with the same Hex values, “02044747C900”. Along with the common header we can assume there should be at least one ACI file referenced in the first part of the file. Because it is referenced as a filepath, the ACI string would be variable in its offset.
Adobe Acrobat Capture 3.0 turns out to be a different format. But looks familiar………
The ACD has some of the same hex values as the previous version, but with some extra bytes at the beginning and it looks like the ACP is a straight up PDF. But may have some interesting tags, like “CAPT_info”.
The problem we will face when trying to write a signature for this version of ACD is the container signature needs a static file name to reference, and it appears the name of the container is also the name of the ACD file within the container. So every file will be different. I wish there was a way in the PRONOM signature syntax to reference an extension and ignore the filename, but currently there no method to do this. The only thing inside the container which seems to be consistent is the file “FILES.LST”. So lets take a peek inside if it.
hexdump -C FILES.LST | head
00000000 5b 41 43 44 31 5d 0d 0a 49 53 43 4f 4d 50 4f 53 |[ACD1]..ISCOMPOS|
00000010 49 54 45 3d 54 52 55 45 0d 0a 4e 55 4d 46 49 4c |ITE=TRUE..NUMFIL|
00000020 45 53 3d 31 0d 0a 46 49 4c 45 4e 41 4d 45 31 3d |ES=1..FILENAME1=|
00000030 43 6f 6e 74 72 61 63 74 2e 61 63 70 0d 0a |Contract.acp..|
Ok, there seems to be some static information that is unique to the ACD format. I bet the string “[ACD1]” would be sufficient enough to make a solid signature.
This is a good format example of a limited amount of information on the file format used by a well known company which has become obsolete and disappeared. Take a look at my signatures, maybe you have some old ACD files you were unaware of!
I have to admit, often when I am researching file formats I can get distracted by a shinier format I come across. I often go down rabbit holes and forget the reason I started down the path I am on. I try and focus on the current needs in my life as a Digital Preservation Manager, but can get easily sidetracked. I always look forward to November every year so I can celebrate World Digital Preservation Day which sometimes comes along with a PRONOM research week. This gives me a chance to look at formats that may need attention which are not normally on my radar.
This week I a taking a look again at Multiplan. There is a PRONOM PUID for version 4, but does not have a description nor does it have a binary signature. It is was also lacking a File Format Wiki entry. So I decided to dive in. I had already bumped into the format while doing some research on early Microsoft Excel formats. This includes the SYLK format which needed a little update.
Microsoft Multiplan was the parent of Microsoft Excel. Multiplan was built for many different types of computers in the 1980’s, but was never ported to Windows. So to use Multiplan you have to be comfortable with using DOS. If you want to take Multiplan for a spin, head over to PCjs Machines and load up one of the many emulated systems they have.
In the end, Multiplan had four versions, but the last one, version 4.2, had some big changes, especially to the file format. More on that in a minute.
The DOS files for Version 2 begin with 0CEC0000 08AB0800, but a file for the Xenix system starts with 0AEC0000 08AB0800. So it appears the first byte may be different depending on the system.
The DOS files for Version 3 begin with a similar hex pattern, 0CED0000 08AB0800. This would make sense as the documentation for Multiplan 4.2 states it supports opening of Version 2 & 3, but not Version 1.
There was also a companion product that went along with Multiplan, it was called Microsoft Chart. Here is a file from version 3:
The Chart file format has a similar byte pattern with the 08AB pattern and looks similar to the BIFF format. We will have to make sure it doesn’t conflict with any signatures so it can be identified separately.
Version 4 of Multiplan was the first to use the BIFF (Binary Interchange File Format). Technically Version BIFF2, not much is know about BIFF1 or if it ever existed. BIFF2 is the exact same format as Excel 2.0 used, so there will be some problems if we want to identify them separately. They currently identify as fmt/55.
You can see in the hex values above a difference of two bytes in the header. The reason the Multiplan file identifies as an Excel 2 file is the PRONOM signature ignores those two bytes and allows them to be anything. Some specifications say these aren’t used, but clearly there is a use for them. We could probably use the same signature for Multiplan, but include the two bytes, then set the priority to the Multiplan signature.
The hex values for the first 4 bytes have a similar pattern. 0CEF, Which seems to be in sequence where Version 3 left off. Microsoft calls this new format, New or Normal Binary File Format. They claim it is “the fastest loading and fastest saving file format ever“! Exciting as the new format probably was, it didn’t last long. Multiplan was phased out so Excel could shine.
When I was younger I didn’t use DOS very often because the computer my father brought home in the mid 1980’s was a Macintosh. I use DOS more now in my research then I did when I was younger. Using the DOS interface is not easy. There are a lot of key commands you need to know intuitively just to navigate, but it is fascinating to see how far software has come. Early Excel, Multiplan, and Chart were all intertwined, but hopefully combing through all of these samples can bring some clarity. Take a look at the draft signature I made and all the samples that go with it on my GitHub page.