CD Architect

Receiving electronic media from an outside source can be an adventure. Often you find yourself sorting the valuable files from the chaff. There can be hidden files, cache files, application files, drivers, and everything in between. Determining which formats are important can be difficult, especially if you don’t recognize the format of some of the files.

I was recently working on a collection of files which had been produced with some audio software. When working with audio, the WAVE files are usually what is kept, as they contain the actual audio data. These files came with a couple of other formats. One of them was a bunch of SFK peak files. These are meant to be temporary: they are generated from the WAVE file to make opening the audio data faster. They are useful, but can easily be regenerated. One could argue they have historical value, but they don’t contain anything that can be used by itself, so alone they don’t have much value.

The other format found with the WAVE files has a CDP extension. These came up as unknown when using DROID. It is not a common extension, so finding the name of the software which created the files wasn’t too hard. Let’s take a look at one of them.

hexdump -C tutor1.cdp | head
00000000 52 49 46 46 79 03 00 00 53 46 50 4a 66 6d 74 20 |RIFFy...SFPJfmt |
00000010 18 00 00 00 00 00 01 00 02 00 00 00 10 00 00 00 |................|
00000020 44 ac 00 00 03 00 00 00 01 00 00 00 4c 49 53 54 |D...........LIST|
00000030 88 00 00 00 66 6c 73 74 66 69 6c 65 23 00 00 00 |....flstfile#...|
00000040 44 3a 5c 53 6f 75 6e 64 73 5c 4e 65 77 20 54 75 |D:\Sounds\New Tu|
00000050 74 6f 72 20 66 69 6c 65 73 5c 53 6f 6e 67 33 2e |tor files\Song3.|
00000060 77 61 76 00 66 69 6c 65 23 00 00 00 44 3a 5c 53 |wav.file#...D:\S|
00000070 6f 75 6e 64 73 5c 4e 65 77 20 54 75 74 6f 72 20 |ounds\New Tutor |
00000080 66 69 6c 65 73 5c 53 6f 6e 67 32 2e 77 61 76 00 |files\Song2.wav.|
00000090 66 69 6c 65 23 00 00 00 44 3a 5c 53 6f 75 6e 64 |file#...D:\Sound|

Huh, this is a RIFF file. RIFF is most commonly used as the container for WAVE and AVI files. You can read more about the RIFF format in a previous post. The RIFF container format can be used for all sorts of things. Looking at the internals we can see a few unique LIST chunks.
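To make the chunk layout easier to see, here is a minimal Python sketch that walks a RIFF file and lists its chunks, descending into LIST chunks. It assumes the standard RIFF rules (4-byte ID, 4-byte little-endian size, payload padded to an even length). The function names `iter_riff_chunks` and `referenced_files` are made up for illustration, and treating each `file` chunk as a plain path string is an assumption based only on the hexdump above.

```python
import struct
from typing import Iterator, Optional, Tuple

Chunk = Tuple[int, str, bytes]  # (nesting depth, chunk or list id, payload)


def iter_riff_chunks(data: bytes, start: int = 12,
                     end: Optional[int] = None, depth: int = 0) -> Iterator[Chunk]:
    """Walk RIFF chunks from `start`, descending into LIST chunks."""
    end = len(data) if end is None else min(end, len(data))
    pos = start
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4].decode("ascii", "replace")
        (size,) = struct.unpack_from("<I", data, pos + 4)
        body = pos + 8
        if chunk_id == "LIST" and size >= 4:
            list_type = data[body:body + 4].decode("ascii", "replace")
            yield depth, "LIST:" + list_type, b""
            yield from iter_riff_chunks(data, body + 4, body + size, depth + 1)
        else:
            yield depth, chunk_id, data[body:body + size]
        pos = body + size + (size & 1)  # payloads are padded to an even length


def referenced_files(data: bytes) -> list:
    """Collect the payloads of 'file' chunks as text (assumed to be paths)."""
    if data[:4] != b"RIFF":
        raise ValueError("not an uppercase RIFF file")
    return [payload.rstrip(b"\x00").decode("latin-1")
            for _, chunk_id, payload in iter_riff_chunks(data)
            if chunk_id == "file"]
```

Note how the padding rule matches the dump: each `file` chunk declares 0x23 (35) bytes, yet the next chunk starts 36 bytes later. Calling `referenced_files(open("tutor1.cdp", "rb").read())` should return the paths visible in the hexdump, assuming the rest of the file follows the same pattern.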

Lots of references to other files, specifically WAVE files, but not a lot of actual data. That is because this format turns out to be just a project format for some software called “CD Architect”. Sonic Foundry was an audio software developer for a few years before they sold their catalog to Sony in 2003. The manual for CD Architect version 5.2 explains the CDP project format:

CD Architect software handles the organization of your CD using a small project file (CDP) that saves information about source file locations, edits, cuts, and insertion points. This project file is not a multimedia file, but is instead used to create the CD when editing is finished.

Looking at another CDP file from the collection, I noticed something different.

hexdump -C CDArch50a-s01.cdp | head
00000000 72 69 66 66 2e 91 cf 11 a5 d6 28 db 04 c1 00 00 |riff......(.....|
00000010 20 0a 00 00 00 00 00 00 84 38 15 b3 da 08 85 44 | ........8.....D|
00000020 b2 2a 5b 70 a1 32 15 ff 5a 2d 8f b2 0f 23 d2 11 |.*[p.2..Z-...#..|
00000030 86 af 00 c0 4f 8e db 8a 00 02 00 00 00 00 00 00 |....O...........|
00000040 78 00 00 00 00 00 04 00 11 00 00 00 44 ac 00 00 |x...........D...|
00000050 00 00 00 00 00 c0 52 40 00 00 00 00 00 00 5e 40 |......R@......^@|
00000060 00 00 00 00 00 00 00 00 04 00 04 00 40 00 00 00 |............@...|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 7c 00 00 00 |............|...|
00000080 50 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 |P...............|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|

That’s odd. The RIFF identifier is always uppercase ASCII, but this one is lowercase. Also the RIFF form type, which was “SFPJ” in the other sample, is missing. This is not a valid RIFF file.

But further down in the file I can see the same list chunks. Did they take the RIFF format and make a proprietary version of their own? I think they may have. It seems the first example was from CD Architect version 4 and these other files are from CD Architect version 5. That complicates things. Sony stopped developing CD Architect after version 5.2d and maintained it for a few years before selling many of their titles to MAGIX Software. As far as I know there were never any new versions released. The software was very popular, as it had some really nice audio mastering features and was easy to use. Many were upset when the software was abandoned.

Creating a signature for both version 4 and version 5 CDP files will be pretty straightforward. I feel knowing what you have in a collection you are processing is the first step in making informed decisions. Whether or not you keep the project files is up for debate. Some may only want the final audio created from a CD Architect project, while others may want to see the way the audio was put together and mixed. Either way, the more you know…..
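As a rough illustration of what such a check could look like, the sketch below classifies a file from its first 40 bytes. The byte patterns are copied from the two hexdumps above and are observations from single samples, not a published specification; the function name and the returned labels are made up for this example. One caveat: as far as I can tell, the first 16 bytes of the version 5 sample are the same GUID that Sonic Foundry’s Wave64 (W64) audio format uses for its top-level `riff` chunk, so those bytes alone would probably also match ordinary W64 audio. For that reason the sketch also compares the 16 bytes at offset 24.

```python
# Byte patterns copied from the hexdumps above (single samples only).
RIFF_GUID = bytes.fromhex("726966662e91cf11a5d628db04c10000")     # offset 0
V5_FORM_GUID = bytes.fromhex("843815b3da088544b22a5b70a13215ff")  # offset 24


def classify_cdp(header: bytes) -> str:
    """Classify the first 40 bytes of a file as a v4-style or v5-style CDP."""
    if header[:4] == b"RIFF" and header[8:12] == b"SFPJ":
        return "v4-style"
    if header[:16] == RIFF_GUID:
        if header[24:40] == V5_FORM_GUID:
            return "v5-style"
        # Same container identifier, but a different form than the sample.
        return "riff-guid container, unknown form"
    return "unknown"
```

A call such as `classify_cdp(open(path, "rb").read(40))` would be enough to sort a folder of CDP files into the two groups before deciding what to keep.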

One more thing. CD Architect would default to saving a CDP project file, but could also save a “CD Image file”. This process actually would save the project to a full WAVE file with some extras baked in.

An image file is essentially a wave file with volume, crossfades, effects, mixes, and track information embedded. Burning an image file will reduce the risk of buffer underruns (especially if you have a complex project or are using a slow computer) since no audio processing is required. 

Interesting. Normally when working with track information in a single WAVE file you would need a companion CUE sheet in order to reference the track layout of the Audio CD. So I am curious how they do all of this. Let’s take a look at a “CD Image”.

mediainfo CDArch52d-s02.wav
General
Complete name : CDArch52d-s02.wav
Format : Wave
Format settings : PcmWaveformat
File size : 5.05 MiB
Duration : 30 s 0 ms
Overall bit rate mode : Constant
Overall bit rate : 1 411 kb/s
Conformance errors : 2
RIFF : Yes
General compliance : File size 5292434 is less than expected size 5292823 (offset 0x8)
WAVE : Yes
General compliance : Element size 5292811 is more than maximal permitted size 5292422 (offset 0xC)

Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 30 s 0 ms
Bit rate mode : Constant
Bit rate : 1 411.2 kb/s
Channel(s) : 2 channels
Sampling rate : 44.1 kHz
Bit depth : 16 bits
Stream size : 5.05 MiB (100%)

Already seeing some issues with the format, but all the important bits are there. JHOVE doesn’t like them much either.

JhoveView (Rel. 1.32.0, 2024-09-12)
Date: 2024-12-11 16:01:08 MST
RepresentationInformation: CDArch52d-s02.wav
ReportingModule: WAVE-hul, Rel. 1.8.3 (2024-03-05)
LastModified: 2024-12-11 15:58:02 MST
Size: 5292434
Format: WAVE
Status: Not well-formed
SignatureMatches:
WAVE-hul
InfoMessage: Ignored unrecognized list type: "pqls"
ID: WAVE-HUL-15
Offset: 5292044
ErrorMessage: Unexpected end of file: Bytes missing = 389
ID: WAVE-HUL-3
Offset: 5292434
MIMEtype: audio/vnd.wave; codec=1
Profile: PCMWAVEFORMAT

JHOVE is giving me two issues. The major error is that the file appears truncated, according to both MediaInfo and JHOVE. The InfoMessage is less of an issue and more of a heads-up that the WAVE file has an extra LIST type, “pqls”, which was also in the CDP RIFF file we looked at earlier. So it seems that making a “CD Image” of a project embeds the project chunk data into the WAVE container. Identification is not an issue, as these WAVE files follow the standard pattern and therefore identify correctly, but one might want to be aware, through further characterization, that these WAVE files carry some not-so-obvious extra data.
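Both tools are reporting the same mismatch from different angles. A RIFF header stores the payload size as a little-endian 32-bit value at offset 4, so the expected file size is that value plus the 8 bytes of the header itself. A minimal sketch of the check, with `riff_size_gap` as a hypothetical helper name:

```python
import struct


def riff_size_gap(header: bytes, actual_size: int) -> int:
    """Return declared total size minus actual size (positive = bytes missing)."""
    if len(header) < 8 or header[:4] != b"RIFF":
        raise ValueError("not a RIFF header")
    (declared_payload,) = struct.unpack_from("<I", header, 4)
    expected_total = declared_payload + 8  # 'RIFF' id plus the size field
    return expected_total - actual_size
```

Plugging in the numbers from the MediaInfo report, an expected size of 5292823 bytes against an actual size of 5292434 bytes gives 389, which is the “Bytes missing” figure in the JHOVE output.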

My attempts to find any samples from version 3 of CD Architect have failed. Until some turn up, my proposal is to add versions 4 and 5 to PRONOM with the signature on my GitHub page. There you will find a few samples as well.

Interactive Quicktime

One of my favorite legacy formats to explore is any type of multimedia CD-ROM. The 1990s and early 2000s were filled with all sorts of multimedia for CD, web, and television. It is also one of the most difficult formats to try and preserve for the future. Many CD-ROMs are filled with executables and/or Macromedia Director media, with later ones having Flash content. The operating systems and security needs of today make playback almost impossible. For this reason many have built emulation services to mimic the original operating system and software, allowing the many historic multimedia CD-ROMs to once again interact with the user in a way many current systems still struggle with.

Many CD-ROMs would come as hybrid discs, allowing them to be used on both Windows and Macintosh systems, sometimes providing two different experiences. Then there were CD-Extra or Enhanced CDs, which added a separate session to an Audio CD containing bonus content playable only on a computer.

For fun I took a look back at some of my older Audio CD titles. I came across a couple, one claiming to be a “CD-Extra” and another an “Enhanced CD”. The CD-Extra disc, when queried with cd-info, claimed to have 12 tracks, with the 12th being a data XA track.

Disc mode is listed as: CD-ROM Mixed
CD-ROM Track List (1 - 12)
#: MSF LSN Type Green? Copy? Channels Premphasis?
1: 00:02:00 000000 audio false no 2 no
2: 02:13:66 009891 audio false no 2 no
3: 05:21:28 023953 audio false no 2 no
4: 08:18:19 037219 audio false no 2 no
5: 12:28:37 055987 audio false no 2 no
6: 16:11:58 072733 audio false no 2 no
7: 19:21:56 086981 audio false no 2 no
8: 23:17:49 104674 audio false no 2 no
9: 26:01:17 116942 audio false no 2 no
10: 28:30:02 128102 audio false no 2 no
11: 31:07:70 139945 audio false no 2 no
12: 37:29:46 168571 XA true no
170: 51:35:07 231982 leadout (520 MB raw, 516 MB formatted)
CD Analysis Report
CD-Plus/Extra
session #2 starts at track 12, LSN: 168571
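The MSF and LSN columns in the cd-info listing are two views of the same position. MSF is minutes:seconds:frames at 75 frames per second, and the LSN is the same count shifted by the 150-frame (2-second) offset at the start of the disc, which is why 00:02:00 maps to LSN 0. A small sketch of the conversion, with `msf_to_lsn` as an illustrative name:

```python
FRAMES_PER_SECOND = 75
PREGAP_FRAMES = 150  # MSF 00:02:00 corresponds to LSN 0


def msf_to_lsn(msf: str) -> int:
    """Convert an 'MM:SS:FF' address into a logical sector number."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames - PREGAP_FRAMES
```

For the data track, `msf_to_lsn("37:29:46")` works out to (37 × 60 + 29) × 75 + 46 − 150 = 168571, the LSN where cd-info says session 2 starts.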

Mounting the 12th track showed a mix of Macromedia Director (.DIR) files and quite a few Quicktime MOV movies. Playback was not possible on my current computer so I had to resort to using an emulator to experience this bonus content, full of band member photos and biographies.

The other disc I pulled out to explore was a bit different. Using cd-info the disc looked very similar:

Disc mode is listed as: CD-ROM Mixed
CD-ROM Track List (1 - 13)
#: MSF LSN Type Green? Copy? Channels Premphasis?
1: 00:02:00 000000 audio false no 2 no
2: 04:20:08 019358 audio false no 2 no
3: 08:04:27 036177 audio false no 2 no
4: 11:15:62 050537 audio false no 2 no
5: 14:54:32 066932 audio false no 2 no
6: 19:57:73 089698 audio false no 2 no
7: 26:12:36 117786 audio false no 2 no
8: 29:51:59 134234 audio false no 2 no
9: 34:44:00 156150 audio false no 2 no
10: 39:36:62 178112 audio false no 2 no
11: 42:06:01 189301 audio false no 2 no
12: 45:42:26 205526 audio false no 2 no
13: 57:10:54 257154 XA true no
170: 72:56:67 328117 leadout (735 MB raw, 730 MB formatted)
CD Analysis Report
CD-Plus/Extra
session #2 starts at track 13, LSN: 257154

The discs, even though they were labeled CD-Extra and Enhanced CD, had the same structure and format. The difference was in the type of multimedia used. On the second disc there was a simple application which launched Quicktime and loaded a single MOV movie. But this was not your regular Quicktime movie; it was a highly complex Interactive Quicktime movie.

The Quicktime movie could only be launched from an older operating system using Quicktime 6, and on the Macintosh, only a PPC CPU. The movie would launch with an interactive menu, allowing navigation as you might find on a DVD or Flash website, but all within a single MOV file. When I ran MediaInfo on the MOV file I got back quite a few tracks:

<media ref="/Volumes/VOLCANOECD/ALECD.mov">
<track type="General">
<VideoCount>10</VideoCount>
<AudioCount>1</AudioCount>
<OtherCount>51</OtherCount>
<FileExtension>mov</FileExtension>
<Format>QuickTime</Format>
<Format_Settings>Compressed header</Format_Settings>

Ten video tracks and 51 other tracks. Exploring with Quicktime, I could see the entire list of embedded content:

Quicktime movies, an audio track, and dozens of Flash, photo, animation, and sprite tracks, with the possibility of more. These types of Quicktime files had specific requirements in order to run, with Quicktime 6 being the last version which could play back all the content correctly. Current versions of Quicktime give a warning about the lack of compatibility.

This Interactive Quicktime movie proudly claims “Made with LiveStage Pro”, which was an authoring environment for Quicktime made by Totally Hip Software Inc. Totally Hip started in 1995, but seemed to disappear after 2004 with no new development, and by 2014 the website went offline.

If you would like to see a couple of simple examples created by Apple, see here.

LiveStage Pro was a very powerful authoring tool in its time. Another similar tool called Electrifier competed for the interactive Quicktime market. Adobe GoLive also competed, but offered fewer features. The final Quicktime movie exported from LiveStage Pro was the main component, but the software did save a project format with the extension “LSD”. Versions 2 through 4 of LiveStage Pro had a similar header.

hexdump -C LiveStagePro4-s01.lsd | head
00000000 4c 53 41 46 00 00 00 04 00 00 09 16 00 00 00 00 |LSAF............|
00000010 00 00 00 00 00 00 00 00 00 00 09 0a 73 65 61 6e |............sean|
00000020 00 00 00 01 00 00 00 03 00 00 00 00 00 00 00 18 |................|
00000030 56 53 4e 6e 00 00 00 01 00 00 00 00 00 00 00 00 |VSNn............|
00000040 00 00 00 04 00 00 08 84 4d 50 52 4e 00 00 00 01 |........MPRN....|
00000050 00 00 00 49 00 00 00 00 00 00 00 21 6d 4f 55 54 |...I.......!mOUT|
00000060 00 00 00 01 00 00 00 00 00 00 00 00 55 6e 74 69 |............Unti|
00000070 74 6c 65 64 2e 6d 6f 76 00 00 00 00 18 57 6c 65 |tled.mov.....Wle|
00000080 66 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |f...............|
00000090 00 00 00 00 18 57 74 6f 70 00 00 00 01 00 00 00 |.....Wtop.......|

All the samples from version 2 through 4 have “LSAF” as the first four bytes. It also seems the next four bytes may be version related. Version 1, however, has a different header.
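The two headers can be told apart with a very small check. The sketch below returns the magic it found and, for “LSAF” files, the following four bytes read as a big-endian integer. Reading those bytes as a version number is only the guess made above, not a documented fact, and `classify_lsd` is a name invented for this example.

```python
import struct
from typing import Optional, Tuple


def classify_lsd(header: bytes) -> Tuple[Optional[str], Optional[int]]:
    """Return (magic, possible_version) for a LiveStage project header."""
    magic = header[:4]
    if magic == b"LSAF" and len(header) >= 8:
        # Assumption: bytes 4-7 are a big-endian value that tracks the version.
        (maybe_version,) = struct.unpack_from(">I", header, 4)
        return "LSAF", maybe_version
    if magic == b"LSPr":
        return "LSPr", None  # header seen in the version 1 sample
    return None, None
```

Run against the version 4 sample above, the second value comes back as 4, which fits the hunch but would need more samples from versions 2 and 3 to confirm.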

hexdump -C contest.lsd | head
00000000 4c 53 50 72 00 00 00 08 00 00 00 00 00 00 02 80 |LSPr............|
00000010 01 e0 00 00 00 00 02 58 00 00 00 01 00 00 00 01 |.......X........|
00000020 00 00 00 02 00 00 00 00 00 08 00 00 00 00 00 00 |................|
00000030 00 00 08 53 02 d9 ff c9 04 76 02 97 01 00 44 00 |...S.....v....D.|
00000040 0b 02 fb 03 c9 00 00 00 01 00 00 00 01 00 00 00 |................|
00000050 00 07 41 63 74 69 6f 6e 73 00 00 00 00 00 00 00 |..Actions.......|
00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000070 00 00 00 00 00 00 00 00 05 00 00 00 01 50 49 43 |.............PIC|
00000080 54 ff ff 00 00 c1 ff 03 72 65 64 65 6e 6e 41 79 |T.......redennAy|
00000090 98 05 41 77 78 00 00 01 7a 00 10 00 00 31 fc 30 |..Awx...z....1.0|

Identification of a LiveStage project should be simple enough, but identifying and rendering back a Quicktime movie made by this software takes some work. In fact there are many “Enhanced CDs” and CD-Extra titles out there with quite a few system requirements. If we are not careful, many of these little gems might get more difficult to experience or be lost completely.

If you would like to explore the Quicktime movie from the Enhanced CD mentioned here, send me a message. You can also take a look at my signature proposal and sample files for LiveStage on my GitHub.

Final Cut Pro

When it comes to digital preservation, the easiest file formats to preserve are often single, self-contained formats with lots of documentation. There are plenty of formats which break this norm, but a file format like a simple TIFF file is well understood and can stand on its own. The hardest file formats to preserve, I have found, are the complex, under-documented formats which often show up when you don’t expect them. One type of file format makes things especially difficult: the project format.

There are many software tools out there which generate a “Project”. This is often proprietary and can only be used by the software which created it. Project files are also interdependent, meaning they require other files in known locations in order to be used. This interdependence usually takes the form of links to images, audio, video, fonts, and other multimedia. The file format itself is just a reference to all the project settings and the paths to the files included in the project. This makes it very difficult to preserve and maintain the complex structure required. Any renaming, removing, or moving of the files out of their original order can render the project useless. Many project formats are human readable in XML, or other human readable text, but others are not. I have made a recent attempt to document more project formats on the File Format Wiki, including many label and optical disc project formats, along with updates to Adobe InDesign, QuarkXPress and other desktop publishing project formats. There is still plenty of work needed on other video and audio project formats.

Over the years Apple has created some very powerful software for content creators, especially in video editing. iMovie was used by many home movie editors, along with iDVD to burn those movies to DVD to share with family and friends, but Apple also sold a professional video editing suite which included Final Cut Pro.

Final Cut Pro started life as a Macromedia software tool called KeyGrip, which was never released and was later bought by Apple. Final Cut Pro was well used and loved by video editors and was given a major upgrade in 2011 to Final Cut Pro X, which was fully rewritten to be 64-bit. This rewrite included a change to the project file format. So for version 1 through version 7, Final Cut Pro used a project format with the extension .FCP. Let’s take a closer look at this project format.

hexdump -C Swing.fcp | head
00000000  a2 4b 65 79 47 0a 0d 0a  00 00 00 00 20 fc c5 5b  |.KeyG....... ..[|
00000010  00 de b3 11 d0 93 19 00  05 02 18 66 07 00 00 00  |...........f....|
00000020  03 00 00 00 00 00 00 00  00 01 00 00 00 00 01 00  |................|
00000030  00 00 11 07 73 75 62 74  79 70 65 00 00 00 01 01  |....subtype.....|
00000040  00 00 00 03 00 06 4e 4f  55 4e 44 4f 00 00 00 00  |......NOUNDO....|
00000050  01 01 00 00 00 00 00 00  00 00 00 00 00 07 52 55  |..............RU|
00000060  4e 54 49 4d 45 00 00 00  00 01 01 00 00 00 00 00  |NTIME...........|
00000070  00 00 00 01 07 76 69 65  77 65 72 73 00 00 00 00  |.....viewers....|
00000080  01 01 00 00 00 00 00 00  00 00 00 00 00 00 00 08  |................|
00000090  63 68 69 6c 64 72 65 6e  00 00 00 00 01 01 00 00  |children........|
*
00000e30  00 00 00 00 00 00 00 00  00 00 00 00 00 00 07 8c  |................|
00000e40  b3 2e 56 40 4d 6f 6f 56  54 56 4f 44 00 02 00 02  |..V@MooVTVOD....|
00000e50  00 00 00 11 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000e60  00 00 00 0b 44 61 6e 63  65 20 53 68 6f 74 73 00  |....Dance Shots.|
00000e70  00 01 00 08 00 00 07 8a  00 00 07 84 00 02 00 2f  |.............../|
00000e80  41 54 54 4f 20 52 41 49  44 30 20 47 72 6f 75 70  |ATTO RAID0 Group|
00000e90  3a 54 55 54 4f 52 49 41  4c 3a 44 61 6e 63 65 20  |:TUTORIAL:Dance |
00000ea0  53 68 6f 74 73 3a 49 6e  74 72 6f 2e 6d 6f 76 00  |Shots:Intro.mov.|
00000eb0  00 09 00 a8 00 a8 61 66  70 6d 00 00 00 00 00 03  |......afpm......|
00000ec0  00 18 00 39 00 59 00 75  00 95 00 9e 07 49 4c 31  |...9.Y.u.....IL1|
00000ed0  20 33 72 64 00 00 00 00  00 00 00 00 00 00 00 00  | 3rd............|
00000ee0  00 00 00 00 00 00 00 00  00 00 00 00 00 0f 77 61  |..............wa|
00000ef0  6c 74 d5 73 20 43 6f 6d  70 75 74 65 72 00 00 00  |lt.s Computer...|
00000f00  00 00 00 00 00 00 00 00  00 00 00 00 00 10 41 54  |..............AT|
00000f10  54 4f 20 52 41 49 44 30  20 47 72 6f 75 70 00 00  |TO RAID0 Group..|
00000f20  00 00 00 00 00 00 00 00  00 07 77 73 68 69 72 65  |..........wshire|
00000f30  73 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |s...............|
00000f40  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000f50  00 00 00 00 00 00 00 00  00 00 00 00 ff ff 00 00  |................|
00000f60  00 00 00 00 00 00 00 10  41 54 54 4f 20 52 41 49  |........ATTO RAI|
00000f70  44 30 20 47 72 6f 75 70  00 00 00 00 00 00 00 2b  |D0 Group.......+|
00000f80  00 00 00 01 00 00 00 03  00 00 00 03 54 55 54 4f  |............TUTO|
00000f90  52 49 41 4c 00 44 61 6e  63 65 20 53 68 6f 74 73  |RIAL.Dance Shots|
00000fa0  00 49 6e 74 72 6f 2e 6d  6f 76 00 00 00 00 00 00  |.Intro.mov......|

From the header we can see a remnant of the original KeyGrip software, and later in the file we find some references to files in the Mac HFS path format, which uses a colon instead of a slash. These are the paths to each of the MOV files used in the project. This file is from the tutorial disk of Final Cut Pro version 1.2, so let’s take a look at the last version released, version 7.
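Classic Mac OS paths like the one in the dump use the volume name as the first component and a colon as the separator. As a small illustration, the sketch below splits such a path and maps it onto a POSIX-style path. The `/Volumes` mount root and the helper name `hfs_to_posix` are assumptions for the example, and it only handles absolute paths that begin with a volume name.

```python
from pathlib import PurePosixPath


def hfs_to_posix(hfs_path: str, mount_root: str = "/Volumes") -> PurePosixPath:
    """Map an absolute colon-separated HFS path to a POSIX-style path."""
    volume, *parts = hfs_path.split(":")
    if not volume:
        raise ValueError("relative HFS paths (leading colon) are not handled")
    # "/" was legal inside classic Mac OS names, so it cannot stay a slash.
    names = [name.replace("/", ":") for name in [volume, *parts] if name]
    return PurePosixPath(mount_root, *names)
```

Applied to the string in the dump, `hfs_to_posix("ATTO RAID0 Group:TUTORIAL:Dance Shots:Intro.mov")` gives `/Volumes/ATTO RAID0 Group/TUTORIAL/Dance Shots/Intro.mov`, which is a convenient form for checking whether the referenced media came along with the project.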

hexdump -C "Lesson 1 Project.fcp" | head
00000000  a2 4b 65 79 47 0a 0d 0a  01 de 00 00 00 20 08 92  |.KeyG........ ..|
00000010  66 c4 28 d7 11 8a e5 00  30 65 ec fe 98 03 00 00  |f.(.....0e......|
00000020  00 00 00 00 00 00 00 00  00 01 00 00 00 00 01 15  |................|
00000030  00 00 00 07 73 75 62 74  79 70 65 01 00 00 00 01  |....subtype.....|
00000040  03 00 00 00 00 06 4e 4f  55 4e 44 4f 00 00 00 00  |......NOUNDO....|
00000050  01 01 00 00 00 00 00 00  00 00 00 00 00 07 52 55  |..............RU|
00000060  4e 54 49 4d 45 00 00 00  00 01 01 00 00 00 00 00  |NTIME...........|
00000070  01 00 00 00 07 76 69 65  77 65 72 73 00 00 00 00  |.....viewers....|
00000080  01 01 00 00 00 00 00 00  00 00 00 00 00 00 00 08  |................|
00000090  63 68 69 6c 64 72 65 6e  00 00 00 00 01 01 01 00  |children........|

Almost identical to the first version, which is helpful for identification, but if we need to identify based on version, it might prove a little more difficult. All the samples I have, and all those I have seen references to, begin with the same 5 bytes, A24B657947, or 0xA2 followed by “KeyG”. It’s hard to know which other bytes might have something to do with versions of the file format. More samples could tell us, but from what I have, the 20 bytes starting from offset 12 seem to be consistent among the different version samples. For now the 5 bytes at the beginning of the file should suffice for identification.
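A check for those leading bytes takes only a few lines. In the sketch below the optional `strict` flag also compares bytes 5 to 7, which are `0a 0d 0a` in both dumps above; that extra test is an observation from these two samples only, and the function names are invented for this example.

```python
FCP_MAGIC = bytes.fromhex("a24b657947")  # 0xA2 followed by ASCII "KeyG"
FCP_NEWLINES = bytes.fromhex("0a0d0a")   # bytes 5-7 in both dumps above


def looks_like_fcp(header: bytes, strict: bool = False) -> bool:
    """Return True if the header starts with the legacy FCP project magic."""
    if not header.startswith(FCP_MAGIC):
        return False
    return header[5:8] == FCP_NEWLINES if strict else True


def read_header(path: str, length: int = 16) -> bytes:
    """Read the first `length` bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(length)
```

Something like `looks_like_fcp(read_header("Swing.fcp"))` would then flag a legacy project without needing to parse anything beyond the first few bytes.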

When Final Cut Pro went through a complete re-write in 2011, the FCP format was abandoned. Not only made obsolete, but completely unsupported. The new Final Cut Pro X software was not able to support this now obsolete format. The new format followed the pattern of many other Apple formats of using a folder identified through an extension as a single file. Called a bundle format, Final Cut Pro X used the extension, .FCPBUNDLE. This bundle could include the media assets along with project settings/thumbnails and clips. Because of this “bundle” format, identification would have to be done at the individual file level inside the bundle. This would include formats with extensions such as .flexolibrary and .fcpevent, which appear to be SQLite databases. This complex format makes preservation of this type of object difficult with current methods and practices.
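Since identification has to happen file by file inside the bundle, one quick way to test the SQLite hunch is to look for the well-known 16-byte SQLite 3 header string at the start of each file. The sketch below walks a bundle folder and lists the files that carry it. The function name is made up for this example, and nothing here is specific to Final Cut Pro beyond treating the bundle as an ordinary directory.

```python
from pathlib import Path

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of a SQLite 3 database


def sqlite_files_in_bundle(bundle: Path) -> list:
    """Return paths (relative to the bundle) of files with a SQLite header."""
    hits = []
    for path in sorted(bundle.rglob("*")):
        if path.is_file():
            with path.open("rb") as handle:
                if handle.read(16) == SQLITE_MAGIC:
                    hits.append(path.relative_to(bundle))
    return hits
```

Running this over a bundle would show at a glance which of the .flexolibrary and .fcpevent files, if any, are really databases that a standard SQLite tool could open.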

Luckily Apple didn’t leave Final Cut Pro users completely unable to migrate their content. Final Cut Pro could export the project as an XML file. This format is called Final Cut Pro XML Interchange Format and was well documented. The format was not made to bridge the gap from Final Cut Pro to Final Cut Pro X, but rather make the project file more useful outside of Final Cut Pro. Final Cut Pro X actually can’t open these files either, which is why a third party developer came in and developed 7toX (SendtoX) to allow for projects to be converted to a newer XML format.

Let’s take a look at the basic Final Cut Pro XML Interchange Format, which uses a standard XML extension:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xmeml>
<xmeml version="5">
<sequence id="Sequence 1 ">...</sequence>
</xmeml>

Standard XML with a Doctype/root of xmeml. Clever. A little ways into the XML we also see:

<appspecificdata>
	<appname>Final Cut Pro</appname>
	<appmanufacturer>Apple Inc.</appmanufacturer>
	<appversion>7.0</appversion>
</appspecificdata>

Final Cut Pro X also has an XML format, which is different from XMEML and uses the extension FCPXML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE fcpxml>

<fcpxml version="1.8">
    <resources>
        <format id="r1" name="FFVideoFormatDV720x480i5994" frameDuration="2002/60000s" fieldOrder="lower first" width="720" height="480" paspH="10" paspV="11" colorSpace="6-1-6 (Rec. 601 (NTSC))"/>
    </resources>
    <library location="file:///Untitled.fcpbundle/">...</library>
</fcpxml>

A different Doctype/root and structure but should be easy to identify.
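Because both flavors are plain XML, the root element is enough to tell them apart. Here is a minimal sketch using Python’s standard library parser; the labels in `KNOWN_ROOTS` and the function name are illustrative, and for very large exports it would be kinder to stop after the first start tag rather than parse the whole document as this does.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

KNOWN_ROOTS = {
    "xmeml": "Final Cut Pro XML Interchange Format",
    "fcpxml": "Final Cut Pro X FCPXML",
}


def identify_fcp_xml(xml_bytes: bytes) -> Optional[Tuple[str, Optional[str]]]:
    """Return (label, version attribute) if the root is xmeml or fcpxml."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return None  # not well-formed XML
    label = KNOWN_ROOTS.get(root.tag)
    if label is None:
        return None
    return label, root.get("version")
```

Fed the two snippets above, this would report version 5 for the XMEML file and version 1.8 for the FCPXML file, taken straight from the `version` attribute on the root element.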

The preservation of project files, according to some, is not necessary since they are not the finalized product. Preserving the finalized output would be preferable, as it can be managed more easily and represents the final render of a project. But identification of the Final Cut Pro project and all its assets gives the option to access a collection more accurately. I was able to create a signature for the FCP, XML, and FCPXML formats. Take a look on my GitHub for the signatures and some test files.