Receiving electronic media from an outside source can be an adventure. Often times you find yourself sorting the valuable files and separating them from the chaff. There can be hidden files, cache files, application files, drivers, and everything in between. Determining what formats are important can sometimes be difficult, especially if you don’t know the file format of some of the files.
I was recently working on a collection of files which had been produced through some audio software. When working with audio, a WAVE file is what is usually kept as they contain the actual audio data. With these files they came with a couple other formats. One of those formats was a bunch of SFK peak files. These files are meant to be temporary as they are generated from the WAVE file to make opening of audio data faster. They are important, but can easily be regenerated. One could argue they have historical value, but also they don’t contain anything that can be used by itself, so alone they don’t have much value.
The other format found with the WAVE files have a CDP extension. These came up as unknown when using DROID. It is not a common extension so finding the name of the software which created the files wasn’t too hard. Let’s take a look at one of them.
hexdump -C tutor1.cdp | head
00000000 52 49 46 46 79 03 00 00 53 46 50 4a 66 6d 74 20 |RIFFy...SFPJfmt |
00000010 18 00 00 00 00 00 01 00 02 00 00 00 10 00 00 00 |................|
00000020 44 ac 00 00 03 00 00 00 01 00 00 00 4c 49 53 54 |D...........LIST|
00000030 88 00 00 00 66 6c 73 74 66 69 6c 65 23 00 00 00 |....flstfile#...|
00000040 44 3a 5c 53 6f 75 6e 64 73 5c 4e 65 77 20 54 75 |D:\Sounds\New Tu|
00000050 74 6f 72 20 66 69 6c 65 73 5c 53 6f 6e 67 33 2e |tor files\Song3.|
00000060 77 61 76 00 66 69 6c 65 23 00 00 00 44 3a 5c 53 |wav.file#...D:\S|
00000070 6f 75 6e 64 73 5c 4e 65 77 20 54 75 74 6f 72 20 |ounds\New Tutor |
00000080 66 69 6c 65 73 5c 53 6f 6e 67 32 2e 77 61 76 00 |files\Song2.wav.|
00000090 66 69 6c 65 23 00 00 00 44 3a 5c 53 6f 75 6e 64 |file#...D:\Sound|
Huh, this is a RIFF file. RIFF is most commonly used as the container used for WAVE and AVI files. You can read more about the RIFF format on a previous post. The RIFF container format can be used for all sorts of things. Looking at the internals we can see a few unique list chunk’s.
Lots of references to other files, specifically WAVE files. But not a lot of actual data. That is because this format turns out to be just a project format for some software called “CD Architect“. Sonic Foundry was an audio software developer for a few years before they sold their catalog to Sony in 2003. In looking at the manual for CD Architect version 5.2, it explains the CDP Project format.
CD Architect software handles the organization of your CD using a small project file (CDP) that saves information about source file locations, edits, cuts, and insertion points. This project file is not a multimedia file, but is instead used to create the CD when editing is finished.
Looking at another CDP file from the collection, I noticed something different.
hexdump -C CDArch50a-s01.cdp | head
00000000 72 69 66 66 2e 91 cf 11 a5 d6 28 db 04 c1 00 00 |riff......(.....|
00000010 20 0a 00 00 00 00 00 00 84 38 15 b3 da 08 85 44 | ........8.....D|
00000020 b2 2a 5b 70 a1 32 15 ff 5a 2d 8f b2 0f 23 d2 11 |.*[p.2..Z-...#..|
00000030 86 af 00 c0 4f 8e db 8a 00 02 00 00 00 00 00 00 |....O...........|
00000040 78 00 00 00 00 00 04 00 11 00 00 00 44 ac 00 00 |x...........D...|
00000050 00 00 00 00 00 c0 52 40 00 00 00 00 00 00 5e 40 |......R@......^@|
00000060 00 00 00 00 00 00 00 00 04 00 04 00 40 00 00 00 |............@...|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 7c 00 00 00 |............|...|
00000080 50 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 |P...............|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
That’s odd, the RIFF format is always uppercase ASCII, this is lowercase. Also the important RIFF form, which was “SFPJ” in the other sample, is missing. This is not a valid RIFF format.
But further down in the file I can see the same list chunks. Did they take RIFF format and make a proprietary version of their own? I think they may have. It seems the first example was from CD Architect version 4 and these other files are from CD Architect version 5. That complicates things. Sony stopped developing CD Architect after version 5.2d and maintained it for a few years before selling many of their titles to MAGIX Software. As far as I know there was never any new versions released. The software was very popular, as it had some really nice audio mastering features and was easy to use. Many were upset when the software was abandoned.
Creating a signature for both version 4 and version 5 CDP files will be pretty straightforward. I feel knowing what you have in a collection you are processing is the first step in making informed decisions. Wether or not you keep the project files are up for debate. Some may only want the final audio created from a CD Architect project, while others may want to see the way the audio was put together and mixed. Either way, the more you know…..
One more thing. CD Architect would default to saving a CDP project file, but could also save a “CD Image file”. This process actually would save the project to a full WAVE file with some extras baked in.
An image file is essentially a wave file with volume, crossfades, effects, mixes, and track information embedded. Burning an image file will reduce the risk of buffer underruns (especially if you have a complex project or are using a slow computer) since no audio processing is required.
Interesting, normally when working with track information in a single WAVE file you would need a companion CUE Sheet in order to reference the track layout of the Audio CD. So I am curious how they do all of this. Lets take a look at a “CD Image”.
mediainfo CDArch52d-s02.wav
General
Complete name : CDArch52d-s02.wav
Format : Wave
Format settings : PcmWaveformat
File size : 5.05 MiB
Duration : 30 s 0 ms
Overall bit rate mode : Constant
Overall bit rate : 1 411 kb/s
Conformance errors : 2
RIFF : Yes
General compliance : File size 5292434 is less than expected size 5292823 (offset 0x8)
WAVE : Yes
General compliance : Element size 5292811 is more than maximal permitted size 5292422 (offset 0xC)
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 30 s 0 ms
Bit rate mode : Constant
Bit rate : 1 411.2 kb/s
Channel(s) : 2 channels
Sampling rate : 44.1 kHz
Bit depth : 16 bits
Stream size : 5.05 MiB (100%)
Already seeing some issues with the format, but all the important bits are there. JHOVE doesn’t like them much either.
JhoveView (Rel. 1.32.0, 2024-09-12)
Date: 2024-12-11 16:01:08 MST
RepresentationInformation: CDArch52d-s02.wav
ReportingModule: WAVE-hul, Rel. 1.8.3 (2024-03-05)
LastModified: 2024-12-11 15:58:02 MST
Size: 5292434
Format: WAVE
Status: Not well-formed
SignatureMatches:
WAVE-hul
InfoMessage: Ignored unrecognized list type: "pqls"
ID: WAVE-HUL-15
Offset: 5292044
ErrorMessage: Unexpected end of file: Bytes missing = 389
ID: WAVE-HUL-3
Offset: 5292434
MIMEtype: audio/vnd.wave; codec=1
Profile: PCMWAVEFORMAT
JHOVE is giving me two issues. The major error is the file appears truncated according to both MediaInfo and JHOVE. The InfoMessage which is less of an issue but more of a heads up that the WAVE file has an extra LIST type. “PQLS”, which was also in the CPD RIFF file we looked at earlier. So it seems by making a “CD Image” of a project embeds the project chunk data into the WAVE container. Identification is not an issue as these WAVE’s follow the standard pattern and therefore identify correctly, but one might want to be aware through further characterization these WAVE’s have some not so obvious extra data.
My attempts to find any samples from version 3 of CD Architect have failed. Until then, my proposal is to add version 4 & 5 to PRONOM with the signature on my Github page. There you will find a few samples as well.