SDIF

I have used and have researched a lot of audio editing software. Some are very simple and straightforward, others are feature rich and take some time to learn. While looking in a format, I came across some Audio software which nothing like I have used before. At first I was confused, I figured it would be simple to open a certain file format and play the audio. Not so fast.

Max is software which proudly says it is an, “infinitely flexible space to create your own interactive software”. Created by Cycling ’74 software, Max has been around for awhile, being developed in the mid 1980’s. It allows the user to make “patches” stringing around components and effects to accomplish an infinite amount of options and outcomes.

The software produces simple project files and patch files, but hey are just JSON data, at least in the latest version. But when working with audio files the software can save to a number of formats.

One of the options is a format called “SDIF”, which stands for “Sound Description Interchange Format“. SDIF was jointly developed by IRCAM and CNMAT, with proposals starting back in the mid-1990’s. Originally written as a Spectral Description, it was later changed to refer to a Sound Description.

The Specification states the general idea was to “store information related to signal processing and specifically of sound, in files, according to a common format to all data types. Thus, it is possible to store results or parameters of analyses, syntheses…” So not exactly the same as a simple WAVE file you can open and edit, this format was meant to store signal data for analysis.

Each SDIF file consists of a header and then an overall a succession of frames, not unlike chunks in the IFF/AIFF/RIFF formats, ordered in time. Each frame matrix declares a “Type” which can be a combination of many options. Lets take a look at a SDIF file:

hexdump -C test.sdif | head
00000000 53 44 49 46 00 00 00 08 00 00 00 03 00 00 00 01 |SDIF............|
00000010 31 54 52 43 00 00 00 20 00 00 00 00 00 00 00 00 |1TRC... ........|
00000020 00 00 00 01 00 00 00 01 31 54 52 43 00 00 00 04 |........1TRC....|
00000030 00 00 00 00 00 00 00 04 31 54 52 43 00 00 00 c0 |........1TRC....|
00000040 3f 74 7a e1 40 00 00 00 00 00 00 01 00 00 00 01 |?tz.@...........|
00000050 31 54 52 43 00 00 00 04 00 00 00 0a 00 00 00 04 |1TRC............|
00000060 3f 80 00 00 45 95 35 c3 00 00 00 00 00 00 00 00 |?...E.5.........|
00000070 40 00 00 00 46 06 e2 14 00 00 00 00 00 00 00 00 |@...F...........|
00000080 40 40 00 00 45 3b 42 3d 00 00 00 00 00 00 00 00 |@@..E;B=........|
00000090 40 80 00 00 43 5d 94 7b 00 00 00 00 00 00 00 00 |@...C].{........|

This test file has the opening frame “SDIF“, to identify it as an SDIF, then a reference to the type “1TRC. I would try and explain a Matrix 1TRC Sinusoidal Track, but I have no idea what it means. Something, something sine wave, etc. Someone much smarter than me can make use of this format. Here are a couple examples of SDIF with other frame types.

hexdump -C angry_cat.part.sdif| head
00000000 53 44 49 46 00 00 00 08 00 00 00 03 00 00 00 01 |SDIF............|
00000010 31 4e 56 54 00 00 00 88 ff ef ff ff ff ff ff ff |1NVT............|
00000020 ff ff ff fd 00 00 00 01 31 4e 56 54 00 00 03 01 |........1NVT....|
00000030 00 00 00 61 00 00 00 01 53 74 72 65 61 6d 49 44 |...a....StreamID|
00000040 09 30 0a 44 61 74 65 09 54 68 75 5f 41 75 67 5f |.0.Date.Thu_Aug_|
00000050 5f 33 5f 32 31 2e 33 32 2e 34 35 5f 32 30 30 30 |_3_21.32.45_2000|
00000060 5f 0a 54 61 62 6c 65 4e 61 6d 65 09 53 69 6e 75 |_.TableName.Sinu|
00000070 73 6f 69 64 61 6c 54 72 61 63 6b 73 0a 57 72 69 |soidalTracks.Wri|
00000080 74 74 65 6e 42 79 09 50 6d 5f 56 65 72 73 69 6f |ttenBy.Pm_Versio|
00000090 6e 5f 31 2e 32 2e 32 0a 00 00 00 00 00 00 00 00 |n_1.2.2.........|

hexdump -C cymbalum-c4.res.sdif| head
00000000 53 44 49 46 00 00 00 08 00 00 00 03 00 00 00 01 |SDIF............|
00000010 31 52 45 53 00 00 0d 20 00 00 00 00 00 00 00 00 |1RES... ........|
00000020 00 00 00 04 00 00 00 01 31 52 45 53 00 00 00 04 |........1RES....|
00000030 00 00 00 d0 00 00 00 04 42 49 27 7a 39 59 fc ab |........BI'z9Y..|
00000040 3d 35 06 c9 00 00 00 00 42 6e 68 68 39 63 99 b1 |=5......Bnhh9c..|
00000050 3e 25 f7 c0 00 00 00 00 42 c6 02 bb 39 8c 31 79 |>%......B...9.1y|
00000060 3f bb 7e 6e 00 00 00 00 43 01 82 96 3a 1d 36 44 |?.~n....C...:.6D|
00000070 3e d9 21 12 00 00 00 00 43 07 35 f0 3a 20 6f 6e |>.!.....C.5.: on|
00000080 3f 02 32 7f 00 00 00 00 43 30 84 0b 39 97 f9 1b |?.2.....C0..9...|
00000090 3e c6 43 c7 00 00 00 00 43 4d e4 e4 39 88 14 90 |>.C.....CM..9...|

Unfortunately, the common tools I use to explore AV formats don’t seem to work on this format. MediaInfo, FFProbe, Exiftool, all give me unknown file warnings. So I had to compile the SDIF software in order to get some details.

querysdif angry_cat.part.sdif 
Header info of file angry_cat.part.sdif:

Format version: 3
Types version: 1

Ascii chunks of file angry_cat.part.sdif:

1NVT
{
StreamID 0;
Date Thu_Aug__3_21.32.45_2000_;
TableName SinusoidalTracks;
WrittenBy Pm_Version_1.2.2;
}

Data in file angry_cat.part.sdif (9504872 bytes):
1933 1TRC frames in stream 0 between time 0.000000 and 5.794875 containing
1933 1TRC matrices with 45 --400 rows, 4 -- 4 columns

An interesting thing is that a SDIF file can be in text form as well.

sdiftotext test.sdif 
SDIF


SDFC

1TRC 1 1 0
1TRC 0x0004 0 4

1TRC 1 1 0.005
1TRC 0x0004 10 4
1 4774.72 0 0
2 8632.52 0 0
3 2996.14 0 0
4 221.58 0 0
5 1943.02 0 0
6 123.951 0 0
7 6705.04 0 0
8 4304.97 0 0
9 3554.29 0 0
10 23.7822 0 0

1TRC 1 1 0.01
1TRC 0x0004 10 4
1 4774.72 0.0353114 2.06098
2 8632.52 0.00442518 0.68795
3 2996.14 0.0238517 -1.42295
4 221.58 0.0089712 -2.44141
5 1943.02 0.00768914 2.64629
6 123.951 0.0397061 -0.17527
7 6705.04 0.0245643 -0.168753
8 4304.97 0.00894803 1.45553
9 3554.29 0.0265175 2.57231
10 23.7822 0.0419019 -2.17731

1TRC 1 1 0.2
1TRC 0x0004 10 4
1 2284.56 0.02781 2.47054
2 4222.62 0.0151738 1.55309
3 31.1554 0.00421461 -0.657285
4 310.99 0.0122306 1.25794
5 215.192 0.0174093 1.25468
6 6253.69 0.000894192 2.21334
7 8533.32 0.0296167 2.07209
8 8044.77 0.0423002 2.54088
9 6087.45 0.0264733 -2.05523
10 7052.7 0.0287347 0.426339

1TRC 1 1 0.205
1TRC 0x0004 10 4
1 2284.56 0 0
2 4222.62 0 0
3 31.1554 0 0
4 310.99 0 0
5 215.192 0 0
6 6253.69 0 0
7 8533.32 0 0
8 8044.77 0 0
9 6087.45 0 0
10 7052.7 0 0

1TRC 1 1 0.21
1TRC 0x0004 0 4

ENDC
ENDF

An interesting format for sure. But wait, there is more!

My initial interest in this format was when I was given access to a set of MUBU files. I was unclear on how there were created at first and it took me down a long path of learning about SDIF and the Max software from Cycling ’74 and IRCAM. MUBU turns out to be a toolbox for Max which adds more analysis features.

MUBU stands for MUlti-BUffer, which helps overcome some limitations. It is actually a container using the SDIF standard. Lets take a look.

hexdump -C test.mubu | head
00000000 53 44 49 46 00 00 00 08 00 00 00 03 00 00 00 01 |SDIF............|
00000010 31 4e 56 54 00 00 00 78 ff ef ff ff ff ff ff ff |1NVT...x........|
00000020 ff ff ff fd 00 00 00 01 31 4e 56 54 00 00 03 01 |........1NVT....|
00000030 00 00 00 53 00 00 00 01 4d 75 42 75 2e 43 6f 6e |...S....MuBu.Con|
00000040 74 61 69 6e 65 72 2e 4e 75 6d 54 72 61 63 6b 73 |tainer.NumTracks|
00000050 09 31 0a 4d 75 42 75 2e 43 6f 6e 74 61 69 6e 65 |.1.MuBu.Containe|
00000060 72 2e 56 65 72 73 69 6f 6e 09 31 2e 35 0a 4d 75 |r.Version.1.5.Mu|
00000070 42 75 2e 43 6f 6e 74 61 69 6e 65 72 2e 4e 75 6d |Bu.Container.Num|
00000080 42 75 66 66 65 72 73 09 31 0a 00 00 00 00 00 00 |Buffers.1.......|
00000090 31 4e 56 54 00 00 00 38 ff ef ff ff ff ff ff ff |1NVT...8........|

A MUBU file has the same SDIF frame header, but also include a “1NVT” frame, which is a Name Value Table. This is where the MUBU container is referenced. The MuBu file has its own structure:

If I query the MuBu file like I did the SDIF, I get the following:

querysdif test.mubu
Header info of file test.mubu:

Format version: 3
Types version: 1

Ascii chunks of file test.mubu:

1NVT
{
MuBu.Container.NumTracks 1;
MuBu.Container.Version 1.5;
MuBu.Container.NumBuffers 1;
}
1NVT
{
MuBu.Buffer.Index 0;
}
1NVT
{
MuBu.Track.MxRows 2;
AudioFile 1;
MuBu.Track.NonNumType 0;
MuBu.Track.MaxSize 93515;
meta_ISFT Lavf60.16.100;
MuBu.Track.Name mytrack;
MuBu.Track.BufferIndex 0;
MuBu.Track.SampleRate 48000;
FileName Wilhelm_Scream.wav;
MuBu.Track.MxVarRows 0;
MuBu.Track.MxCols 1;
meta_MetaDataSource WAV;
MuBu.Track.EndTime 1623.5;
FilePath /;
MuBu.Track.SampleOffset 0;
MuBu.Track.TimeTags 0;
MuBu.Track.Size 77929;
MuBu.Track.Index 0;
}

1TYP
{
1MTD M000 {unnamed}
1FTD M000
{
M000 Track-0-MatrixData;
}
}

Data in file test.mubu (3741392 bytes):
77929 M000 frames in stream 0 between time 0.000000 and 1.623500 containing
77929 M000 matrices with 2 -- 2 rows, 1 -- 1 columns

The MuBu file contains one audio track and one buffer. This is a simple test file, but MuBu files can be quite large with multiple tracks.

Working with the Max software or OpenMusic is not something I found to be easy to understand. I am sure if I was more musically inclined and with a little practice I could make some of this work. For the time being, a signature to identify a SDIF and MUBU will have to do. Check out the GitHub for my proposed signature and a couple examples.

Shorten

I was recently going through some of my old CD-R’s and came across this 11 year old fun memory.

I remember going to this 2003 Toad the Wet Sprocket concert in Salt Lake City with some friends, I had seen this band perform before, but this was the first time I was able to get a recording of the show. Normally having a recording of a concert of a well known band was a little shady, but for some bands, they not only allow recording of their live concerts, but they encourage it. There has been a few bands over the years who have this philosophy, one most have heard of is the Grateful Dead, because of all the tape trading, the band’s numerous concerts will live on forever.

The scene of recording concerts is still alive and well, and if you are into recording and sharing it is expected you share in a lossless audio format. The world of lossless audio is definitely in the minority of all those who listen to music on the daily. Most of us have been placated with the infinite playlists on services like Apple Music, Spotify, and Amazon Music. Most probably don’t care about owning music anymore, but for the few who consider themselves Audiophiles, having a lossless audio file is the only choice.

When it comes to formats, there are a few lossless formats to choose from, they all come with some advantages as well as some downsides. WAV files contain the full PCM audio stream, and while internet bandwidth today can handle full uncompressed audio, it can still be beneficial to use some compression for archiving or sharing over the web.

The most common lossless format today is the Free Lossless Audio Codec or FLAC, but there are also quite a few who like the Apple Lossless Audio Codec. Both offer many advantages, especially with metadata, cuesheets, and can contain cover album art. But many years ago another lossless format was most often used with bootleg recordings and audio sharing.

Shorten was one of the first lossless formats, developed by Tony Robinson in 1993 for SoftSound. It could cut the size in half of a typical 16-bit WAV file. It achieved this by using Huffman coding, kinda the same way a JPEG works, by reducing the frequency of how often patterns occur. Today FLAC and ALAC have replaced this format and offer improved features and support. Many audio players have dropped support for shorten making it difficult to use this old format.

The Shorten format uses the .SHN extension. It is one of the formats listed on the Library of Congress Sustainability of Digital Formats with the ID fdd000199, although a couple links don’t appear to work as it hasn’t been updated since 2011. Support was ended for this format and many of the links found on various websites are for broken, usually referencing the etree wiki. Much of which is archived on the Internet Archive.

Let’s take a look at the what makes up a lossless compressed SHN file. A quick look at a sample header:

hexdump -C test.shn | head
00000000  61 6a 6b 67 02 fb b1 70  09 f9 25 59 52 a4 d1 a8  |ajkg...p..%YR...|
00000010  dd cf 85 5a 01 57 a0 d5  a8 b6 6b 6d d2 41 10 80  |...Z.W....km.A..|
00000020  40 20 10 18 04 0a 01 44  d6 40 20 11 0d 8c 0a 01  |@ .....D.@ .....|
00000030  04 80 44 20 16 4b 0d d2  c3 b8 f8 55 a0 11 80 59  |..D .K.....U...Y|
00000040  98 56 1d b1 79 51 9f 39  f1 12 d2 d3 75 5c cd 08  |.V..yQ.9....u\..|
00000050  06 25 68 6b 52 5e 9f 4c  39 cd c1 32 c4 0d a9 b7  |.%hkR^.L9..2....|
00000060  69 34 56 f0 96 fa 46 89  a2 6e 8c ba d5 d0 58 de  |i4V...F..n....X.|
00000070  f5 44 5b aa 61 82 c7 85  88 37 d6 ee cb ab 4e 44  |.D[.a....7....ND|
00000080  91 19 b7 38 d4 20 ae 98  98 d1 2c 4a 4e 88 dd 3e  |...8. ....,JN..>|
00000090  36 68 1b 59 a8 7d 84 23  76 0a 84 21 a1 cd 80 8e  |6h.Y.}.#v..!....|

The first four bytes seem to be consistent among my samples. It makes me wonder if the ascii values have something to do with the author, Anthony (Tony) J. Robinson. In the source code for the shorten software, the file shorten.h defines the ascii “ajkg” as the magic header for the SHN format. Also found in current ffmpeg code. Although the tools don’t have much to say about them.

mediainfo test.shn 
General
Complete name                            : test.shn
Format                                   : Shorten
Format version                           : 2
File size                                : 3.17 MiB

Audio
Format                                   : Shorten
Compression mode                         : Lossless

ffprobe -i test.shn
Input #0, shn, from 'test.shn':
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0: Audio: shorten, 44100 Hz, 2 channels, s16p

Using the older SHNTOOL, we can get more information.

shntool info test.shn  
-------------------------------------------------------------------------------
File name:                    test.shn
Handled by:                   shn format module
Length:                       0:32.23
WAVE format:                  0x0001 (Microsoft PCM)
Channels:                     2
Bits/sample:                  16
Samples/sec:                  44100
Average bytes/sec:            176400
Rate (calculated):            176400
Block align:                  4
Header size:                  44 bytes
Data size:                    5697720 bytes
Chunk size:                   5697756 bytes
Total size (chunk size + 8):  5697764 bytes
Actual file size:             3325489
File is compressed:           yes
Compression ratio:            0.5836
CD-quality properties:
  CD quality:                 yes
  Cut on sector boundary:     no
  Sector misalignment:        1176 bytes
  Long enough to be burned:   yes
WAVE properties:
  Non-canonical header:       no
  Extra RIFF chunks:          no
Possible problems:
  File contains ID3v2 tag:    no
  Data chunk block-aligned:   yes
  Inconsistent header:        no
  File probably truncated:    unknown
  Junk appended to file:      unknown
  Odd data size has pad byte: n/a
Extra shn-specific info:
  Seekable:                   yes

Many Shorten Audio Files are found out there in archives and file sharing sites, so even though the format isn’t used to create new files, it will still be around for awhile. My GitHub has my signature proposal and a couple of samples.