Not to be confused with Fantasia, a magical screen recording tool has been around for many years. Books have been written on the use of this software to instruct others on how to teach and demonstrate other software and ideas.
Unlike Fantasia, the screen recording software Camtasia was not made by Disney, but does contain some proprietary data. Camtasia is a screen recording software by the developer TechSmith. First released in 2002, it was available first for Windows and much later, Macintosh.
The first versions of Camtasia would encode screen recordings in an AVI container, using the TSCC codec. The TSCC codec, aka TechSmith Screen Capture Codec, was developed by TechSmith and the codec was distributed freely. Let’s see what MediaInfo knows about it.
mediainfo Camtasia1-s01.avi General Complete name : Camtasia1-s01.avi Format : AVI Format/Info : Audio Video Interleave Format settings : BitmapInfoHeader File size : 1.66 MiB Duration : 2 s 333 ms Overall bit rate : 5 966 kb/s Frame rate : 15.000 FPS
Video ID : 0 Format : TechSmith Codec ID : tscc Codec ID/Info : TechSmith Screen Capture Duration : 2 s 333 ms Bit rate : 87.3 kb/s Width : 320 pixels Height : 240 pixels Display aspect ratio : 4:3 Frame rate : 15.000 FPS Bit depth : 8 bits Bits/(Pixel*Frame) : 0.076 Stream size : 24.9 KiB (1%)
The AVI video format was the default recording format for the first couple versions. In version 3 the default format changed to the proprietary CAMREC format.
Camrec video files are a proprietary TechSmith file format that is used to store multiple files and information in a single package. Overall, .camrec files store your screen and camera recording plus some meta data about the various streams. However, it is important to note that you cannot view or play .camrec files outside of Camtasia Studio.
The CAMREC video format isn’t entirely proprietary and uses a common container.
Scanning the drive for archives: 1 file, 4696576 bytes (4587 KiB)
Path = Camtasia3-s01.camrec Type = Compound ERRORS: Unexpected end of archive Physical Size = 4698112 Extension = compound Cluster Size = 4096 Sector Size = 64
Date Time Attr Size Compressed Name ------------------- ----- ------------ ------------ ------------------------ ..... 3912 3968 manifest.camxml ..... 4672000 4673536 Screen_Stream.avi ------------------- ----- ------------ ------------ ------------------------ 4675912 4677504 2 files
The CAMREC file might be unknown to most video players, but the AVI within the compound object is the same as the versions before it. Camtasia even has a built in extractor if you really need to pull the AVI out of the format.
Each CAMREC file contains a manifest.camxml. They seem to be UTF-16 XML files, with and without the XML declaration. The Screen_Steam.avi file seems to be in all my samples, but not clear if there can be a variant without an AVI file.
This CAMREC container was used in the Camtasia Studio software until version 8.4 when the default was changed to a new Codec, based on MPEG4, with the TREC extension.
mediainfo capture-1.trec General Complete name : capture-1.trec Format : MPEG-4 Format profile : Base Media / Version 2 Codec ID : mp42 (mp42/isom) File size : 277 KiB Duration : 3 s 41 ms Overall bit rate mode : Variable Overall bit rate : 746 kb/s Frame rate : 19.091 FPS Encoded date : 2025-02-11 03:48:25 UTC Tagged date : 2025-02-11 03:48:34 UTC FileExtension_Invalid : braw mov mp4 m4v m4a m4b m4p m4r 3ga 3gpa 3gpp 3gp 3gpp2 3g2 k3g jpm jpx mqv ismv isma ismt f4a f4b f4v
Video ID : 1 Format : tsc2-D0 Codec ID : tsc2-D0 Duration : 2 s 933 ms Bit rate : 495 kb/s Width : 924 pixels Height : 696 pixels Display aspect ratio : 4:3 Frame rate mode : Variable Frame rate : 19.091 FPS Minimum frame rate : 10.000 FPS Maximum frame rate : 30.000 FPS Bits/(Pixel*Frame) : 0.040 Stream size : 177 KiB (64%) Title : 100 Encoded date : 2025-02-11 03:48:25 UTC Tagged date : 2025-02-11 03:48:34 UTC
TechSmith Recording File (TREC) files will identify as an MP4 in most identification tools, you will need MediaInfo or other tools to understand the codec used. If we look at the header of the MP4 TREC file:
We see the standard header for an MP4 file. The codec specific to the Camtasia software is identified later in the file, but identification using a PRONOM signature might be challenging. In looking at the hex of the file, near the end, you can find embedded PNG’s and other data. VLC and FFMPEG can read the codec, but players like Quicktime struggle.
A promising section near the end shows the name and version of Camtasia Studio. More data needed.
Camtasia also uses a lot of Project files to managing the video editing process of your screen recordings. The project files can vary between the Windows and Macintosh versions.
The older versions of Camtasia for Windows up until version 8.4, used the CAMPROJ extension for their projects. These are in XML and simply use “<Project_Data>” for the root element. With Version 8 having a later element “<CSMLData>” to manage the assets. Other projects also have a File element that begins with either “tscrec4://” or “TSCRec://”. But it may be best to identify the older versions with the “<ClipBin_Array>” element.
For Mac version 2, they used CMPROJ for the Project, but also it was an Apple Bundle/Package file. It also used a recording file with the extension CMREC, but is also Apple Bundle/Package file which contains MOV and DAT files.
The most recent versions of Camtasia for Mac and windows use the TSCPROJ extension. They are plan text files with some resemblance of JSON.
There are a few formats related to Camtasia, but the CAMREC format is the one that shows up the most in my work. So today I am only proposing a signature for CAMREC and the CAMPROJ formats. We will have to have some discussion on the TREC format to determine if standard MPEG-4 identification is fine or if the format needs its own PUID. You can find some examples and my proposed signature on my Github page.
Microsoft is never in short supply of file formats. They have made many changes over the years. Introduced lots of products, some lasting longer than others. The list is quite long.
One such software was called Office Binder. Introduced with Office 95, it was a companion application to combine a number of OLE objects together in one “Binder”. Meant to be the digital version of an Office Binder one often uses for presentations or proposals.
You could add sections and include Word documents, Images, Powerpoint, Excel spreadsheets, basically any OLE object. Of course a Binder file itself was an OLE compound object. They had the extension OBD, and templates used OBT. The PRONOM registry has PUID’s for the different Binder versions, but there are some issues.
filename : 'Binder95-s01.obd' filesize : 5120 modified : 2024-08-08T21:24:34-06:00 errors : matches : - ns : 'pronom' id : 'fmt/240' format : 'Microsoft Office Binder File for Windows' version : '97-2000' mime : class : basis : 'extension match obd; container name Binder with name only'
Turns out only one of the PRONOM PUID’s has an actual signature, the others are placeholders. So when I run Siegfried on an Office Binder 95 file, it comes back as fmt/240 which points to an Office Binder 97-2000 file. It’s a simple signature, looking for an internal file named “Binder”, which is inherent of all the Binder file types.
<ContainerSignature Id="5500" ContainerType="OLE2"> <Description>Microsoft Office Binder File for Windows 97-2000</Description> <Files> <File> <Path>Binder</Path> </File> </Files> </ContainerSignature>
Taking a look inside the Office 95 Binder file, we can see the “Binder” file.
It looks like version 97 and 2000 have an extra file. The “HdrFtr” file seems to reference a Header and Footer, which according to documentation was a feature added in Office 97.
What’s new in Office Binder 97
Office Binder makes it possible for you to group all your documents, workbooks, and presentations for a project in one place. To get started with Office Binder 97, add a new or existing document to your binder. Use the new Office 97 features while you work in a binder……. Print headers and footers for a binder
We can use the “HdrFtr” file within the container to differentiate between the 95 version and 97-2000 formats. Perhaps, a closer look at the DocumentSummaryInformation file in the future, might help with a more precise identification later. There doesn’t seem to be anything to distinguish an OBD file from a OBT template file, so those PUID’s may not be needed. The other format related to the Binder software has the OBZ extension. It is called a Wizard template file in some documentation, but I have been unable to find any type of “Wizard” functionality in the Office Binder Apps to generate a file. The OBZ format seems to have something to do with macros in Visual Basic. Luckily there are a few examples available on Office install disc‘s.
Sure enough, the OBZ file has a Visual Basic macro (VBA_Project). Unfortunately, it appears to be nested in an additional folder within the container, with a variable number number which is likely to change from file to file. That fact will make identification in PRONOM much more difficult, as the signatures are not designed for variable names. Possibly something we can investigate later.
Microsoft Binder was only released in Office 95, 97, and 2000, but was supported in Office XP and 2003 through an UNBIND.EXE application which would simply separate all the different objects back out to the individual files.
The Microsoft Office Binder is not included in Office 2003. However, if a Binder file created in a previous version of Office contains information you want to access, you can use the Unbind tool to pull out the information and save it in the formats of the appropriate programs. In order to do this procedure, the Unbind tool must be installed.
As always, you can look at some sample files and my proposal for updated signatures on my GitHub page.
Researching file formats isn’t for everyone. Others may find it boring or even odd. Trying to explain to others the nuances of a binary format versus a container format would bring many tears. Their reactions sometimes are similar to hearing someone explain their belief in aliens. Passionate, but a bit on the crazy side.
So with aliens and containers on my mind, let’s take a look at a format with the extension UFO. It is not an unidentified flying object or a UAP, it may as well be an unidentified file object, but in this case it is a “Ulead File for Objects” format. It is the exclusive file format for use with the PhotoImpact software from Ulead Systems, a Taiwanese developer known for many popular software programs. First released in 1996 with version 3, the PhotoImpact software was marketed as “a fully object-based tool, which pioneered a number of important innovations“.
The reason it was a considered a full object-based tool was the UFO format is based on the, at the time, popular OLE Compound File Storage object format developed by Microsoft. So by using some OLE tools we can take a closer look at some of these Unidentified File Objects……..
oleid Sample.ufo oleid 0.60.1 - http://decalage.info/oletools THIS IS WORK IN PROGRESS - Check updates regularly! Please report any issue at https://github.com/decalage2/oletools/issues
Filename: Sample.ufo --------------------+--------------------+----------+-------------------------- Indicator |Value |Risk |Description --------------------+--------------------+----------+-------------------------- File format |Generic OLE file / |info |Unrecognized OLE file. |Compound File | |Root CLSID: - None |(unknown format) | | --------------------+--------------------+----------+-------------------------- Container format |OLE |info |Container type --------------------+--------------------+----------+-------------------------- Encrypted |False |none |The file is not encrypted --------------------+--------------------+----------+-------------------------- VBA Macros |No |none |This file does not contain | | |VBA macros. --------------------+--------------------+----------+-------------------------- XLM Macros |No |none |This file does not contain | | |Excel 4/XLM macros. --------------------+--------------------+----------+-------------------------- External |0 |none |External relationships Relationships | | |such as remote templates, | | |remote OLE objects, etc --------------------+--------------------+----------+--------------------------
Well, it is a OLE file, but is unrecognized/unidentified by the oletools software. It also appears to be missing the root entry and CLSID you commonly find in OLE files. Since this is an OLE container we can also just use 7zip to peek inside as well.
In this sample file, we have a bunch of directories and objects, but none of what we expect to see in an OLE file, such as a “SummaryInformation” or “DocumentSummaryInformation” like we would see in a Word DOC file. By not having the standard contents of the container, it makes these files very specific to PhotoImpact software.
Here is another UFO file from the last version of the software PhotoImpact X3 when it was owned by Corel, but phased out in 2009. This is the basic file structure with no objects added to the file. We can be fairly confident these are the base files in most every other UFO file. It doesn’t have any of the “OS” folders which contain the objects, so I think the LtfHeader file might be our best bet for a signature. Let’s take a look at the Hex values for a few of them.
Making a signature using the first 8 bytes of the LtfHeader file appears to have worked for all the 3,400+ sample files I have collected. Problem is it also worked for another extension found in the some of the later versions of PhotoImpact.
When you have successfully finished your template, make sure to save it in the Ulead File For Photo Project format (*.UFP). This allows you to open and use your template in the Photo Projects dialog box. In the Template tab, click Open Project and browse for the created file.
They appear to be a template version for the format so we should be fine just adding the extension to the same signature.
Well, this Unidentified File Object is no longer unidentifiable. Was it sent by aliens? Possibly, but at least we know where these UFO’s came from, PhotoImpact. Take a look at the samples and proposed signature in my GitHub.
The Digital Preservation Coalition recently released their tech watch report on Preserving Geospatial Data. This adds to reports on CAD, Construction, and others. One of the many areas of difficulties in Digital Preservation is understanding these areas of GIS, CAD, and 3D Modeling software and the file formats which belong to the software titles in this space. Not only are the file formats plentiful but the software is extensive and expensive. Documentation is lacking in understanding the different file formats associated with each software title. These tech watch reports are super useful, but more is needed to enhance the tools we use to better identify, validate, and transform these formats in order to preserve them long term.
I was processing some data sets from a recent collection added to our Scholarly repository and came across some models in the SolidWorks part format. I was surprised to find that this format has been around since 1995 and has yet to be added to the PRONOM registry.
SolidWorks is mechanical design software used for making 3D models which can be made to be individual parts, part of larger assemblies and added to drawings giving engineers access to 3D deisgn on their desktops. Bought by Dassault Systèmes in 1997, they are the makers of the CATIA CAD software. Since 1995 a new version was released almost every year, adding new features and improvements to the format. The original versions made use of the Microsoft OLE object container, but in 2015 the format shifted to a proprietary binary format. Let’s take a look at some samples.
There are three types of SolidWorks file formats, the SolidWork part (sldprt), the assembly (sldasm), and drawing (slddrw). The first versions of SolidWorks used prt, asm, and drw, but quickly added “sld” to avoid confusion with other CAD tools.
We can see this file is a compound (OLE) container file. It’s very useful to have a directory within the container with a version number. With this version number we can use the chart on the file format wiki to see this file was last modified by SolidWorks 97 Plus. The problem comes in when we look at an assembly file and compare.
Almost the same contents, the same version directory. The only difference in content is the file Defaults in the Contents directory. But hard to know if all have the same difference. We will have to look closer at the individual files to hopefully find what sets the different formats apart.
The SolidWorks 2000 format added additional files to the container which can help.
Starting in 2015 the format changed from an OLE container, to a binary file. Here is what the first few bytes look like from a 2015 file and a later 2023 file:
The newer version of the format is much different and is in a proprietary binary format with no specifications, which makes it much more difficult to know which parts of the file can be used for identification. All these new formats have the hex values “00 00 00 04” as bytes 4 through 7. Not very unique for identification. There is another set of bytes which does seem to be consistent for all samples so far, but they vary in their location. The values “34 f6 e6 47 56 e6 47 37 f2” seem to be in every sample. The 10th byte often has the value 34, but in many samples either has 34, B4, 44, 64, or 33. The other formats, SLDASM and SLDDRW also have this pattern which might give us enough to make a good signature. At this time we may not be able to distinguish the different formats, but maybe in the future.
More work is needed to really develop signatures that can identify each format from SolidWorks definitely. My initial assumptions we not completely correct and there are a few exceptions to the patterns I felt were good enough. One unknown is the formats from SolidWorks 95 through 99 and properly identifying them. More samples are needed. I have placed my initial signature and some samples on my GitHub. Please get in tough if you have additional samples or ideas on better identification.
I had access to my first Macintosh computer around 1987. My father brought it home and I spent hours on it playing games and occasionally writing reports for school. The Macintosh Plus computer had one floppy drive and no hard drive. I remember playing the game Orbiter which had two floppy disks and right in the middle of game play it would pause and ask me to insert disk 2, then quickly ask for disk 1 again. The struggle was real. I spent years using many different Macintosh computers and now own more than I wish to admit. I’m preserving them!
The wild world of digital preservation has been a little lacking on the Macintosh side of things as I have come to realize. There still not a great way to manage Resource Forks in many preservation systems and the identification tools are mainly focused on the data bytetreams and not any system specific attributes Macintosh used often.
The PRONOM registry has either referenced early Macintosh specific formats or missed them entirely so I have been slowly working on a few to close that gap.
Interestingly enough, many Microsoft programs initially made their GUI debuts on the early Macintosh before making their way to Windows. Excel is one I am working on, as Version 1 is not identifiable in PRONOM, it was Macintosh only at the time.
Another is PowerPoint, I recently submitted two new signatures to PRONOM.
fmt/1747: Microsoft PowerPoint Presentation v2.x. Full entry added.
fmt/1748: Microsoft PowerPoint Presentation v3.x. Full entry added.
fmt/1866: Microsoft Powerpoint for Macintosh v.2. Full entry added.
fmt/1867: Microsoft Powerpoint for Macintosh v.3. Full entry added.
PowerPoint was initially released in 1987 on the Macintosh platform. It was developed by a company called ForeThought. Version 1.0 on the Macintosh was under this name, until it was bought by Microsoft only three months after being released. The history of PowerPoint can be discovered at Robert Gaskins, one of the original developers, website and book he wrote. The available information provided by Microsoft is only for the OLE format, covering versions 4.0 until 2003.
So, lets take a look at the Powerpoint original file format, before OLE.
Type/Creator RF DF Date Filename
f SLDS/PPNT 0 932 Oct 10 19:10 PowerPoint-v1
Luckily the early PowerPoint files did not have a Resource Fork. The Data Fork, if you haven’t noticed, has an interesting set of hex values at the beginning of the file. 0BADDEED is the first 4 bytes. If we look at a PowerPoint version 2 file from Windows.
The file format is the same, but because of the weird world of endianness, the first few bytes are in reverse order, EDDEAD0B.
Obviously we need to discuss this magic number and the meaning behind “Bad Deed”. This question was asked previously by the digital preservation community. I have a previous blog post about the use of words for the magic number CAFEBEEF as it was used with with JAVA class files and Express Publisher in the 1990’s. BADDEED looks like another clever use of the hex values that formed words. But was there a story behind the words? Joe Carrano asked if this string might be hexspeak. I wanted to know more so I asked some one who might know.
Robert Gaskins was kind enough to chat with me for a bit about the early days of PowerPoint.
I had a theory on the possible meaning behind BADDEED, so I asked him what the feeling was like between Apple and Microsoft at the time. I had heard for years that PowerPoint was originally created for the Macintosh, but Robert informed me:
In fact, PowerPoint was designed first for Microsoft Windows,
and its first spec shows that: “All the screen shots, menus, and
dialogs were set up to look like Microsoft Windows, not like
Macintosh.” (Gaskins, Sweating Bullets, p. 92) You can see that
Of course, we turned out to have been right all along. PowerPoint on
Mac was much loved, but sales remained poor because Mac sales were
so poor. It was only after we shipped on Windows that PowerPoint gained
the dominant market share which has characterized it ever since, and
Windows PPT outsold Mac PPT very quickly. (Gaskins, Sweating Bullets, p. 403)
So my original thought was that there was some bad feelings around this Apple, Microsoft battle which has been the sentiment for quite some time. So when I asked if any of that influenced the use of BADDEED, I was told:
So, far from being disgruntled by expanding PowerPoint to Windows,
that had been our goal all along, and its achievement was the most
important success we had.
I judge that you are fully aware of all that, and that
your question is more, “was there any bad deed signified
by the Mac hex value chosen?” No, it was just the poverty
of choice when you only have six letters.
So there you have it. The use of the hex values 0x0BADDEED, was simply chosen from a limited set of values when looking at words hexadecimal could spell. I guess I should never let the truth get in the way of a good story.
I continued to have a wonderful conversation with Robert and also asked him for some details on the rest of the PowerPoint file format. I was hoping there might be some documentation out there explaining the early format before Microsoft took over. Robert said:
I don’t know of any such documentation apart from the official
Microsoft support files available online. I don’t have any such
information. I know that Dennis Austin deposited some of our
working files at the Computer History Museum (not online):
and it’s likely that some information is there–if nothing
else, it claims to contain a source code listing for PPT 1.0
which would contain the code to read the file format.
So there might be some information in at the Computer History Museum worth looking into.
As far as I could tell from the available online information, there is a few differences between Version 1.0 and Version 2.0, the biggest being the fact that 1.0 did not have an option to print in color, amount a few other minor things. Here is a screenshot of a page from the Microsoft PowerPoint 2.0 documentation on archive.org.
I suppose with the signature additions of the Macintosh and Windows versions 2.0 and 3.0 of the PowerPoint file format in PRONOM, that should cover most needs. Currently my PowerPoint 1.0 files identify at 2.0 files, so I may need to have them adjust the PUID to include both versions 1.0 and 2.0 as they are so similar. If I am able to find a difference or get my hands on the original source code I may find a better solution.