BE DEAD

If you remember the older post about Cafe Beef, you’ll appreciate the file format we explore in this post, which uses the hex values “BE DEAD”. I guess they jinxed themselves, because the software didn’t survive a refresh in 2009 and died. At one point it was considered remarkable software, earning 4.5 Mice from Macworld Magazine in August 2002.

When a colleague reached out to me recently with a file they were not familiar with, I jumped in. I love a good challenge. The file had no extension, but was thought to have come from a Windows system. With a little digging I was able to identify the file as a Now Contact file. Now Contact has both Windows and Macintosh versions, but with no extension, my money was on the file coming from the Mac.

I started my search with the obvious, the first few bytes. Since I only had one file, I wasn’t sure if this would be helpful, but looking at the bytes, I figured it was significant.

% hexdump -C "CONTACT FILE" | head
00000000 be de ad 01 00 00 00 03 00 1d 5f a9 00 7d d1 8e |.........._..}..|
00000010 00 00 0e d8 98 89 d7 4b 00 8f 31 fd 00 00 3b f0 |.......K..1...;.|
00000020 00 de 76 56 be de ad 00 e6 02 b4 af 63 64 62 68 |..vV........cdbh|
00000030 00 00 00 00 00 00 01 8e 90 8f 56 13 00 0d 09 b4 |..........V.....|
00000040 00 0d 0b e0 00 0d 0c 50 00 0d 0b e8 00 0d 0c 58 |.......P.......X|
00000050 00 00 00 00 ba f3 fa eb 00 00 27 12 00 00 00 00 |..........'.....|
00000060 e6 02 b4 76 00 00 00 00 00 00 00 00 00 00 00 00 |...v............|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|

The first three bytes are “BE DE AD”. Spelling out BE DEAD seems to be done on purpose. A quick search on the web showed no results and no mention of this unique header. I even turned to AI, asking Grok if it knew the source of this byte sequence. It had no idea. I began digging through the file looking for clues to its software source. The ASCII text I could see indicated some sort of customer database, which the file name “CONTACT FILE” seemed to confirm. I found some dates from 2002 and started looking at popular CRM and PIM software from that time. I then found a note the user had left saying they opened the file on a different Power Tower Pro. I owned one of these clones back in college, so I immediately knew they were using a Macintosh! A quick search of popular contact management software from the early 2000s revealed a few suspects. I took a look at a product from Now Software, Now Up-to-Date & Contact version 3.9, and I found the header I was looking for! Had the file sent to me retained its extended attributes from the Mac, I would have found this software much quicker.
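Since the marker shows up more than once in the dump above, here is a minimal Python sketch for listing every offset where it occurs. The function name `find_markers` is my own, and the sample bytes are simply the first three rows of the hexdump of “CONTACT FILE”; nothing here reflects how Now Contact itself reads the file.

```python
MARKER = bytes.fromhex("bedead")  # the three bytes seen at offset 0


def find_markers(data: bytes, marker: bytes = MARKER) -> list:
    """Return every offset at which the marker occurs in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


# First 48 bytes of "CONTACT FILE", copied from the hexdump above.
sample = bytes.fromhex(
    "bedead0100000003001d5fa9007dd18e"
    "00000ed89889d74b008f31fd00003bf0"
    "00de7656bedead00e602b4af63646268"
)

print([hex(offset) for offset in find_markers(sample)])
```

On a real file you would pass in the result of `open(path, "rb").read()` instead of the embedded sample.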

Now Software has been around since 1990 and was purchased at one point by PowerOn Software. Now Up-to-Date came around in 1992, but Now Contact wasn’t added until 1994. Version 1.0 of Now Contact was standalone and was popular, but simple. When Now Software bundled it with Now Up-to-Date in 1995, they skipped version 2 to keep the two in sync.

The Now Contact software has a few functions, including a word processor, but let’s stick to the contact manager for now. Let’s take a look at a sample file from version 1.

% hexdump -C "Sample Contact File" | head 
00000000 00 00 4c 47 00 00 00 03 a8 ee f6 c6 a8 ef 9b 95 |..LG............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 12 bc 00 2b 73 74 61 74 00 00 00 01 00 00 00 04 |...+stat........|
00000110 00 00 00 fc 73 74 61 74 00 00 00 02 00 00 01 00 |....stat........|
00000120 00 00 06 44 4b 6e 44 42 00 00 00 00 00 00 07 44 |...DKnDB.......D|
00000130 00 00 04 68 4b 6e 44 42 00 00 00 01 00 00 0b ac |...hKnDB........|
00000140 00 00 06 12 4b 6e 44 42 00 00 00 02 00 00 13 ac |....KnDB........|
00000150 00 00 00 70 4b 6e 44 42 00 00 00 03 00 00 15 ac |...pKnDB........|
00000160 00 00 00 b6 4b 6e 44 42 00 00 00 05 00 00 16 64 |....KnDB.......d|

This file does not have the “BE DE AD” header, but something else. I do see a repeated pattern of the text “KnDB” which also happens to be the Type code used on the Macintosh.

% getfileinfo "Sample Contact File"
type: "KnDB"
creator: "NIC!"
attributes: avbstClinmedz
created: 10/23/1993 13:57:10
modified: 10/24/1993 11:18:58

Another sample

% hexdump -C NC1-s02 | head
00000000 00 00 36 20 00 00 00 03 e6 03 e7 1a e6 03 e7 41 |..6 ...........A|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 0c dc 00 26 73 74 61 74 00 00 00 01 00 00 00 04 |...&stat........|
00000110 00 00 00 fc 73 74 61 74 00 00 00 02 00 00 01 00 |....stat........|
00000120 00 00 06 44 4b 6e 44 42 00 00 00 00 00 00 07 44 |...DKnDB.......D|
00000130 00 00 04 68 4b 6e 44 42 00 00 00 01 00 00 0b ac |...hKnDB........|
00000140 00 00 00 00 4b 6e 44 42 00 00 00 02 00 00 0b ac |....KnDB........|
00000150 00 00 00 20 4b 6e 44 42 00 00 00 03 00 00 0b cc |... KnDB........|
00000160 00 00 00 b6 4b 6e 44 42 00 00 00 05 00 00 0c 84 |....KnDB........|

These version 1 files don’t seem to have a static header, but they do have common byte sequences. I will need to make more samples to construct a proper signature.
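Until more samples exist, the byte sequences common to the two version 1 dumps above can be turned into a provisional check. This is only a sketch based on those two files: the offsets 0x104 (“stat”) and 0x124 (“KnDB”) and the bytes 00 00 00 03 at offset 4 are observations, not a documented layout, and the function name is hypothetical.

```python
def looks_like_now_contact_v1(data: bytes) -> bool:
    """Provisional test built from two version 1 samples only."""
    return (
        data[4:8] == b"\x00\x00\x00\x03"      # same in both samples
        and data[0x104:0x108] == b"stat"     # first "stat" entry
        and data[0x124:0x128] == b"KnDB"     # first "KnDB" entry
    )


# Build a synthetic buffer that mimics the layout seen in the dumps.
fake = bytearray(0x130)
fake[4:8] = b"\x00\x00\x00\x03"
fake[0x104:0x108] = b"stat"
fake[0x124:0x128] = b"KnDB"

print(looks_like_now_contact_v1(bytes(fake)))
```

A check this loose could easily change once more files turn up, so I would treat a match as a hint rather than an identification.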

Now Contact skipped version 2 so the next version to be released was 3.0. What do these files look like?

% hexdump -C "Sample Contact File" | head
00000000 be de ad 01 00 00 00 03 00 00 a7 62 00 00 03 fc |...........b....|
00000010 00 00 00 19 00 01 6e 9a 00 00 36 94 00 00 00 55 |......n...6....U|
00000020 00 01 42 ea be de ad 00 b4 25 c0 65 00 01 63 6b |..B......%.e..ck|
00000030 00 00 00 87 00 00 00 6c fc 63 8e a8 6f 62 6a 65 |.......l.c..obje|
00000040 00 00 00 81 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 01 00 00 00 80 00 00 00 36 63 6d 6e 74 00 00 |.........6cmnt..|
00000060 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 03 |................|
00000070 63 64 61 74 00 00 00 04 b4 25 c0 65 63 74 68 74 |cdat.....%.ectht|
00000080 00 00 00 00 63 73 65 6c 00 00 00 04 00 00 00 00 |....csel........|
00000090 be de ad 00 b4 25 c0 a4 44 4c 54 7a 00 00 00 ed |.....%..DLTz....|

They have the same header as the file I received. Let’s try and open my file in Now Contact 3.9.

Oops, that didn’t work. There must be something in my file which tells the software it is from a newer version. After some digging in the file I can see some possible version text.

% hexdump -C "CONTACT FILE"
00000000 be de ad 01 00 00 00 03 00 1d 5f a9 00 7d d1 8e |.........._..}..|
00000010 00 00 0e d8 98 89 d7 4b 00 8f 31 fd 00 00 3b f0 |.......K..1...;.|
00000020 00 de 76 56 be de ad 00 e6 02 b4 af 63 64 62 68 |..vV........cdbh|
00000030 00 00 00 00 00 00 01 8e 90 8f 56 13 00 0d 09 b4 |..........V.....|
00000040 00 0d 0b e0 00 0d 0c 50 00 0d 0b e8 00 0d 0c 58 |.......P.......X|
00000050 00 00 00 00 ba f3 fa eb 00 00 27 12 00 00 00 00 |..........'.....|
00000060 e6 02 b4 76 00 00 00 00 00 00 00 00 00 00 00 00 |...v............|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000003a0 00 50 00 21 00 f1 01 cf 00 00 00 82 00 00 00 34 |.P.!...........4|
000003b0 6b 65 79 73 00 00 00 82 00 00 00 00 00 00 00 00 |keys............|
000003c0 00 01 00 00 00 01 6b 65 79 68 00 00 00 15 76 34 |......keyh....v4|
000003d0 30 30 00 00 00 01 00 00 00 00 00 00 00 00 00 00 |00..............|
000003e0 00 00 00 00 be de ad 00 ba 42 cd 51 05 00 64 62 |.........B.Q..db|
000003f0 00 00 00 02 00 00 00 18 00 00 00 00 00 00 00 00 |................|
00000400 00 be de ad 00 b9 d0 15 aa 02 00 66 6c 00 00 00 |...........fl...|
00000410 14 00 00 00 2d b1 0b 27 34 76 34 30 30 00 00 00 |....-..'4v400...|

The file has some repeated text with v400. Sure enough, the file opens in version 4 with no problems. I am able to view all the contacts, and the software even allows me to export them as a CSV. Looking at a sample file from a version 4 install confirms the version information.
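Hunting for that version text by eye gets tedious, so here is a small Python sketch that collects anything shaped like the “v400” tag. The pattern (a lowercase v, three digits, then a NUL byte) is my assumption from the dumps above, and it could also match ordinary user data, so the result is a clue rather than proof.

```python
import re

# Assumed shape of the tag: "v" + three digits + NUL, as in "v400\x00".
VERSION_TAG = re.compile(rb"v(\d{3})\x00")


def version_tags(data: bytes) -> list:
    """Return the distinct three-digit version tags found in data."""
    found = {m.group(1).decode("ascii") for m in VERSION_TAG.finditer(data)}
    return sorted(found)


# Rows 0x3c0 and 0x3d0 of "CONTACT FILE", copied from the hexdump above.
sample = bytes.fromhex(
    "0001000000016b657968000000157634"
    "30300000000100000000000000000000"
)

print(version_tags(sample))
```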

% hexdump -C "Sample Contact File" | head
00000000 be de ad 01 00 00 00 03 00 07 5c 06 00 01 23 40 |..........\...#@|
00000010 00 00 00 10 00 17 9c 42 00 01 10 92 00 00 00 63 |.......B.......c|
00000020 00 10 37 3a be de ad 00 b7 39 82 7b 00 01 63 6b |..7:.....9.{..ck|
00000030 00 00 00 80 00 00 5d 62 52 66 a1 a9 6f 62 6a 65 |......]bRf..obje|
00000040 00 00 00 86 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 06 00 00 00 81 00 00 00 34 6b 65 79 73 00 00 |.........4keys..|
00000060 00 81 00 00 00 00 00 00 00 00 00 00 00 00 00 01 |................|
00000070 6b 65 79 68 00 00 00 15 76 34 30 30 00 00 00 01 |keyh....v400....|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000090 00 82 00 00 00 b4 6e 6f 74 65 00 00 00 82 00 00 |......note......|

Now Software updated the software for a few years in the early 1990’s. There were Windows versions as well, and the format is the same except for one detail.

% hexdump -C NC452-Win-s01.NCT | tail
00004ff0 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 32 |...............2|
00005000 00 32 01 90 01 90 00 00 00 08 00 4f 00 32 02 01 |.2.........O.2..|
00005010 03 11 00 00 01 00 00 00 01 18 00 00 00 18 00 00 |................|
00005020 00 32 00 00 00 00 00 00 00 00 00 1c 00 32 00 00 |.2...........2..|
00005030 64 65 52 65 00 00 00 0a 00 01 ff ff 00 00 00 0c |deRe............|
00005040 00 00 00 00 00 00 4e fa 69 da 88 7b 00 00 50 54 |......N.i..{..PT|
00005050 77 69 6e 73 |wins|

It appears that in version 4 the final four bytes indicate the platform, either “wins” or “macs”. This continued in version 5, which came out in 2005.
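Reading that trailer is simple enough to sketch in Python. The mapping of “wins” to Windows and “macs” to Macintosh is my reading of the samples, and the function name is hypothetical.

```python
from typing import Optional

# Assumed meaning of the last four bytes, based on the samples.
PLATFORMS = {b"wins": "Windows", b"macs": "Macintosh"}


def platform_from_trailer(data: bytes) -> Optional[str]:
    """Return the platform named by the final four bytes, if any."""
    return PLATFORMS.get(data[-4:])


# Last 20 bytes of NC452-Win-s01.NCT, copied from the hexdump above.
tail = bytes.fromhex("0000000000004efa69da887b00005054" "77696e73")

print(platform_from_trailer(tail))
```

Files from version 3 and earlier would simply return `None` here if they lack the trailer, which I have not checked against every sample.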

% hexdump -C NC501-s01.nct | head
00000000 be de ad 01 00 00 00 03 00 00 02 cf 00 00 1e 14 |................|
00000010 00 00 00 11 00 00 5c 93 00 00 43 7e 00 00 00 50 |......\...C~...P|
00000020 00 00 59 da 00 00 00 34 00 00 00 2c 00 00 05 5e |..Y....4...,...^|
00000030 00 00 02 cc be de ad 00 e6 02 e9 89 01 00 64 62 |..............db|
00000040 00 00 00 02 00 00 00 18 00 00 00 00 be de ad 00 |................|
00000050 e6 02 e9 91 63 64 62 68 00 00 00 00 00 00 01 8e |....cdbh........|
00000060 be 63 10 4d 00 7f 63 c8 00 7f 65 64 00 7f 66 d0 |.c.M..c...ed..f.|
00000070 00 7f 66 bc 00 00 00 00 00 01 00 00 e6 02 e9 8a |..f.............|
00000080 00 00 27 11 00 00 00 00 e6 02 e9 91 00 00 00 00 |..'.............|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00005ad0 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 |................|
00005ae0 01 00 00 00 00 00 00 00 00 1e 00 00 00 00 00 00 |................|
00005af0 00 00 00 1c 00 1e ff ff 00 00 00 00 00 00 00 00 |................|
00005b00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 |................|
00005b10 00 32 00 32 02 4f 03 11 00 00 59 da 57 7e a0 f3 |.2.2.O....Y.W~..|
00005b20 00 00 5b 28 6d 61 63 73 |..[(macs|

Also in the version 5 samples, we still see the v400 text, so it appears the format was not changed.

% hexdump -C /Volumes/File\ Formats/Now/NC531-s01.nct       
00000000 be de ad 01 00 00 00 03 00 00 04 90 00 00 1e 14 |................|
00000010 00 00 00 11 00 00 54 83 00 00 43 7e 00 00 00 50 |......T...C~...P|
00000020 00 00 53 5e 00 00 00 34 00 00 00 2c 00 00 05 5e |..S^...4...,...^|
00000030 00 00 02 cc be de ad 00 e6 04 44 0e 01 00 64 62 |..........D...db|
00000040 00 00 00 02 00 00 00 18 00 00 00 00 be de ad 00 |................|
00000050 e6 04 44 5c 63 64 62 68 00 00 00 00 00 00 01 8e |..D\cdbh........|
00000060 41 f7 ad 20 cc 8e b5 01 74 8f b5 01 e8 00 b6 02 |A.. ....t.......|
00000070 e4 00 b6 02 00 00 00 00 01 00 00 00 0e 44 04 e6 |.............D..|
00000080 11 27 00 00 00 00 00 00 5c 44 04 e6 00 00 00 00 |.'......\D......|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 be de |................|
000001f0 ad 00 e6 04 44 0e 02 00 66 6c 00 00 00 03 00 00 |....D...fl......|
00000200 00 2d b1 0b 27 34 76 34 30 30 00 00 00 01 00 00 |.-..'4v400......|
00000210 00 00 00 00 00 00 00 00 00 00 00 be de ad 00 e6 |................|
00000220 04 44 0e 02 00 66 6c 00 00 00 05 00 00 00 2d b1 |.D...fl.......-.|
00000230 0b 27 34 76 34 30 30 00 00 00 01 00 00 00 00 00 |.'4v400.........|

Now Up-to-Date & Contact released version 5.3 around 2008 which finally provided support for Intel processors. It was the last version released before Now Software attempted a full re-write of the software in 2009 named Now X (code-named “NightHawk”). The software did not receive good reviews and by 2010 the company ceased operations. So far I have come up empty in getting a copy of this doomed version, but I will update this post if I am able to get my hands on a copy.

For now, you can take a look at some sample files on GitHub, where I will also add some PRONOM signatures soon.

TurboTax

With all the different file formats found in everyday computing, most formats that find their way to my archive have historical value. We know we can’t keep everything and have to assign value to all we decide to keep for the long term. Some files have sensitive data and we have to follow guidelines for their proper handling. Identifying files helps us know what type of data might be kept inside the format, so I often need to identify formats we don’t plan on keeping as well.

I was recently looking through a large digital collection and a report on the files which did not identify in the initial scan. A few popped out to me because of their extension, TAX. Tax records are one thing we need to identify so we can properly handle them, but not likely keep in our repository.

These tax files come from the popular US-based TurboTax software. The software gets a new version every year, as tax laws are constantly changing. It has also been around since 1984, so there are many versions to be aware of. Add to that the fact that there are personal and business versions along with DOS, Windows, and Macintosh versions, and identification might get complicated. None of these formats are documented in the PRONOM registry. Wikidata is aware of a couple of the extensions, but does not have any signatures to help in identification.

Luckily, the collection of files I was processing had a number of years’ worth of records. Using them and a few others, I was able to put together a decent timeline of the formats used, at least from the early 1990s on. The format seemed to settle on the .TAX extension around the 1994 Windows version. Before this, the DOS versions stored the data in a group of files. Let’s look at a sample of the 1994 file from Windows.

% hexdump -C TT1994.TAX | head
00000000 54 75 72 62 6f 54 61 78 0d 0a 46 6f 72 6d 61 74 |TurboTax..Format|
00000010 3d 57 49 4e 0d 0a 56 65 72 73 69 6f 6e 3d 31 33 |=WIN..Version=13|
00000020 0d 0a 45 6e 67 69 6e 65 56 65 72 73 53 74 72 3d |..EngineVersStr=|
00000030 36 2e 30 30 2e 31 0d 0a 46 6f 72 6d 73 65 74 3d |6.00.1..Formset=|
00000040 53 31 39 39 34 55 53 31 30 34 30 0d 0a 43 65 6e |S1994US1040..Cen|
00000050 74 73 3d 59 65 73 0d 0a 53 68 6f 77 43 6f 6d 6d |ts=Yes..ShowComm|
00000060 61 73 3d 59 65 73 0d 0a 53 68 6f 77 43 6f 6c 6c |as=Yes..ShowColl|
00000070 61 70 73 69 62 6c 65 57 6f 72 6b 53 68 65 65 74 |apsibleWorkSheet|
00000080 73 3d 59 65 73 0d 0a 44 61 74 61 56 65 72 73 69 |s=Yes..DataVersi|
00000090 6f 6e 3d 31 0d 0a 46 6f 72 6d 46 69 6c 65 53 75 |on=1..FormFileSu|

I love these easy to identify format headers, but then jump to the next year, 1995, and the format changes.

% hexdump -C TT1995.TAX | head
00000000 c0 45 01 5f 0a 00 00 35 b5 06 36 2e 30 30 2e 31 |.E._...5..6.00.1|
00000010 00 00 c7 00 02 00 02 0d 00 00 00 b4 00 00 00 d9 |................|
00000020 00 0e 53 31 39 39 35 55 53 31 30 34 30 50 45 52 |..S1995US1040PER|
00000030 01 01 01 00 00 00 01 00 01 00 00 35 b5 00 0a c8 |...........5....|
00000040 00 01 00 01 09 00 00 00 cf 00 06 00 06 1d 00 00 |................|
00000050 00 3e 00 00 00 3e 00 00 00 64 00 00 00 64 00 00 |.>...>...d...d..|
00000060 00 7e 00 00 00 ce 13 7a 65 7a 50 65 72 73 69 73 |.~.....zezPersis|
00000070 74 65 6e 74 53 74 61 74 75 73 00 65 00 64 00 01 |tentStatus.e.d..|
00000080 00 00 00 00 00 00 ce 12 7a 74 6c 50 65 72 73 69 |........ztlPersi|
00000090 73 74 46 69 6c 65 44 61 74 61 00 00 00 00 00 00 |stFileData......|

The nice, easy-to-read header is gone, but some other patterns start to appear. It seems most of the files from these early versions also used a code near the beginning that may help. “S1995US1040PER” is similar to the “S1994US1040” in the 1994 file. One could assume the “1040” is the tax form most Americans are used to, with “US” preceding the number. Then at the end of the string we see “PER”. This may refer to different editions of the tax software: a Personal edition for the individual, and possibly other editions for business. I believe TurboTax also had versions for Canadians, so there may be many variations on this string. This could get complex. Let’s jump ahead to a 1999 file.

% hexdump -C TurboTax1999.tax | head 
00000000 c0 45 01 5f 0a 00 00 54 6a 16 4c 39 31 30 32 31 |.E._...Tj.L91021|

00000030 00 0e 53 31 39 39 39 55 53 31 30 34 30 50 45 52 |..S1999US1040PER|
00000040 00 00 01 00 00 00 25 00 00 00 00 00 00 00 00 00 |......%.........|
00000050 01 19 12 8f f1 00 0a 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c8 |................|
00000080 00 04 00 04 15 00 00 00 ec 05 00 00 c3 07 00 00 |................|
00000090 a3 08 00 00 c4 00 05 46 31 30 34 30 00 00 00 01 |.......F1040....|

The same string is visible, but of course with the year “1999”. We can also see a pattern in the first 4 bytes, “c0 45 01 5f”, which are consistent with the 1995 file. The file I have for 1998 is consistent as well. Jumping to the new millennium, we see a change.

% hexdump -C TurboTax2000.tax | head
00000000 c0 45 01 64 0a 00 00 2e 4f 18 4c 30 30 39 32 37 |.E.d....O.L00927|

00000030 00 dc 00 0b 53 32 30 30 30 55 53 31 31 32 30 00 |....S2000US1120.|
00000040 00 01 00 00 00 09 00 00 00 00 00 00 00 00 00 01 |................|

We see two changes with this file. One, the ASCII string is different: S2000US1120, with 1120 being the U.S. Corporation Income Tax Return, so this file came from a different edition of the software. The other change is in the first 4 bytes. They changed to “c0 45 01 64”, with the last byte changing from 5F to 64. Jumping to 2003, we see the same values.
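The strings seen so far (S1994US1040, S1995US1040PER, S2000US1120) share a shape that a regular expression can pull apart. This is a rough Python sketch under my own assumptions: an “S”, a four-digit year, a two-letter country code, a four-digit form number, and an optional three-letter edition. The field names and the 512-byte search window are mine, not anything TurboTax documents.

```python
import re

# Assumed layout: S + year + country + form number + optional edition.
FORMSET = re.compile(rb"S(\d{4})([A-Z]{2})(\d{4})([A-Z]{3})?")


def parse_formset(data: bytes, window: int = 512):
    """Return (year, country, form, edition) from the start of a file."""
    match = FORMSET.search(data[:window])
    if match is None:
        return None
    year, country, form, edition = match.groups()
    return (
        int(year),
        country.decode("ascii"),
        form.decode("ascii"),
        edition.decode("ascii") if edition else None,
    )


# Byte runs copied from the 1995 and 2000 hexdumps above.
print(parse_formset(b"\x00\x0eS1995US1040PER\x01\x01"))
print(parse_formset(b"\x00\x0bS2000US1120\x00"))
```

If Canadian or state variants exist, the pattern would likely need loosening, which is one reason I would not rely on this string alone for a signature.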

% hexdump -C TurboTax2003.tax | head 
00000000 c0 45 01 64 0d 00 00 80 1b 26 54 59 30 33 5f 4c |.E.d.....&TY03_L|

00000040 58 03 00 dc 00 0e 53 32 30 30 33 55 53 31 30 34 |X.....S2003US104|
00000050 30 50 45 52 00 00 01 00 00 6a c6 00 00 00 00 00 |0PER.....j......|

Back to a 1040 form, but with the same header as the 2000 file. I am removing some lines, just to be safe and avoid exposing any personal data. In 2004 we see a major change in the format.

% hexdump -C TurboTax2004.tax | head 
00000000 54 54 46 4e 01 01 6f 68 dc 62 00 00 00 00 4b 01 |TTFN..oh.b....K.|

Again, I am removing some lines to ensure safety. This header is very different and there is no human-readable ASCII in the file, which means it is binary and probably encoded. I assume the new TTFN header refers to TurboTax format? File? Or possibly “Turbo Tax Financial Network”?

This header is then used for the next few years ending in 2013, but before we get there, the extension makes a change as well. In 2008, instead of the simple .TAX extension, the software begins to save the tax file with the extension .TAX2008. I don’t have a 2008 document, but I do have a sample 2009 document.

% hexdump -C TurboTax2009.tax2009 | head
00000000 54 54 46 4e 01 01 b5 68 02 24 00 00 00 00 4b 0b |TTFN...h.$....K.|
00000010 01 01 19 13 01 01 01 52 01 01 01 0b 01 01 4e 7a |.......R......Nz|

The last version to use the TTFN header is 2013.

% hexdump -C TurboTax2013.tax2013 | head
00000000 54 54 46 4e 01 01 87 22 6a ec 00 00 00 00 50 bd |TTFN..."j.....P.|

2014 is where I get a little confused. I have one file which uses the TTFN header and another which uses what becomes the standard going forward. By 2015, though, the format is definitely using a ZIP container as its structure. Here is a sample from 2015.

% hexdump -C TurboTax2015.tax2015 | head
00000000 50 4b 03 04 2d 00 02 00 08 00 e5 a6 51 48 ba 4d |PK..-.......QH.M|
00000010 43 67 15 06 00 00 10 06 00 00 0c 00 14 00 6d 61 |Cg............ma|
00000020 6e 69 66 65 73 74 2e 78 6d 6c 01 00 10 00 00 00 |nifest.xml......|
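The first 30 bytes of that dump follow the standard ZIP local file header layout, so they can be unpacked with Python’s `struct` module. This is a minimal sketch: the function name and the choice of returned fields are mine, and the sample is just the three rows shown above.

```python
import struct

# ZIP local file header: signature, version, flags, method, time, date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def first_zip_entry(data: bytes):
    """Return (name, method, compressed, uncompressed) of the first entry."""
    if len(data) < LOCAL_HEADER.size:
        return None
    (sig, _version, _flags, method, _mtime, _mdate, _crc,
     csize, usize, name_len, _extra_len) = LOCAL_HEADER.unpack_from(data, 0)
    if sig != b"PK\x03\x04":
        return None
    start = LOCAL_HEADER.size
    name = data[start:start + name_len].decode("cp437")
    return name, method, csize, usize


# The three rows of TurboTax2015.tax2015 copied from the hexdump above.
sample = bytes.fromhex(
    "504b03042d0002000800e5a65148ba4d"
    "436715060000100600000c0014006d61"
    "6e69666573742e786d6c010010000000"
)

print(first_zip_entry(sample))
```

Method 8 is Deflate, so at least the first entry in this sample is compressed in the ordinary way.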

Let’s take a look inside the ZIP container of a 2017 dummy sample.

% 7z l TurboTax2017.tax2017
7-Zip [64] 17.05 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.05 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)

Scanning the drive for archives:
1 file, 769814 bytes (752 KiB)

Listing archive: TurboTax2017.tax2017

--
Path = TurboTax2017.tax2017
Type = zip
WARNINGS:
Headers Error
Physical Size = 769814

Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2026-03-28 20:25:38 ..... 576 581 manifest.xml
2026-03-28 20:25:38 ..... 768688 768923 084A702A-CD3D-4623-B8B7-EE4800BB151F
------------------- ----- ------------ ------------ ------------------------
2026-03-28 20:25:38 769264 769504 2 files

Warnings: 1

The files all seem to have a manifest.xml and a second file named with a unique identifier. 7-Zip also reports a header error with the ZIP files. Maybe that was done on purpose? Now comes the odd part: the manifest.xml file does not render as an XML file. It is binary.

% hexdump -C TurboTax2017/manifest.xml | head
00000000 a1 b1 fe fb 37 18 dd 9c 08 2d 9c 86 23 00 10 fa |....7....-..#...|
00000010 12 60 92 bb dc 92 a5 df 1a 24 16 4e a9 28 89 80 |.`.......$.N.(..|
00000020 64 33 66 55 c5 93 f0 68 44 d0 7c f9 56 86 42 2c |d3fU...hD.|.V.B,|
00000030 80 ba 8a 95 2a 82 6d 32 75 84 b1 f1 e2 18 93 5c |....*.m2u......\|
00000040 82 4d 18 f9 ed 23 4f dc d6 b5 7f f2 20 1e 30 59 |.M...#O..... .0Y|
00000050 d5 7f 47 7d aa f5 8d bd 8b 10 20 ec 8a c7 43 df |..G}...... ...C.|
00000060 52 90 a9 70 4d 68 b4 76 fa c8 37 85 f5 56 25 82 |R..pMh.v..7..V%.|
00000070 ea 16 06 54 b0 b4 bc 43 16 fb 70 7b 7a 79 a5 8b |...T...C..p{zy..|
00000080 3c 79 7d ef ac 32 fc 35 ce 0f fa a2 6f e7 c3 a4 |<y}..2.5....o...|
00000090 92 a1 a4 c8 83 dd 9f 32 f4 ea d3 1a eb 89 15 a3 |.......2........|

Of the samples I have which contain a manifest.xml, all begin with “a1 b1 fe fb”, which apparently is the header for an AES CBC encrypted file. A clever person was able to decrypt the file to reveal the actual XML.

TurboTax isn’t sold on physical disk anymore, but you can download the current tax year version from their website. I am not a user of their product so I am not sure if the latest version still saves files in the same way. If you do use it currently, I would love to know if it is still the same.

So to recap, the headers are:

  • 1994 “TurboTax Format=WIN Version=13”
  • 1995-99 “C045015F”
  • 2000-03 “C0450164”
  • 2004-13 “TTFN”
  • 2014-current “ZIP Container”

This should be enough to create five new signatures for identification. Extensions will be a problem since they change every year, but we can add them to the list. With these signatures we can now identify all the tax files we have and set them aside if not needed.
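As a rough illustration of how those five headers could drive identification, here is a small Python sketch. It is not a PRONOM signature, the labels are my own wording, and the extra check for “manifest.xml” in the ZIP case is an assumption taken from the 2015 sample, since PK alone would match any ZIP file.

```python
from typing import Optional

# Magic values taken from the recap list; labels are my own wording.
SIGNATURES = [
    (b"TurboTax\r\nFormat=", "1994 text header"),
    (bytes.fromhex("c045015f"), "1995-99 binary"),
    (bytes.fromhex("c0450164"), "2000-03 binary"),
    (b"TTFN", "2004-13 TTFN"),
]


def identify_turbotax(data: bytes) -> Optional[str]:
    """Return a label for the header at the start of data, if known."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # Assumption: the first ZIP entry is manifest.xml, as in the samples.
    if data.startswith(b"PK\x03\x04") and b"manifest.xml" in data[:64]:
        return "2014+ ZIP container"
    return None


# Opening bytes copied from the 1995 and 2009 hexdumps.
print(identify_turbotax(bytes.fromhex("c045015f0a000035")))
print(identify_turbotax(bytes.fromhex("5454464e0101b568")))
```

Reading only the first 64 bytes of each file is enough for this check, which keeps a scan over a large collection quick.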