During the recent PRONOM Research Week, I noticed a file format with no description and no signature.
All I had to go on was that it was an Adobe format with the acronym “ACD”. One of the first results that came up in a Google search was a post in the Adobe forums with someone asking what to do with some old ACD and ACI files they found on a disc, circa 2000, labeled “Adobe Capture”. The only thing I remembered about Adobe Capture was some scanning tools related to Adobe Acrobat, but I didn’t remember coming across any ACD files related to Acrobat.
Initially it wasn’t easy to find more information on this format. Eventually I was able to narrow it down to stand-alone software Adobe released called “Adobe Acrobat Capture”. Originally released in 1995, it was eventually discontinued in 2010. The software was marketed under the ePaper name and connected to Acrobat through the creation of a PDF from scanned images. The software was compatible with many scanner models and would process the scanned images, run optical character recognition (OCR), and export to a searchable PDF. These tools are built into Adobe Acrobat today.
One of the reasons the software was so elusive is that it was sold with a high price tag and required a hardware key, or dongle, in order to process scans. The hardware key also managed the type of license you purchased, which could limit the number of pages you were allowed to scan within a certain period of time. So the software is very difficult to run today, if you do happen to find a copy out there in Internet land.
In order to document these file formats for preservation purposes I needed to find some samples. I was excited to find a demonstration CD on the Internet Archive, but unfortunately it contained no examples of the ACD file format.
A little sleuthing on the Wayback Machine helped me find a few user guides and brochures. I was also able to find that there were three versions of Adobe Acrobat Capture. In a Product Brochure, you can see a screenshot of the software with a document open with the ACD extension.
If you are OCD like me, you might have noticed the window in this screenshot is typical of the older Windows 3.1 or Windows NT systems. So this was indeed an older product released by Adobe.
Luckily, the Adobe Acrobat Capture 3.0 Demonstration CD-ROM from the Internet Archive has a UserGuide PDF on the disc, which helped me understand the ACD format a little more.
Looks like the ACD format is an intermediate format used by the software to manage the process between scanning and export to PDF. ACD was also defined as an “Acrobat Capture Document” which makes sense. They were also mentioned as being “multipage files in Acrobat Capture Document (ACD)”. The UserGuide also mentioned an ACP format which it referenced as “one-page files are in Acrobat Capture Page (ACP) format.” So more research is needed.
Let’s start with Adobe Acrobat Capture 2.0, as I managed to get a few samples from an installer I found. Here is a hexdump of an ACD file and its corresponding ACI file.
hexdump -C CONTRACT.ACD | head
00000000 02 04 47 47 c9 00 86 b5 01 00 b6 27 02 00 01 00 |..GG.......'....|
00000010 f5 00 5e 00 3b 96 02 00 01 6e 63 6a 00 00 88 68 |..^.;....ncj...h|
00000020 00 00 26 00 44 3a 5c 43 4f 44 45 5c 47 47 5c 50 |..&.D:\CODE\GG\P|
00000030 52 4f 44 55 43 54 2e 33 32 53 5c 49 4e 5c 63 6f |RODUCT.32S\IN\co|
00000040 6e 74 72 61 63 74 2e 61 63 69 00 00 00 00 00 00 |ntract.aci......|
00000050 7c 33 c0 27 00 40 ff ff ff 00 03 00 03 00 00 00 ||3.'.@..........|
00000060 00 00 00 00 00 00 40 00 00 00 00 00 00 03 00 00 |......@.........|
00000070 00 00 00 00 00 00 00 40 00 00 00 00 09 00 0a ab |.......@........|
00000080 04 0b 14 b5 04 39 19 00 40 00 00 00 00 0c 14 b0 |.....9..@.......|
00000090 04 38 19 b0 04 08 00 0a 7f 06 d3 11 89 06 39 17 |.8............9.|

hexdump -C CONTRACT.ACI | head
00000000 49 49 2a 00 b3 0c 02 00 35 80 78 a0 80 35 c0 78 |II*.....5.x..5.x|
00000010 a4 80 35 40 3c 54 40 01 e2 b2 01 e2 b2 01 e2 b2 |..5@<T@.........|
00000020 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 |................|
00000030 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 |................|
00000040 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 |................|
00000050 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 |................|
00000060 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 |................|
00000070 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 01 e2 b2 |................|
00000080 01 e2 b2 01 e0 b0 01 e0 b0 01 e0 b0 01 e0 b0 01 |................|
00000090 e0 b0 01 e0 b0 01 e0 b0 01 e0 b0 01 e0 b0 01 e0 |................|
The ACD file is unique; neither PRONOM nor TrID was aware of the format. But to the keen observer, the ACI format is very recognizable. You may have seen this header before: 49 49 2A 00 (“II*” followed by a null byte), the signature of a little-endian TIFF.
Let’s take a closer look at an ACI file to see if it is a true TIFF image or if there is any customization to the format.
tiffinfo CONTRACT.ACI
=== TIFF directory 0 ===
TIFF Directory at offset 0x20cb3 (134323)
  Subfile Type: (0 = 0x0)
  Image Width: 2544 Image Length: 3295
  Resolution: 300, 300
  Bits/Sample: 1
  Compression Scheme: CCITT RLE
  Photometric Interpretation: min-is-white
  Samples/Pixel: 1
  Rows/Strip: 32
  Planar Configuration: single image plane
  Software: HALO Desktop Imager

exiftool -D CONTRACT.ACI
- ExifTool Version Number : 12.60
- File Name : CONTRACT.ACI
- Directory : TUTORIAL/SAMPOUT
- File Size : 134 kB
- File Modification Date/Time : 1995:07:10 16:02:08-06:00
- File Access Date/Time : 2023:11:14 15:41:02-07:00
- File Inode Change Date/Time : 2023:11:08 08:34:18-07:00
- File Permissions : -rwxrwxrwx
- File Type : TIFF
- File Type Extension : tif
- MIME Type : image/tiff
- Exif Byte Order : Little-endian (Intel, II)
254 Subfile Type : Full-resolution image
256 Image Width : 2544
257 Image Height : 3295
258 Bits Per Sample : 1
259 Compression : CCITT 1D
262 Photometric Interpretation : WhiteIsZero
273 Strip Offsets : (Binary data 625 bytes, use -b option to extract)
277 Samples Per Pixel : 1
278 Rows Per Strip : 32
279 Strip Byte Counts : (Binary data 448 bytes, use -b option to extract)
282 X Resolution : 300
283 Y Resolution : 300
305 Software : HALO Desktop Imager
- Image Size : 2544x3295
- Megapixels : 8.4
Looks like a true TIFF image with no special tags or unique properties. They are 1-bit TIFFs compressed with CCITT RLE. I’m not sure there would be any need to create a special signature for these ACI files.
Looking closer at the ACD file format, we can see it references ACI files, so it is probably safe to assume the ACD file doesn’t contain the full raster data for each image:
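As a quick cross-check of the tool output, the eight-byte TIFF header can be decoded by hand. The sketch below is only an illustration: the function name `parse_tiff_header` is made up for this post, and the input is just the first eight bytes of CONTRACT.ACI copied from the hexdump above. The offset it recovers should agree with the directory offset tiffinfo reports.

```python
import struct

# First 8 bytes of CONTRACT.ACI, copied from the hexdump earlier in this post.
ACI_HEADER = bytes.fromhex("49492a00b30c0200")


def parse_tiff_header(header: bytes):
    """Return (byte_order, magic, first_ifd_offset) for a classic TIFF header.

    Hypothetical helper for illustration; raises ValueError if the bytes
    do not look like a TIFF header.
    """
    if len(header) < 8:
        raise ValueError("need at least 8 bytes")
    order = header[:2]
    if order == b"II":
        fmt = "<HI"  # little-endian (Intel)
    elif order == b"MM":
        fmt = ">HI"  # big-endian (Motorola)
    else:
        raise ValueError("no TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(fmt, header[2:8])
    if magic != 42:
        raise ValueError("TIFF magic number 42 not found")
    return order.decode("ascii"), magic, ifd_offset


print(parse_tiff_header(ACI_HEADER))  # ('II', 42, 134323)
```

The third value, 134323, is the same offset (0x20cb3) that tiffinfo prints for TIFF directory 0.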
hexdump -C Report.acd
00000000 02 04 47 47 c9 00 9a 8b 00 00 d4 ce 00 00 03 00 |..GG............|
00000010 f5 02 5f 00 00 61 01 00 01 6e 63 6a 01 00 30 5f |.._..a...ncj..0_|
00000020 00 00 27 00 63 3a 5c 63 61 70 74 75 72 65 32 5c |..'.c:\capture2\|
00000030 73 61 6d 70 6c 65 73 5c 6f 75 74 5c 52 65 70 6f |samples\out\Repo|
00000040 72 74 5f 30 30 30 31 2e 61 63 69 00 00 01 00 00 |rt_0001.aci.....|
00000050 00 00 00 00 00 00 00 00 00 00 e8 03 00 00 01 00 |................|
00000060 01 00 00 00 00 00 00 00 00 00 08 00 52 65 70 6f |............Repo|
00000070 72 74 30 31 00 00 00 00 70 33 d8 27 00 40 ff ff |rt01....p3.'.@..|
*
00005f40 07 00 40 6f 00 09 00 40 01 6e 63 6a 02 00 52 2c |..@o...@.ncj..R,|
00005f50 00 00 27 00 63 3a 5c 63 61 70 74 75 72 65 32 5c |..'.c:\capture2\|
00005f60 73 61 6d 70 6c 65 73 5c 6f 75 74 5c 52 65 70 6f |samples\out\Repo|
00005f70 72 74 5f 30 30 30 32 2e 61 63 69 00 00 00 00 00 |rt_0002.aci.....|
00005f80 00 00 00 00 4e 0c fe ff ff ff e8 03 00 00 01 00 |....N...........|
00005f90 01 00 00 00 00 00 00 00 00 00 08 00 52 65 70 6f |............Repo|
00005fa0 72 74 30 32 00 00 00 00 4c 31 f0 27 00 40 ff ff |rt02....L1.'.@..|
From the limited sample set I have access to, all the ACD files begin with the same hex values, “02044747C900”. Along with the common header, we can assume there should be at least one ACI file referenced in the first part of the file. Because it is referenced as a filepath, the ACI string would be variable in its offset.
Adobe Acrobat Capture 3.0 turns out to be a different format. But it looks familiar...
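To make that rule concrete, here is a rough Python sketch of the same test: a fixed header at the start of the file plus an ACI path somewhere in the first part of it. It is not PRONOM signature syntax. The function names and the 1024-byte search window are assumptions made for illustration, and the sample bytes are the first 80 bytes of CONTRACT.ACD from the hexdump above.

```python
import re

# Common header seen in every Capture 2.0 ACD sample so far.
ACD2_MAGIC = bytes.fromhex("02044747c900")

# A run of printable ASCII ending in ".aci", matched case-insensitively.
ACI_PATH = re.compile(rb"[\x20-\x7e]+\.aci", re.IGNORECASE)


def find_aci_paths(data: bytes):
    """Return every printable string ending in .aci found in the data."""
    return [m.group().decode("ascii") for m in ACI_PATH.finditer(data)]


def looks_like_acd2(data: bytes, window: int = 1024) -> bool:
    """Header match plus at least one ACI path within the first `window` bytes.

    The window size is an arbitrary assumption, not something taken
    from any Adobe documentation.
    """
    return data.startswith(ACD2_MAGIC) and bool(ACI_PATH.search(data[:window]))


# First 80 bytes of CONTRACT.ACD, copied from the hexdump earlier in this post.
SAMPLE = bytes.fromhex(
    "02044747c90086b50100b62702000100"
    "f5005e003b960200016e636a00008868"
    "00002600443a5c434f44455c47475c50"
    "524f445543542e3332535c494e5c636f"
    "6e74726163742e616369000000000000"
)

print(looks_like_acd2(SAMPLE))
for path in find_aci_paths(SAMPLE):
    print(path)
```

Searching for the string rather than reading it at a fixed position is what copes with the variable offset: a longer or shorter path simply moves where “.aci” lands.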
hexdump -C Contract.acd | head
00000000 50 4b 03 04 14 00 00 00 08 00 3b ba 6e 57 23 9d |PK........;.nW#.|
00000010 8e b8 3d 00 00 00 3e 00 00 00 09 00 40 00 46 49 |..=...>.....@.FI|
00000020 4c 45 53 2e 4c 53 54 0a 00 20 00 00 00 00 00 00 |LES.LST.. ......|
00000030 00 00 00 80 e6 e9 ca 50 17 da 01 80 e6 e9 ca 50 |.......P.......P|
00000040 17 da 01 80 e6 e9 ca 50 17 da 01 4e 55 18 00 4e |.......P...NU..N|
00000050 55 43 58 09 00 46 00 49 00 4c 00 45 00 53 00 2e |UCX..F.I.L.E.S..|
00000060 00 4c 00 53 00 54 00 8b 76 74 76 31 8c e5 e5 f2 |.L.S.T..vtv1....|
00000070 0c 76 f6 f7 0d f0 0f f6 0c 71 b5 0d 09 0a 75 e5 |.v.......q....u.|
00000080 e5 f2 0b f5 75 f3 f4 71 0d b6 35 e4 e5 02 31 fc |....u..q..5...1.|
00000090 1c 7d 5d 0d 6d 9d f3 f3 4a 8a 12 93 4b f4 12 93 |.}].m...J...K...|

sf Contract.acd
---
siegfried   : 1.10.1
scandate    : 2023-11-15T09:10:01-07:00
signature   : default.sig
created     : 2023-10-11T15:10:17-06:00
identifiers :
  - name    : 'pronom'
    details : 'DROID_SignatureFile_V114.xml; container-signature-20230822.xml'
---
filename : 'Contract.acd'
filesize : 79002
modified : 2023-11-14T23:17:53-07:00
errors   :
matches  :
  - ns      : 'pronom'
    id      : 'x-fmt/263'
    format  : 'ZIP Format'
    version :
    mime    : 'application/zip'
    basis   : 'byte match at [[0 4] [78886 3] [78980 4]]'
    warning : 'extension mismatch'
Yep, it’s a ZIP container file. Let’s take a peek inside to see what it is composed of.
7z l Contract.acd
--
Path = Contract.acd
Type = zip
Physical Size = 79002

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2023-11-14 23:17:54 ....A           62           61  FILES.LST
2023-11-14 23:17:54 ....A          410          226  Contract.acd
2023-11-14 23:17:52 ....A       150213        78093  Contract.acp
------------------- ----- ------------ ------------  ------------------------
2023-11-14 23:17:54             150685        78380  3 files
The Contract ACD file is like a nesting doll, an ACD within an ACD. Let’s see what the ACD and ACP are made of.
hexdump -C Contract.acd | head
00000000 00 01 00 00 00 02 04 47 47 2d 01 9a 01 00 00 02 |.......GG-......|
00000010 00 00 00 02 00 01 01 00 00 00 01 00 00 00 04 04 |................|
00000020 00 00 00 09 00 57 69 6e 67 64 69 6e 67 73 05 00 |.....Wingdings..|
00000030 41 72 69 61 6c 0b 00 43 6f 75 72 69 65 72 20 4e |Arial..Courier N|
00000040 65 77 0f 00 54 69 6d 65 73 20 4e 65 77 20 52 6f |ew..Times New Ro|
00000050 6d 61 6e 05 01 00 00 00 02 00 00 00 78 01 00 00 |man.........x...|
00000060 0f 00 54 69 6d 65 73 20 4e 65 77 20 52 6f 6d 61 |..Times New Roma|
00000070 6e 00 00 00 20 0b 00 00 c0 0a 00 00 00 00 00 00 |n... ...........|
00000080 00 06 00 00 00 0f 00 54 69 6d 65 73 20 4e 65 77 |.......Times New|
00000090 20 52 6f 6d 61 6e 00 00 00 20 0c 00 00 00 0c 00 | Roman... ......|

hexdump -C Contract.acp | head
00000000 25 50 44 46 2d 31 2e 33 0d 25 e2 e3 cf d3 0d 0a |%PDF-1.3.%......|
00000010 31 20 30 20 6f 62 6a 0d 3c 3c 20 0d 2f 54 79 70 |1 0 obj.<< ./Typ|
00000020 65 20 2f 43 61 74 61 6c 6f 67 20 0d 2f 50 61 67 |e /Catalog ./Pag|
00000030 65 73 20 32 20 30 20 52 20 0d 2f 53 74 72 75 63 |es 2 0 R ./Struc|
00000040 74 54 72 65 65 52 6f 6f 74 20 34 20 30 20 52 20 |tTreeRoot 4 0 R |
00000050 0d 2f 43 41 50 54 5f 49 6e 66 6f 20 3c 3c 20 2f |./CAPT_Info << /|
00000060 56 20 33 30 31 20 2f 46 53 20 5b 20 28 57 69 6e |V 301 /FS [ (Win|
00000070 67 64 69 6e 67 73 29 28 41 72 69 61 6c 29 28 43 |gdings)(Arial)(C|
00000080 6f 75 72 69 65 72 20 4e 65 77 29 28 54 69 6d 65 |ourier New)(Time|
00000090 73 20 4e 65 77 20 52 6f 6d 61 6e 29 5d 20 2f 4c |s New Roman)] /L|
The ACD has some of the same hex values as the previous version, but with some extra bytes at the beginning, and it looks like the ACP is a straight-up PDF, though it may have some interesting tags, like “CAPT_Info”.
The problem we will face when trying to write a signature for this version of ACD is that the container signature needs a static file name to reference, and it appears the name of the container is also the name of the ACD file within the container. So every file will be different. I wish there was a way in the PRONOM signature syntax to reference an extension and ignore the filename, but currently there is no method to do this. The only thing inside the container which seems to be consistent is the file “FILES.LST”. So let’s take a peek inside of it.
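The comparison between the two ACD generations can be made a little more precise with a few lines of Python. This is only a sketch against the leading bytes shown in the hexdumps in this post, and the helper names are invented for illustration. One thing it surfaces is that the two bytes after “GG” decode to 201 in the 2.0 sample and 301 in the 3.0 sample. Reading that as a version number is my speculation from two samples (the ACP’s “/V 301” is at least suggestive), not something the user guide states.

```python
import struct

# Byte pattern 02 04 47 47 ("..GG"), present in both generations of ACD.
ACD_MARKER = bytes.fromhex("02044747")

# Leading bytes copied from the hexdumps in this post.
ACD2_HEAD = bytes.fromhex("02044747c90086b50100b62702000100")        # Capture 2.0
ACD3_INNER_HEAD = bytes.fromhex("0001000000020447472d019a01000002")  # inner 3.0 ACD
ACP_HEAD = bytes.fromhex(
    "255044462d312e330d25e2e3cfd30d0a"
    "312030206f626a0d3c3c200d2f547970"
    "65202f436174616c6f67200d2f506167"
    "6573203220302052200d2f5374727563"
    "7454726565526f6f7420342030205220"
    "0d2f434150545f496e666f203c3c202f"
)


def marker_info(data: bytes):
    """Return (marker offset, little-endian 16-bit value after it), or None."""
    offset = data.find(ACD_MARKER)
    if offset < 0 or len(data) < offset + len(ACD_MARKER) + 2:
        return None
    (value,) = struct.unpack_from("<H", data, offset + len(ACD_MARKER))
    return offset, value


def looks_like_capture_pdf(data: bytes) -> bool:
    """PDF header plus the /CAPT_Info key seen in the ACP sample."""
    return data.startswith(b"%PDF-") and b"/CAPT_Info" in data


print(marker_info(ACD2_HEAD))            # (0, 201)
print(marker_info(ACD3_INNER_HEAD))      # (5, 301)
print(looks_like_capture_pdf(ACP_HEAD))  # True
```

If the value after the marker really does vary by release, a signature that includes “C900” would only cover files written by that particular 2.x build, which is worth keeping in mind as more samples turn up.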
hexdump -C FILES.LST | head
00000000 5b 41 43 44 31 5d 0d 0a 49 53 43 4f 4d 50 4f 53 |[ACD1]..ISCOMPOS|
00000010 49 54 45 3d 54 52 55 45 0d 0a 4e 55 4d 46 49 4c |ITE=TRUE..NUMFIL|
00000020 45 53 3d 31 0d 0a 46 49 4c 45 4e 41 4d 45 31 3d |ES=1..FILENAME1=|
00000030 43 6f 6e 74 72 61 63 74 2e 61 63 70 0d 0a |Contract.acp..|
OK, there seems to be some static information that is unique to the ACD format. I bet the string “[ACD1]” would be sufficient to make a solid signature.
This is a good example of a format from a well-known company that has become obsolete and all but disappeared, leaving very little information behind. Take a look at my signatures; maybe you have some old ACD files you were unaware of!
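The same idea can be tried outside of PRONOM with Python’s standard zipfile module: open the container, look for FILES.LST, and check that it starts with “[ACD1]”. The sketch below is a rough equivalent of the container signature, not the signature itself. The function names are hypothetical, and since I can’t embed the real sample here, the demo builds a stand-in ZIP around the FILES.LST bytes from the hexdump above.

```python
import io
import zipfile
from typing import Optional

# Contents of FILES.LST, transcribed from the hexdump above (62 bytes).
FILES_LST = b"[ACD1]\r\nISCOMPOSITE=TRUE\r\nNUMFILES=1\r\nFILENAME1=Contract.acp\r\n"


def read_files_lst(container: bytes) -> Optional[bytes]:
    """Return the FILES.LST member of a ZIP container, or None if unavailable."""
    try:
        with zipfile.ZipFile(io.BytesIO(container)) as zf:
            if "FILES.LST" not in zf.namelist():
                return None
            return zf.read("FILES.LST")
    except zipfile.BadZipFile:
        return None


def looks_like_acd3(container: bytes) -> bool:
    """ZIP container whose FILES.LST begins with the [ACD1] section header."""
    lst = read_files_lst(container)
    return lst is not None and lst.startswith(b"[ACD1]")


def parse_files_lst(lst: bytes) -> dict:
    """Split the KEY=VALUE lines of FILES.LST into a dictionary."""
    entries = {}
    for line in lst.decode("ascii", errors="replace").splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            entries[key] = value
    return entries


def build_stand_in() -> bytes:
    """Build a small ZIP that mimics the sample's layout (contents are dummies)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("FILES.LST", FILES_LST)
        zf.writestr("Contract.acd", b"")
        zf.writestr("Contract.acp", b"")
    return buf.getvalue()


print(looks_like_acd3(build_stand_in()))
print(parse_files_lst(FILES_LST))
```

Keying on FILES.LST sidesteps the variable member names entirely, which is the same reason it works as the anchor for a container signature.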
The ability to accommodate variable filepaths in a container signature would be a dream!
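For comparison, this is roughly what such a match looks like when you are free to write code instead of a signature: list the members of the ZIP and test each name against a wildcard. It is a sketch using only the Python standard library. `members_matching` and `build_stand_in` are made-up names, and the container is an empty stand-in with the same three member names as the sample.

```python
import fnmatch
import io
import zipfile


def members_matching(container: bytes, pattern: str = "*.acd"):
    """List ZIP member names matching a wildcard, ignoring case."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return [
            name
            for name in zf.namelist()
            if fnmatch.fnmatchcase(name.lower(), pattern.lower())
        ]


def build_stand_in(names):
    """Build an in-memory ZIP with empty members, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()


container = build_stand_in(["FILES.LST", "Contract.acd", "Contract.acp"])
print(members_matching(container))           # ['Contract.acd']
print(members_matching(container, "*.acp"))  # ['Contract.acp']
```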