CSI: Rare Book School, or Computer Forensics in the Archives

Editor’s Note: Jennifer Howard spent a week in early July at the University of Virginia’s Rare Book School, taking a course on “Born-Digital Materials: Theory & Practice.” This is the third in a series of posts on the experience. See Part 1 and Part 2 for more.

Charlottesville, Va. — Meet FRED, also known as a Forensic Recovery of Evidence Device. A black tower with an array of ports and drives, FRED is a forensic workstation sold by a company called Digital Intelligence, based in New Berlin, Wis.

Tall and sleek if not exactly handsome, FRED has presence. “One of the attractions of digital forensics is that the toys are great,” an instructor, Matthew G. Kirschenbaum, told us in a session on archives and computer forensics.

FRED also comes highly touted by its manufacturer. “FRED systems are optimized for stationary laboratory acquisition and analysis,” the company’s Web site promises. “Simply remove the hard drive(s) from the suspect system and plug them into FRED and acquire the digital evidence.” FRED’s drives can read various kinds of storage media, and the machine offers an array of ports (read/write FireWire and USB ports, plus five read-only, or write-blocked, ports).

All of that comes in handy if you’re investigating tech security breaches and cybercrime, or if you’re an archivist working with born-digital material. If a donor gives an institution a collection of floppy diskettes and outmoded laptops, FRED can help safely inventory and access what’s in or on that media and hardware without damaging or altering anything.

FRED also costs nearly $6,000. Luckily, there are much cheaper solutions for many problems archivists encounter in handling born-digital materials. Mr. Kirschenbaum produced a 5.25-inch floppy disk (remember those?) and asked us how we could get the contents onto a modern laptop with no disk drive. His bargain-basement solution: Use an FC5025 floppy controller that connects to a stand-alone floppy drive with a ribbon cable. A USB port on the FC5025 means that you can read yesterday’s disk format on today’s computers.

“This is not wizardry, this is not rocket science,” Mr. Kirschenbaum told us. “This is a very straightforward mechanical procedure.” With judicious purchases on eBay, and maybe a lucky raid on a stash of old machinery in a campus closet, archivists or scholars can improvise a lot of computer-forensic solutions on the cheap. “This is technology and a capability that is well within the reach of a small, not terribly well resourced institution or even an individual,” he said, as long as you know or can guess a disk’s operating-system format.

Getting access to a machine or storage media is only the beginning. An investigator or archivist then has to catalog the contents and figure out what to do with them. Forensic software can identify and sort out sensitive personal information: credit-card numbers, IP addresses, e-mail addresses, and so on. Running that kind of assessment should reassure donors, not alarm them, Mr. Kirschenbaum said.
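For readers curious what that kind of scan looks like under the hood, here is a minimal sketch in Python. It is not how any particular forensic package works; the patterns, the function names (`luhn_ok`, `scan_text`), and the sample text are all assumptions made for illustration, and real tools use far more careful rules.

```python
import re

# Deliberately simple, illustrative patterns (assumptions, not a standard).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(text: str) -> dict:
    """Report e-mail addresses, IPv4 addresses, and plausible card numbers."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # The checksum weeds out most random runs of digits.
        if 13 <= len(digits) <= 16 and luhn_ok(digits):
            cards.append(digits)
    ips = [
        ip for ip in IPV4_RE.findall(text)
        if all(int(part) <= 255 for part in ip.split("."))
    ]
    return {"emails": EMAIL_RE.findall(text), "ipv4": ips, "cards": cards}


if __name__ == "__main__":
    # Made-up sample text; 4111 1111 1111 1111 is a widely used test number.
    sample = ("Contact jane@example.org from 192.168.1.10; "
              "card 4111 1111 1111 1111, not 1234 5678 9012 3456.")
    print(scan_text(sample))
```

The output is a report of what was found, not a copy of the material, which fits the use Mr. Kirschenbaum described: telling donors what is on their machines.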

“The point is not to take their credit-card information and put it on the World Wide Web,” he said. “It’s precisely so you can tell them what’s on their computer.”

Like the hardware involved, forensic software runs the price gamut, from high-end commercial options to freely available alternatives. Mr. Kirschenbaum and Naomi Nelson, the other leader of the class, work with BitCurator, a joint University of North Carolina-University of Maryland project “to develop a system for collecting professionals that incorporates the functionality of many digital-forensics tools.”

Mr. Kirschenbaum walked us through disk imaging: making a bitstream, or mirror-image, copy of whatever storage media you’re working with, including unallocated space. For criminal or archival evidence, this goes well beyond just copying the logical contents of a file from one place to another. We learned about hex representation and checksums and hashing, which uses an algorithm to compute a short fingerprint of a string of bits, so that two copies can be confirmed as identical by comparing their fingerprints.
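A small Python sketch can make hex representation and hashing concrete. It is only an illustration, not the tooling used in class: the helper names (`hex_dump`, `sha256_hex`) and the sample bytes are assumptions, and SHA-256 from Python's standard `hashlib` module stands in for whichever algorithm a given forensic tool uses.

```python
import hashlib


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable characters."""
    pad = width * 3 - 1  # width of a full row of hex pairs plus spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    original = b"born-digital sample\n"  # made-up sample bytes
    copy = bytes(original)                # a faithful copy
    altered = bytearray(original)
    altered[0] ^= 0x01                    # flip a single bit

    print(hex_dump(original))
    print("copy matches:   ", sha256_hex(original) == sha256_hex(copy))
    print("altered matches:", sha256_hex(original) == sha256_hex(bytes(altered)))
```

A faithful copy yields the same digest as the original, while changing even one bit yields a different one. That is why a matching hash is treated as evidence that nothing has changed.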

“This is a way of ascertaining that the digital object has not been tampered with in any way,” he explained. “This is the cornerstone, the bedrock of professional legal forensics.” (For a nifty example of textual scholarship enabled by computer forensics, see the work done by Doug Reside, digital curator at the New York Public Library, on Jonathan Larson’s Rent files.)

In class, we reviewed the basics of a digital-forensics workflow: Sanitize your target drive to make sure there are no ghosts of old data on it (the computer equivalent of washing your hands before you eat); document the original hardware and media before you do anything with them; image the media with write-blocking solutions in place so that you don’t accidentally contaminate it with your own data; use hashing to authenticate the image you create; and be sure you document the whole process.
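The imaging, hashing, and documentation steps of that workflow can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the class's procedure or BitCurator's implementation: the function names (`file_sha256`, `image_and_verify`, `demo`), the record fields, and the stand-in "source" file are all hypothetical. Opening a source read-only in software is also no substitute for a hardware write blocker.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone

CHUNK = 1024 * 1024  # read in 1 MiB blocks so large media fit in memory


def file_sha256(path: str) -> str:
    """Hash a file on disk in chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def image_and_verify(source_path: str, image_path: str, note: str = "") -> dict:
    """Copy every byte of the source to an image file and log the result."""
    source_hash = hashlib.sha256()
    # "rb" never writes to the source; "xb" refuses to overwrite an image.
    with open(source_path, "rb") as src, open(image_path, "xb") as dst:
        for block in iter(lambda: src.read(CHUNK), b""):
            source_hash.update(block)
            dst.write(block)
    record = {
        "source": source_path,
        "image": image_path,
        "bytes": os.path.getsize(image_path),
        "source_sha256": source_hash.hexdigest(),
        "image_sha256": file_sha256(image_path),  # re-read the image to check it
        "imaged_at_utc": datetime.now(timezone.utc).isoformat(),
        "note": note,  # free-text description of the original media
    }
    record["verified"] = record["source_sha256"] == record["image_sha256"]
    return record


def demo() -> dict:
    """Run the sketch against a temporary stand-in for real media."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.bin")
        with open(source, "wb") as f:
            f.write(os.urandom(4096))  # random bytes standing in for a disk
        image = os.path.join(tmp, "disk.img")
        return image_and_verify(source, image, note="hypothetical test media")


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

On real media the source path would be a device reached through a write blocker, and the record would be saved alongside the image as part of the documentation trail.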

Done right, digital forensics creates trust and “removes a lot of doubt and anxiety from the donor’s mind,” Mr. Kirschenbaum told us.

Correction, 8/24: This post originally misstated the size of the floppy disks the FC5025 is used to read. They’re 5.25-inch disks.
