Wednesday, November 30, 2011

DARPA's Shredder Challenge - almost won already?

Readers may be aware that the Defense Advanced Research Projects Agency (DARPA) is sponsoring what it calls the 'Shredder Challenge'.

Today’s troops often confiscate the remnants of destroyed documents in war zones, but reconstructing them is a daunting task. DARPA’s Shredder Challenge calls upon computer scientists, puzzle enthusiasts and anyone else who likes solving complex problems to compete for up to $50,000 by piecing together a series of shredded documents.

The goal is to identify and assess potential capabilities that could be used by our warfighters operating in war zones, but might also create vulnerabilities to sensitive information that is protected through our own shredding practices throughout the U.S. national security community.

. . .

The Shredder Challenge is comprised of five separate puzzles in which the number of documents, the document subject matter and the method of shredding will be varied to present challenges of increasing difficulty. To complete each problem, participants must provide the answer to a puzzle embedded in the content of the reconstructed document.

There's more at the link.

According to DARPA, there are more than 8,200 registered participants, and the challenge's puzzles have been downloaded more than 72,000 times. It looks like at least one team is getting very close, too, judging by this 'tweet' from the project two days ago:

I'll be watching with interest to see who wins, and how well they do . . . but I'm also puzzled. Germany seems to have sorted out this problem some time ago. I wrote about it in March 2010, in connection with the reconstruction of shredded Stasi files. So, is a German team winning DARPA's challenge? Or has a US team 'borrowed' the German technology? Or has someone - or a team of someones - developed a new, alternative technology to do the same thing? Inquiring minds want to know!

Meanwhile, in case they get this right, I daresay it's time for me to invest in a burn barrel for the back yard, and incinerate my shredded documents in it . . .



Tim said...

All our documents go into the compost. Having just emptied the compost bin into the garden heap I can't see anyone wanting to sift through the contents - apart from the worms and slugs.

Ace said...

Looking at the DARPA puzzles, I don't think I see any that are micro-shredded (but I didn't download all of them).

There are three basic types of personal shredders: strip shredders, in which the paper is cut into strips (higher end strip shredders create more, thinner strips); cross cut shredders, which first create thin strips (as with strip shredders), then cross cut the strips into pieces; and micro shredders, which perform much like cross cut shredders, but with much finer strips and much shorter cross cuts.

The output from strip shredders is hardly a challenge to reassemble, and the average cross cut shredder adds only a couple of hours per page for manual reassembly.

Micro shredders, however, are a different animal. A good personal micro shredder produces pieces not much over 1/16 inch wide and about 3/16 inch long. A very good micro shredder reduces those sizes by about 1/3.
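To put rough numbers on those particle sizes, here is a back-of-the-envelope sketch. It assumes a US Letter page (8.5 x 11 inches) shredded edge to edge, and it reads "reduces those sizes by about 1/3" as each dimension shrinking by a third; both are assumptions for illustration, and the function name is made up.

```python
import math
from fractions import Fraction

def pieces_per_page(piece_w_in, piece_h_in,
                    page_w_in=Fraction(17, 2), page_h_in=Fraction(11)):
    """Approximate particle count for one page (assumed US Letter, no margins)."""
    strips = math.ceil(page_w_in / piece_w_in)          # cuts across the width
    pieces_per_strip = math.ceil(page_h_in / piece_h_in)  # cross cuts along the length
    return strips * pieces_per_strip

# "Good" micro shredder: about 1/16 inch wide by 3/16 inch long.
good = pieces_per_page(Fraction(1, 16), Fraction(3, 16))

# "Very good" micro shredder: assumed to be each dimension reduced by one third.
shrink = Fraction(2, 3)
very_good = pieces_per_page(Fraction(1, 16) * shrink, Fraction(3, 16) * shrink)

print(good, very_good)
```

Under those assumptions a single page comes out at roughly 8,000 particles for the good shredder and roughly 18,000 for the very good one.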

I'm going to make an educated guess here and assume the reconstruction software the challengers are using looks first at piece shapes, then for matching edges among those shapes, and - probably - for coherent content as a final step, once shapes and edges are matched.
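The edge-matching step of that guess can be illustrated with a toy example. This is only a sketch of the general idea on a synthetic strip-shredded "page" of made-up pixel values; it is not what any challenge team is known to use, it skips the shape and content steps entirely, and all function names are hypothetical.

```python
import random

def cut_into_strips(page, strip_width):
    """Split a page (a list of pixel rows) into vertical strips."""
    n_cols = len(page[0])
    return [[row[c:c + strip_width] for row in page]
            for c in range(0, n_cols, strip_width)]

def edge_cost(left_strip, right_strip):
    """Sum of absolute pixel differences along the seam where two strips would touch."""
    return sum(abs(l_row[-1] - r_row[0])
               for l_row, r_row in zip(left_strip, right_strip))

def greedy_order(strips):
    """Try each strip as the leftmost one and keep appending the cheapest unused neighbour."""
    best_order, best_total = None, None
    for start in range(len(strips)):
        order, total = [start], 0
        unused = set(range(len(strips))) - {start}
        while unused:
            nxt = min(unused,
                      key=lambda j: (edge_cost(strips[order[-1]], strips[j]), j))
            total += edge_cost(strips[order[-1]], strips[nxt])
            order.append(nxt)
            unused.remove(nxt)
        if best_total is None or total < best_total:
            best_order, best_total = order, total
    return best_order

# Synthetic 4 x 12 "page": brightness rises smoothly from left to right,
# so strips that really were neighbours have very similar touching edges.
page = [[10 * c + r for c in range(12)] for r in range(4)]

shuffled = cut_into_strips(page, 2)
random.Random(0).shuffle(shuffled)   # simulate the jumbled shredder bin

order = greedy_order(shuffled)
rebuilt = [sum((shuffled[i][r] for i in order), []) for r in range(len(page))]
print(rebuilt == page)
```

A greedy pass like this works on a clean toy input; real scans have noise, blank margins and near-identical edges, which is presumably where shape and content checks would have to come in.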

In preventing reconstruction of documents, volume is your friend. In theory, even a single micro shredded page could be reconstructed, either manually or by automation. A thousand pages increases the degree of difficulty enormously.

It is not unheard of to employ "shredder sheets" composed of meaningless text to accompany real documents through a shredder. If a sandwich is made with a real document page between two shredder sheets, and the sheets are the same type of paper and carry intelligible, but meaningless, text in the same typeface, the number of pieces facing the image-based reconstruction has tripled. Even if that reconstruction is reasonably successful, the content analysis problem has also grown substantially. (It's not a full threefold jump, because at some point the content analysis process will determine that a certain amount of the reconstructed content is a J.D. Salinger book, or a Shakespeare play, or whatever, leading to automatically discarding a percentage of pieces once the algorithms determine that pattern exists.)
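A rough way to see why volume helps: if a naive matcher had to consider every pair of pieces, the work would grow with the square of the piece count. The sketch below assumes about 8,000 pieces per micro-shredded page (roughly a Letter page cut into 1/16 by 3/16 inch particles) and a brute-force all-pairs comparison; both are illustrative assumptions, and real software would prune far more cleverly.

```python
def candidate_pairs(n_pieces):
    """Number of distinct piece pairs a brute-force matcher would have to consider."""
    return n_pieces * (n_pieces - 1) // 2

PIECES_PER_PAGE = 8000  # assumed figure for a micro-shredded page

single = candidate_pairs(1 * PIECES_PER_PAGE)       # one real page
sandwich = candidate_pairs(3 * PIECES_PER_PAGE)     # real page plus two shredder sheets
thousand = candidate_pairs(1000 * PIECES_PER_PAGE)  # a thousand pages in one bin

for label, pairs in (("1 page", single), ("3 pages", sandwich), ("1000 pages", thousand)):
    print(f"{label:>10}: {pairs:,} candidate pairs, {pairs / single:,.0f}x one page")
```

So under these assumptions the sandwich triples the pieces but multiplies the naive pairwise work by about nine, and a thousand pages multiplies it by about a million.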

The challenge in shredding is to make reconstruction intractable. Just as software encryption is designed to make decrypting a message so time consuming that the information has no value by the time a solution is reached, so paper shredding seeks the same goal. (Good RSA-based encryption, say at 1024-bit or higher strength, can in theory be decrypted by throwing the massed computing power of the entire world at it, but the solution would take longer than Earth will exist. Quantum computing, whenever it's sufficiently developed, may radically change that, just as it may also reduce the time required for image- and content-based reconstruction of shredded documents.)

For most of us a micro-shredder will more than meet our personal security needs; adding shredder sheets improves the security, as does randomly distributing the shredder output among multiple disposal paths. It's highly improbable that anyone short of the NSA and its minions would have the resources, or desire, to attempt a reconstruction of our micro-shredded credit card statements, especially if our neighbors don't bother shredding anything, or use low security shredder technology.