Hacking against the Ministry of Truth

This is a LazyWeb request.

Many people think the Internet Archive is a reliable tool for keeping governments and other organizations honest. They think that if items are altered or removed on a Web site (so as to maintain apparent consistency between a government’s past statements and current reality, for example), the Wayback Machine can be relied on to retain the previous versions of those items.

Unfortunately this is not true, for three reasons.

  1. The Internet Archive does not have working copies of items that require interaction with a server. These include streaming audio and video.
  2. Even for static items, a site's owner can instruct the Internet Archive to stop archiving them (and to delete previously archived versions) using robots.txt or a similar exclusion request.
  3. The Internet Archive’s coverage of the Web is not thorough, so it may miss some revisions. Furthermore, it is a single Web site and therefore vulnerable to lawsuits and other attacks.
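
To make the second reason concrete, here is a minimal sketch of how a robots.txt rule shuts out a single crawler while leaving ordinary browsers alone. It uses Python's standard urllib.robotparser. The robots.txt contents, the example.org URL, and the allowed() helper are all made up for illustration, and "ia_archiver" is assumed to be the name the archive's crawler answers to. The sketch shows only the exclusion check; deleting already archived copies is a policy choice by the archive, not part of the robots.txt protocol.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one archiving crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.example.org/press/statement.html"
    print("archiver allowed:", allowed("ia_archiver", page))
    print("browser allowed: ", allowed("Mozilla/5.0", page))
```

A crawler that honors this file never sees the page at all, so a later change to it goes unrecorded.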

Democracy-minded hackers could develop Web site archiving software that solves most of these problems. Specifically, it would:

  1. incorporate software for saving QuickTime, Real, Windows Media, and Ogg streams (even if embedded in Flash or Shockwave objects) as playable static files;
  2. ignore robots.txt, and disguise itself perfectly as a random popular Web browser (this includes not requesting items too hurriedly or too predictably);
  3. operate only on individually specified Web sites (like www.whitehouse.gov), using a BitTorrent-like protocol to cooperate with other clients in building up an archive of the site and its revisions (so that it cannot be blocked by IP address).
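
Two pieces of this wish list are small enough to sketch: keeping every distinct revision of a page, and spacing requests so they are neither hurried nor predictable. The following is only an illustration under my own assumptions, not a design for the real tool. RevisionArchive, next_delay, and the 30-second and 2-second timing values are hypothetical names and numbers. Fetching, stream capture, and the peer-to-peer protocol are left out entirely.

```python
import hashlib
import random
from dataclasses import dataclass, field


@dataclass
class RevisionArchive:
    """In-memory store that keeps every distinct version seen for each URL."""
    blobs: dict = field(default_factory=dict)    # SHA-256 hex digest -> page bytes
    history: dict = field(default_factory=dict)  # url -> list of (timestamp, digest)

    def record(self, url: str, body: bytes, timestamp: float) -> bool:
        """Add a snapshot; return True only if it differs from the latest one."""
        digest = hashlib.sha256(body).hexdigest()
        revisions = self.history.setdefault(url, [])
        if revisions and revisions[-1][1] == digest:
            return False  # unchanged since the last visit
        self.blobs.setdefault(digest, body)  # identical content is stored once
        revisions.append((timestamp, digest))
        return True

    def versions(self, url: str) -> list:
        """Return the stored bodies for url, oldest first."""
        return [self.blobs[digest] for _, digest in self.history.get(url, [])]


def next_delay(rng: random.Random, mean_seconds: float = 30.0,
               floor_seconds: float = 2.0) -> float:
    """Pick a wait before the next request: a fixed floor plus random jitter."""
    # Exponentially distributed jitter avoids a fixed, easily spotted interval.
    return floor_seconds + rng.expovariate(1.0 / mean_seconds)


if __name__ == "__main__":
    archive = RevisionArchive()
    url = "https://www.example.org/statement.html"  # placeholder URL
    archive.record(url, b"We will do X.", timestamp=1.0)
    archive.record(url, b"We will do X.", timestamp=2.0)  # no change, skipped
    archive.record(url, b"We never said X.", timestamp=3.0)
    print(len(archive.versions(url)), "revisions kept")
    print("wait %.1f seconds before the next request" % next_delay(random.Random()))
```

Keying stored pages by their digest would also suit the cooperative part of the idea, since a client could check a copy received from a peer against the digest it asked for.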

I apologize for making this request now — too late for some uses — rather than when I first thought of it.
