With New Year’s Eve fast approaching, it seems like an especially appropriate time to reflect on the past. Amid the various efforts to make basic government-published legal data more accessible (especially in open, machine-readable formats), I think it helps to also remember how far things have come. Over at the company blog for Citation Technologies, a long-time employee, David Gottlieb, recounts the difficulties the company faced as a legal publishing start-up back in the mid-nineties. David’s post contains various anecdotes, including how the company’s only server was thrown off the roof of the office in a business dispute, but it is also a reminder of the common thread between the challenges of the mid-nineties and today: access to usable information.
Although you commonly hear complaints about government entities that publish information in not-so-open proprietary formats such as PDF files and how a standardized XML markup would be so much better, David’s post explains the difficulties of obtaining any electronic version of state regulations back in the nineties. He also mentions manually correcting bad OCR scans, navigating magnetic tapes, and how a southern state once informed them that there was no way they had regulations in any electronic format because they had only just gotten electric typewriters. It is easy to think that the push for usable government data is new, but it’s actually a long-term effort in which the meaning of “usable” keeps evolving.
I hope a few years from now we will be arguing about how the government should provide deeper semantic markup and look back on the days of PDFs and Word documents in the same way we look at the magnetic tapes and OCR corrections in David Gottlieb’s post.