Virtual Gutenberg 

An Oakland company is quietly preserving our intellectual history -- with technology

Wednesday, Dec 26 2001
Above Jack London Square's Bed Bath & Beyond, where you can buy a floral photo album for $9.99, a small band of dedicated book-lovers works to preserve some of the most valuable volumes in existence. Octavo Corp. and its staff of eight have revolutionized the conservation and accessibility of rare books, using technology in the service of history. This month they're starting work on the most famous book in the U.S., the Library of Congress' pristine copy of the Gutenberg Bible.

Through a combination of hardware -- lights, cameras, and a lot of servers -- and software, the company produces digital reproductions of rare books, which it then sells to consumers, libraries, and scholars for $25 to $75 per CD-ROM. Among its 26 finished products are versions of Newton's Opticks, the "First Folio" edition of Shakespeare's plays, and Chaucer's Works (from William Morris' exquisite Kelmscott Press edition). Because the books are photographed with high-resolution digital cameras rather than captured on a flatbed scanner, the images are immensely detailed; as a result, you can zoom in tight, enlarging (without pixelation) up to 600 percent on every page. In addition, the company often adds what it calls "live text," a transcribed copy of the words placed directly behind the image, which can be searched, copied, and printed just like any Word document. The result of all this effort is that regular folks like you and me can read, scrutinize, study, and admire the great books of yesteryear, even if we can't afford to collect them.

In his book Double Fold: Libraries and the Assault on Paper, Nicholson Baker rants about the destruction of books and newspapers in the service of "preservation." In his passionate, rambling tirade, he argues for keeping originals around for posterity. The problem with this idea, of course, is that rare, precious books can't be checked out of the local library and taken to Starbucks for a quick browse over a latte. To make them accessible for study (and for viewing by people in other locations), they have to be reproduced somehow. Baker mentions Octavo briefly at the end of his rant, describing its work this way: "[T]he resulting scans have a serene luminosity and depth of detail." This paragraph is one of the few positive things Baker has to say about anything in his 288-page book. The praise is deserved, but too muted: Baker should have jumped up and down and shouted to the rafters. In my opinion, Octavo is quietly saving our intellectual history.

In June Octavo was asked by the Library of Congress to digitize its Gutenberg Bible. One of only three perfect (i.e., complete) vellum copies in the world, the book is held in a cold storage vault under extremely tight security. That means very few people will ever set eyes on the real thing, much less flip its pages. But the folks at Octavo begin shipping equipment to the library this month, with the goal of setting up an on-site lab; Senior Technologist Hans Hansen hopes to have the imaging done by mid-February. The pictures could be online as soon as March and available for purchase by the fall, depending on the library's plans. Though the British Library offers a scanned version of its Gutenberg Bible copies, Octavo's edition, with live text and accompanying commentary, should be far superior.

Octavo was founded in 1997 by John Warnock (now Octavo's chairman of the board) of Adobe Systems Inc., the software company. E.M. Ginger (now executive editor) and Hans Hansen, Octavo's second and third employees, both contracted with Adobe Press. The connection is significant: You access Octavo's books through Adobe's Acrobat software. It's also important that so many of the principals have technology backgrounds, because it takes some high-tech dedication to archive rare books digitally. Stored digital data degrades over time, so Octavo holds copies of its master images in servers around the world; automated software then checks one image against another, looking for and correcting any errors that may crop up. In addition, CD-ROMs have a predicted life span of only 10 to 15 years, so Hansen must constantly make duplicates.
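The cross-checking described above can be pictured with a small sketch. The article says only that automated software compares one copy against another and corrects errors; the SHA-256 fingerprints, the majority-vote rule, the function names, and the site names below are all assumptions made for illustration, not Octavo's actual method.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Fingerprint one stored copy of a master image."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Compare every copy against the others and overwrite outliers.

    `replicas` maps a (hypothetical) server site name to the bytes it
    holds. The copy held by a strict majority of sites is assumed to
    be correct. Returns the names of the sites repaired in place.
    """
    if not replicas:
        raise ValueError("no replicas to audit")

    digests = {site: digest(data) for site, data in replicas.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        # No copy is trusted by more than half the sites.
        raise ValueError("no majority copy; manual review needed")

    good_site = next(s for s, d in digests.items() if d == majority_digest)
    repaired = []
    for site, d in digests.items():
        if d != majority_digest:
            replicas[site] = replicas[good_site]  # restore from a good copy
            repaired.append(site)
    return repaired


# Usage: three hypothetical sites, one of which holds a damaged copy.
master = b"page-001 image bytes"
copies = {"oakland": master, "london": master, "tokyo": b"page-001 image bytez"}
print(audit_and_repair(copies))
```

A majority vote needs at least three copies to settle a disagreement, which is one plausible reason to keep masters on several servers rather than two.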

The process of digitally reproducing a book, however, is surprisingly simple. A photographer sets the book on a cradle made from acid-free paper, itself set on a light table. Cool white lights shine onto the book from above at a slight angle. A 4-inch-by-5-inch camera with a high-resolution digital scanner at its back -- similar to the apparatus used by forensics teams and fine-art photographers -- hangs over the book. After much checking and testing, the next step, as Ginger describes it, is "push the button, turn the page, push the button, turn the page." The process is so straightforward that Octavo sells portable labs to libraries so they can do the work themselves. Images from Octavo's shoots -- as well as those from portable labs, if the libraries choose -- speed directly to a server, where the real magic begins.

Ginger's office faces a bright window, which makes it somewhat difficult to see the screen of her PowerBook G4 notebook as she pages through Octavo's edition of Horae Beatae Mariae ad usum Romanum, an illuminated book of hours from the Library of Congress' Rosenwald Collection. Zooming in close, I catch the fine cracks in the maroon leather binding, which is gorgeously tooled in gold. Vivid illustrations burst off the screen, despite the glare. The texture of the vellum pages, the faint rule lines under the calligraphy, every tiny stain and discoloration -- it's all clear. We search on the word "Amen" and a spread pops up with the word highlighted. Of course, the Horae Beatae is in Latin, so we can't search in English, but a translation comes with the CD-ROM, as does an essay on illuminated manuscripts by an expert, Donald Jackson.

Given how cheaply the company sells the finished product -- you can download read-only versions of many books for just $10 -- how can it survive? Octavo has been surprisingly savvy in this regard. It not only sells CD-ROMs and portable labs, but it also consults on preservation issues with librarians and conservationists and subcontracts its own labs and storage servers for libraries around the world working to digitize their collections. The biggest expense was likely developing the software in the first place, and that's where the Adobe background (not to mention its funds -- Warnock was a co-founder of Adobe Systems and a rare book collector) came in handy. The company's goal is to create an unparalleled digital library composed of the seminal works in every field, from science and engineering to art, architecture, and literature.

Libraries in the physical world have an impossible mission: Safeguard old books, but make them accessible to patrons. If making them accessible means letting any old slob handle them, such books will soon be too damaged to enjoy. Scanning books works OK, but you can't zoom in very close, you can't search on the text, and no one's being especially careful about archiving such images. Octavo's system seems to solve all of these problems at once, in an affordable way. In, say, 10 years, we could all have access to any rare book we want. It's an amazing prospect, I swear it -- with my left hand on a Gutenberg Bible.

About The Author

Karen Silver

