By Jay Velgos
It's difficult to imagine anything that's grown as fast
in scope and sheer magnitude as the Internet.
But there is one thing that's grown even faster:
users' expectations of the Internet.
Great Expectations
Only a few years ago here at the Texas State Archives, many historical researchers were more or less content to write letters to us on actual paper (aka snail mail) to request photocopies ("hard copy") of particular documents, for which they were prepared to wait almost a full week before calling and checking on the status of their order.
Today, we receive e-mails that basically read something like this: "Dear State Archives, I've searched all through your web site and have not been able to locate my great-great grandfather's military service record. Please tell me where I may find it. Thank you very much."
Or "Dear State Archives: Please e-mail me the 1880 census. Thanks."
In other words, the public not only wishes to see State Archives resources in digital form, they expect it. Now.
In July 1998, the Texas State Library and Archives Commission unveiled "Texas Treasures," a new web site featuring digitally scanned images of important resources from the State Archives' vast holdings. Among the "treasures" available for viewing online are William B. Travis's famed letter from the Alamo; county-by-county vote totals from the 1845 Texas referendum on annexation; a collection of flags from the Texas Revolution and the Civil War; and images from the Historic Map Archive.
Snazzy logo notwithstanding, the Texas Treasures web site is actually just a showcase of several pilot projects that the TSLAC has undertaken to explore digital imaging technologies. Without a budget for a full-scale imaging program, these pilot projects are limited in scope, but useful for assessing the capabilities and limitations of various technologies.
For this article, I will describe the process of getting high-resolution images of 40 historic maps available online through the TSLAC's web site. This project is representative of the technologies involved, the difficulties presented, and the lessons I have learned along the way.
Bang the Drum Slowly
Our map imaging program began with an unexpected offer. A cartographer from another state agency had expressed a professional and personal interest in examining certain historic maps, in order to compare them with modern topographic maps and satellite images. An arrangement was made for him to scan a number of large maps, under secure conditions, on his agency's $100,000 drum scanner, and provide us with the image files. Unlike a flatbed scanner, where an item is placed face down on a plate of glass and exposed to a light source and scanning elements, a drum scanner requires the item to be fastened to a revolving drum, where a laser reads the image point by point. To our relief, we quickly determined that the maps could be adequately scanned while still encased in their protective, transparent mylar film sleeves. Had this not been the case, the experiment would have ended immediately, as subjecting the fragile paper maps to a high-speed mechanical process would have exposed the maps to far more risk than we could permit.
The good news is that this arrangement ultimately yielded high-resolution, full-color images of 40 maps from our collection in the industry-standard TIFF format. To our chagrin, however, each image file ranged from 20 to 40 megabytes. We hadn't anticipated files of this size. After all, how could we make use of files that are too big to download on a standard modem, and likely to crash all but the most maxed-out computers? Next step: getting those file sizes down. Way down.
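To put those file sizes in perspective, here is a minimal sketch of the download-time arithmetic. The modem speeds (28.8 and 56 kilobits per second) are assumptions chosen for illustration, since "standard modem" is not defined above, and the calculation ignores protocol overhead and dropped connections, so real transfers would be slower.

```python
# Rough download-time estimate. The modem speeds are assumptions for
# illustration; they are not figures taken from the article.

def download_minutes(size_megabytes, modem_kbps):
    """Minutes to transfer a file at an ideal, uninterrupted line rate."""
    bits = size_megabytes * 1024 * 1024 * 8   # file size in bits
    seconds = bits / (modem_kbps * 1000)      # modem rates are in kilobits per second
    return seconds / 60

for size in (20, 40):           # megabytes, the range of the original TIFF files
    for speed in (28.8, 56):    # assumed modem speeds in kilobits per second
        minutes = download_minutes(size, speed)
        print(f"{size} MB at {speed} kbps: about {minutes:.0f} minutes")
```

Under these assumptions, even the smallest file on the faster modem takes the better part of an hour, and the largest file on the slower modem takes more than three hours.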
Smaller. . . Smaller. . . Not That Small!!!
I fired up my Power Macintosh, launched Adobe PhotoShop, and got to work. I was guided by two almost antithetical objectives: I wanted small file sizes that could be downloaded and viewed through a web browser; and I also required that the smallest details--usually the fine print to identify rivers--would still be readable. As I systematically reduced the resolution from the original 200 pixels per inch (ppi), I zoomed in until the finest details were still readable, but just barely. This step reduced the file sizes, but never more than 15 percent or so.
It turned out, the real culprit for the huge file sizes was the color palette. The "palette" describes the range of unique colors available for an image to display. The larger the palette, the more bits of memory required to describe each individual pixel of that image. A black and white image (without any gray) requires only a single bit to define each pixel, either totally black or totally white. To display a palette of 256 colors (which is not even adequate for most color photos) requires 8 bits to describe each pixel. The map images had been scanned to a bit depth of 16, permitting a palette of more than 65,000 colors. While this is an effective palette for most color photos, I determined that this was an opportunity to significantly reduce the file sizes.
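The relationship between bit depth, palette size, and file size can be sketched in a few lines. This is only an illustration of the arithmetic: the map dimensions (18 by 24 inches) are hypothetical, and the sizes are for uncompressed pixel data, which may differ from what a given TIFF file actually occupies on disk.

```python
def palette_size(bits_per_pixel):
    """Number of distinct colors that a given bit depth can describe."""
    return 2 ** bits_per_pixel

def uncompressed_megabytes(width_px, height_px, bits_per_pixel):
    """Size of the raw pixel data, ignoring headers and any compression."""
    total_bits = width_px * height_px * bits_per_pixel
    return total_bits / 8 / (1024 * 1024)

# Hypothetical map: 18 x 24 inches scanned at 200 pixels per inch.
width_px = 18 * 200
height_px = 24 * 200

for bits in (1, 8, 16):
    colors = palette_size(bits)
    size = uncompressed_megabytes(width_px, height_px, bits)
    print(f"{bits:2d} bits per pixel: {colors:6d} colors, about {size:.1f} MB")
```

Because the size of the pixel data is proportional to the bit depth, halving the bits per pixel halves the uncompressed size without touching the resolution.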
Reducing the range of colors would be an acceptable sacrifice for retaining fine detail on the maps. So just as I had systematically reduced the resolution on each map, I began to systematically reduce the color palettes, a few bits at a time: 14 bits (16,384 colors); 12 bits (4,096 colors); 10 bits (1,024 colors); 8 bits (256 colors); 7 bits (128 colors); and 6 bits (64 colors). I was surprised at how much difference a single bit could make: at one setting, a map would appear normal; at one setting lower, all the colors would shift and the map would take on a garish, fluorescent character. (During this process, I made regular use of PhotoShop's "Undo" feature.) I found that some of the maps could be reduced to 64 or even 32 colors with little change, while others required 1,024 or more colors to maintain any resemblance to the original.
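One way to see why colors can suddenly shift is a toy version of color reduction. The sketch below uses simple uniform quantization of each 8-bit color channel; that is an assumption made for illustration and is not how PhotoShop builds a reduced palette. The sample color is hypothetical.

```python
def quantize_channel(value, bits):
    """Keep only the top `bits` bits of an 8-bit channel value (0-255),
    then rescale the result to the 0-255 range for display."""
    levels = 2 ** bits
    level = value >> (8 - bits)          # discard the low-order bits
    return level * 255 // (levels - 1)   # spread the levels across 0-255

def quantize_color(rgb, bits):
    """Apply the same reduction to the red, green, and blue channels."""
    return tuple(quantize_channel(channel, bits) for channel in rgb)

# How many distinct shades survive out of a full 256-step gradient?
for bits in (8, 6, 4, 2):
    shades = {quantize_channel(v, bits) for v in range(256)}
    print(f"{bits} bits per channel: {len(shades)} distinct shades")

# A hypothetical pale parchment tint, before and after heavy reduction.
parchment = (200, 180, 150)
print(parchment, "->", quantize_color(parchment, 2))
```

With enough levels the substitute color stays close to the original; once the levels run out, nearby tints are forced onto a few widely spaced values, which matches the abrupt change in appearance described above.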
Nobody's Perfekt. . .
It was also during this step that imperfections were revealed in a number of the map images, appearing as occasional vertical lines of misaligned pixels. We believe they are either scanner glitches or errors introduced during the file transfer process. However, because these errors detract only from the aesthetics of the images, but not their usability, we have not viewed them as a major problem.
Let me note here that our objective for these images was not to provide photoquality representations of these maps. Rather, our intent was to provide high-resolution reference images that would permit researchers full access to the details of each map without having to visit the State Archives and handle the original. Access and preservation, the two fundamental missions of the State Archives, would thus be achieved.
The last step in modifying the files was saving them in JPEG format, a type of image file that, unlike TIFF, is readable by web browsers. JPEG's other advantage is that it also offers a high level of compression, but in what is known as a "lossy" format. This means that the significant reduction in file size comes at the price of some minor loss of detail. It is for this reason that saving in a JPEG format must be the last step in your image-editing process. Further editing and saving of JPEG files results in cumulative loss of detail.
The Thrill of Victory, the Agony of the Crash
In the end, the 40 map image files that originally had been 20 to 40 megabytes in size were reduced to files of only 1 to 4 megabytes in size. I considered this a personal victory.
That is, until I took them on a web browser "test drive." The visible details proved extraordinary, but because the web browser was incapable of "zooming out," examination of the complete image required awkward navigation using the horizontal and vertical scroll bars. Worse, viewing three or four images in a row proved to crash even the most robust browsers on any platform: Netscape, MS Internet Explorer (MSIE), PCs, Macs, it didn't matter. It wasn't pretty.
It was at this point that I began to consider Plan B. . . Or, rather, desperately try to come up with a Plan B. The root of the problem seemed to be that, although web browsers are excellent tools for viewing text, low resolution photos, and even dancing, animated graphic files, they are simply not designed for displaying multi-megabyte files. It is akin to stacking furniture instead of using a ladder; sooner or later, something's going to break, and it might well be your neck.
Shareware to the Rescue
The proper software tool for the job is a dedicated image viewer. PhotoShop is the best, but at $800, it is not reasonable to require users to have it.
An online visit to www.shareware.com revealed a number of try-for-free image viewers, for both PCs and Macintoshes, that are perfectly capable of displaying large-size JPEG files, as well as providing zooming and printing capabilities. For Windows machines, Lview is a good choice; for Macintoshes, JPEGView and GraphicConverter work exceptionally well.
It would be easy enough to warn the human beings against attempting to use Netscape or MSIE to view these large images, but preventing those applications from taking the initiative would prove more challenging. By design, any file with a ".jpg" extension would automatically entice the browser to attempt to display it; for these large files, the risk was catastrophic failure.
It seemed prudent, then, to convert the files to yet another format, one that would encourage Netscape and MSIE to save the downloaded map files to disk, rather than display them. The ZIP format met that criterion. As many people know, the ZIP format is traditionally used to compress the size of text files in order to speed up download time. "Zipping up" the already-compressed JPEG files, however, did not yield any additional compression. But it would keep the browsers from crashing.
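The lack of additional compression is easy to reproduce. The sketch below uses Python's zlib module, which implements the DEFLATE method commonly used inside ZIP files. Both inputs are stand-ins invented for illustration: a repetitive block of text, and pseudo-random bytes that mimic data whose redundancy has already been squeezed out, as in a JPEG. No real map files are involved.

```python
import random
import zlib

def compression_ratio(data):
    """Compressed size divided by original size (smaller means better)."""
    return len(zlib.compress(data, 9)) / len(data)

# Stand-in for a text file: highly repetitive, so it compresses well.
text_like = b"Rio Grande, Brazos River, Colorado River, Trinity River\n" * 2000

# Stand-in for an already-compressed file: pseudo-random bytes with no
# patterns left to exploit. A fixed seed keeps the run repeatable.
rng = random.Random(0)
jpeg_like = bytes(rng.getrandbits(8) for _ in range(len(text_like)))

print(f"text-like data shrinks to {compression_ratio(text_like):.1%} of its size")
print(f"JPEG-like data shrinks to {compression_ratio(jpeg_like):.1%} of its size")
```

The second figure comes out at or slightly above 100 percent, because the compressor finds nothing to remove and still has to add a small amount of bookkeeping. In this situation the ZIP wrapper is doing a packaging job, not a compression job.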
"Say That in English, Please!"
Whew! All this technology. . . Scanning to TIFF files, reducing resolution, changing the color palette, compressing as a JPEG file, zipping it up. . . Who knew that the hardest part still lay ahead? I had to explain to our online visitors how to view these map images.
1. Click the icon to download the map to your disk.
2. If you have the proper utility, unzip the file.
3. If you don't have the proper utility, go to www.shareware.com and download WinZip (for PCs) or Stuffit Expander (for Macs); install as necessary, and unzip the file.
4. If you have an image viewer application, launch it and open the map image file.
5. If you don't have an image viewer application, return to www.shareware.com and download Lview (for PCs) or GraphicConverter or JPEGView (for Macs). Install as necessary, and view the file.
6. Most important: if you have difficulty with any of the above steps, please do NOT contact the Texas State Archives for technical assistance.
Now, even for a computer geek like me, this sounded awfully complicated, and I worried that all my efforts would end up wasted. Would people wade through the instructions only to decide it wasn't worth the hassle?
To my surprise, many online patrons have found the process to be worthwhile. I've heard from individuals who have downloaded all 40 maps. Others have written to say that they prefer the electronic images over the actual maps; they can print out sections, write on them, and examine them in their own homes.
I won't deny that other online visitors have had difficulties. I've even received venomous e-mails from some who were particularly frustrated. And despite my admonition in step number six, I do periodically get calls or e-mails requesting assistance. Most of those calls end with me saying, truthfully, "I'm sorry, I don't do Windows."
The process of getting these 40 maps online (just 6,960 to go!) has been far from perfect, its execution far from elegant. Even today, downloading a 3 or 4 megabyte file remains beyond the capabilities of many computer users. But for me, it's been a first step, a learning experience, and a humbling glimpse at the enormity of work yet to come. Because, after all, the public has great expectations.
Jay Velgos is communications coordinator in the Archives and Information Services Division of the Texas State Library and Archives Commission.