BY CHARLES BEARDEN
The first time I sat down to scan and work with an image (not all that long ago), I was thoroughly intimidated by the plethora of settings, options, and operations available in the scanning and image manipulation software. I had the vague notion that the JPEG format was preferable for photographs, but that was about the extent of my computer graphics savvy. I came to learn, however, that a modicum of knowledge about digital images - along with attention to a few crucial aspects of scanning and image manipulation - was enough to enable one to make a good beginning in the creation of graphics for the Web.
In this article I hope to impart that modicum of basic knowledge about images and their creation that will enable beginners to approach a scanner and graphics software with some confidence and, with a bit of thought and practice, do good basic image work for the Web. I will confine myself to the basics of scanning and image manipulation and specifically to those basics relevant to the Web. If you are already comfortable with histograms and tonal curves, layers and channels, threshold and halftones, then you will probably find this article rather rudimentary. You will probably also detect a few deliberate oversimplifications.
We begin with a basic discussion of the salient characteristics of digital images, including image file formats, color depth, file size, and resolution. My goal is to give you the tools you need to evaluate your original images and their places in the destination web pages and to choose the appropriate format, color-depth, and dimensions based on that evaluation.
A guided exploration of the interface to one piece of scanning software - to determine how to make the settings arising out of your evaluation of the images - is included in this text; you will also find information on a few simple image manipulations in Adobe Photoshop. This electronic article is illustrated with screenshots of the software used and contains links to additional example images on a website at http://riceinfo.rice.edu/Fondren/ETC/scanning/examples/
Digital images are organized in picture elements, or pixels. Each pixel corresponds to one of the thousands of tiny dots of colored light on your computer display. At normal viewing distances, humans perceive the pixels in aggregation as an image, rather than as individual dots. Each pixel contributes its color to the aggregation in order to create the shapes we see on the screen. It's a bit like seeing the forest instead of the trees. Image files store information about the color value of each pixel constituting the image. Graphics applications render images by interpreting the pixel color information in these files and setting the color values of the corresponding pixels in the display. Modern color monitors can display millions of different colors, but they do so by combining varying shades of only three colors: red, green, and blue. This color scheme, also known as RGB, differs from the natural primary colors of red, yellow, and blue, as well as from the cyan/magenta/yellow/black (or CMYK) scheme used in printing.
Image file formats fall into two broad categories, depending on how they store pixel color information: bitmapped formats and indexed formats. Bitmapped images, including TIFFs (Tagged Image File Format) and JPEGs (Joint Photographic Experts Group), are capable of representing three different ranges of color, or color depths: 24-bit, or true color, 8-bit, or grayscale, and 1-bit. In 24-bit color files, 8 bits are used to record red color information, 8 bits for green, and 8 bits for blue, for a total of 16,777,216 possible colors. 24-bit color is ideal for images with complex color schemes, but because each pixel uses 24 bits to store color information, such files are very large unless compressed. Eight-bit color, or 'grayscale,' uses 8 bits per pixel to record up to 256 different shades of gray. Grayscale is excellent for images with many gradations of light and dark, but where color per se isn't important, such as black & white photos. Because 8 bits are used to store color information for each pixel, grayscale images tend to be roughly 1/3 the size of true color images. In 1-bit color, each pixel can only be black or white with no shades of gray. 1-bit images are thus roughly 1/8 the size of grayscale images and 1/24 the size of true color images. 1-bit color images are good for black and white line art, such as some maps and cartoons, and for achieving certain effects (see illustration at left). Their use on the Web is somewhat limited, and indexed formats such as GIFs are generally used instead.
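The color-depth arithmetic above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are mine, and the size ratios apply to uncompressed pixel data, ignoring file headers.

```python
from fractions import Fraction

def color_count(bits_per_pixel):
    """Each additional bit doubles the number of values a pixel can hold."""
    return 2 ** bits_per_pixel

def relative_size(bits_per_pixel, reference_bits):
    """Ratio of uncompressed pixel data at one color depth to another."""
    return Fraction(bits_per_pixel, reference_bits)

if __name__ == "__main__":
    print(color_count(24))       # true color: 16777216
    print(color_count(8))        # grayscale: 256
    print(relative_size(8, 24))  # grayscale vs. true color: 1/3
    print(relative_size(1, 8))   # 1-bit vs. grayscale: 1/8
    print(relative_size(1, 24))  # 1-bit vs. true color: 1/24
```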
Indexed images, such as GIFs (Graphics Interchange Format) and PNGs (Portable Network Graphics), handle color information differently than bitmaps; they reduce the color scheme of the original to a limited palette generally containing 256 or fewer colors. Instead of storing color information directly, each pixel stores the ID number (at most 8 bits long) of its color in the palette index. The color depth of indexed images is thus effectively 256 colors, but it is not limited to shades of gray. Indexed images are further optimized for size with compression and, when they have few colors, by reducing the length of the index string for each color to less than 8 bits.
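To make the idea of a palette index concrete, here is a minimal Python sketch. It assumes the image already has 256 or fewer distinct colors (real software reduces the colors first), and the function names are hypothetical, not taken from any graphics package.

```python
def to_indexed(pixels):
    """Replace each RGB tuple with its position in a palette of unique colors."""
    palette = []   # list of distinct RGB tuples, in order of first appearance
    lookup = {}    # RGB tuple -> position in the palette
    indices = []   # one palette position per pixel
    for rgb in pixels:
        if rgb not in lookup:
            lookup[rgb] = len(palette)
            palette.append(rgb)
        indices.append(lookup[rgb])
    if len(palette) > 256:
        raise ValueError("more than 256 colors; reduce the palette first")
    return palette, indices

def index_bits(palette_size):
    """Smallest number of bits per pixel that can address every palette entry."""
    return max(1, (palette_size - 1).bit_length())

if __name__ == "__main__":
    red, white, black = (255, 0, 0), (255, 255, 255), (0, 0, 0)
    palette, indices = to_indexed([red, white, red, black])
    print(palette)                   # three distinct colors
    print(indices)                   # [0, 1, 0, 2]
    print(index_bits(len(palette)))  # 2 bits are enough for 3 colors
```

Each pixel now costs only a few bits instead of 24, which is why an image with a handful of colors can be stored so compactly in an indexed format.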
Because of their limited color palettes and relatively small size, indexed images are a good choice for images with few colors and high contrast. Indexed images also support certain special features, such as transparency, interlacing, and animation. In indexed images, you can designate certain pixels (usually those in the background color) to be transparent so the color of the web page behind them shows through. Interlacing will cause the whole image to appear initially as a big blur, but gradually resolve itself, instead of filling in from top to bottom like a window shade. Animation effects are achieved by storing color information for several image frames in one file. The frames are displayed in rapid succession by browsers or other image software.
INTERLACED VS. NON-INTERLACED GIFs
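For the curious, the order in which an interlaced GIF stores its rows can be sketched in Python. The four-pass scheme below follows the GIF89a format; the function name is mine, and how a browser fills in the rows it has not yet received is left out.

```python
def gif_interlace_order(height):
    """Return image row numbers in the order an interlaced GIF stores them."""
    # Four passes, each given as (first row, step between rows).
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order

if __name__ == "__main__":
    print(gif_interlace_order(8))  # [0, 4, 2, 6, 1, 3, 5, 7]
```

Because the first pass delivers only every eighth row, a browser can show a coarse version of the whole image after receiving roughly one eighth of the data.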
There are four particular image formats of importance for Web work: TIFF and JPEG (both bitmapped), and GIF and PNG (both indexed). TIFF images offer the highest quality, but they are by far the largest files and are not natively viewable in most web browsers. They are therefore unsuitable for use over the Internet, but their quality makes them the ideal archival format. When you save a TIFF, your image software may offer you compression and bit-order options. Even though the normal compression applied to TIFFs is lossless, I recommend that you store your archival images uncompressed, as compressed files are more susceptible to corruption. Bit-order options are typically platform-specific. If you are using capable image-processing software such as Photoshop or the GIMP (GNU Image Manipulation Program), platform-specific bit order won't matter.
The other important bitmapped format is JPEG. It is the format of choice on the Web for displaying images with many colors and subtle gradations of color and shade, such as photographs, most paintings, and manuscripts when details of appearance (rather than mere legibility) are important. JPEGs reduce the huge size of true color and grayscale TIFFs by applying 'lossy compression.' That is, they lose some color information, but they do so in ways that take advantage of limitations in humans' ability to perceive small differences in color. A JPEG version of a photograph may be impossible for most users to distinguish from its TIFF counterpart, yet be only 1/8 the size of the TIFF. Graphics applications typically offer a sliding scale of image quality when you save JPEGs: the lower on the scale, the smaller the file, but the more color information is lost. I find that a setting of '7' in Photoshop is a good compromise between file size and image quality for most web purposes.
RELATIVE QUALITY OF JPEG AND GIF
Among indexed graphics file formats for the Web, GIF is far and away the most popular. It is well suited for logos, line art, most cartoons, most maps, and other images with simple color palettes and high contrast. As noted above, it supports transparency, interlacing, and animation. It is, however, a proprietary format, trademarked by CompuServe, though it is free for use by creators and users of web pages.
PNG, the other important indexed format, was developed as a nonproprietary alternative to GIF and supports all features of the GIF format except animation. It is, however, not supported in pre-4.x Netscape or in pre-5.x Internet Explorer. If you wish to use PNG files on your website, I suggest you make both GIF and PNG versions of your indexed images, and use JavaScript or server-side includes to check the browser version and send PNGs only to newer browsers.
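As a rough illustration of the server-side check, here is a Python sketch that picks a format from the browser's User-Agent string. It assumes that Internet Explorer announces itself with "MSIE" followed by a version number and that Netscape sends "Mozilla/" followed by a version without the word "compatible"; the function name and the sample strings are illustrative only, and real User-Agent strings vary.

```python
import re

def pick_image_format(user_agent):
    """Return "png" for browsers assumed to support PNG, otherwise "gif"."""
    msie = re.search(r"MSIE (\d+)", user_agent)
    if msie:
        # Internet Explorer: PNG assumed to work from version 5 onward.
        return "png" if int(msie.group(1)) >= 5 else "gif"
    mozilla = re.match(r"Mozilla/(\d+)", user_agent)
    if mozilla and "compatible" not in user_agent:
        # Netscape: PNG assumed to work from version 4 onward.
        return "png" if int(mozilla.group(1)) >= 4 else "gif"
    return "gif"  # unknown browser: fall back to the widely supported format

if __name__ == "__main__":
    print(pick_image_format("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"))
    print(pick_image_format("Mozilla/3.01 (Win95; I)"))
```

Defaulting to GIF for anything unrecognized keeps the page usable even when the guess about the browser is wrong.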
Another set of factors to consider when scanning images for the Web is the resolution, scale, and dimensions of the image, both of the original and of the digital versions. When scanning, resolution is measured in dpi (dots per inch) or ppi (points or pixels per inch) and refers to the number of color samples taken by the scanner, along both the width and length of the object being scanned. Each color sample becomes a pixel in the image file. Thus, a 4" x 6" photo scanned at 150 dpi will yield a file 600 x 900 pixels, containing 540,000 pixels. (By the way, you can multiply the number of pixels in the image by the bit-depth to get a rough idea of how large the image file will be as an uncompressed TIFF.)
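The scanning arithmetic in this paragraph can be written out as a short Python sketch. The function names are mine, and the size estimate counts pixel data only, so a real TIFF will be slightly larger because of its header.

```python
def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a scan: one sample per dot along each edge."""
    return round(width_in * dpi), round(height_in * dpi)

def uncompressed_bytes(width_px, height_px, bit_depth):
    """Rough size of the pixel data: pixels times bits per pixel, in bytes."""
    return width_px * height_px * bit_depth // 8

if __name__ == "__main__":
    w, h = scan_pixels(4, 6, 150)
    print(w, h, w * h)                   # 600 900 540000
    print(uncompressed_bytes(w, h, 24))  # 1620000 bytes in true color
    print(uncompressed_bytes(w, h, 8))   # 540000 bytes in grayscale
```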
With respect to display on a computer monitor, however, resolution is measured in terms of the absolute dimensions of the display in pixels. There are three common display resolutions: 640 x 480 pixels, 800 x 600 pixels, and 1024 x 768 pixels. The bottom line is that, when preparing an image for the Web (or for other display-oriented applications), you must (1) choose a display resolution for which you are designing your pages, and (2) scale your images according to how much of the assumed screen you want them to occupy, as well as according to the file size in bytes. I recommend that you design either for 640 x 480 or 800 x 600 in order to make your pages more easily used by those with older or lower-end hardware. Since horizontal scrolling is generally considered much more troublesome than vertical scrolling, the width, not the height, of the image is usually the primary consideration.
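One way to turn this advice into numbers is sketched below in Python: pick the share of the assumed screen width the image should fill, then work out the scanning resolution that would produce that width directly. The function names and the example figures are assumptions for illustration; you may equally well scan at a higher resolution and reduce the image afterward.

```python
def target_width(display_width_px, fraction_of_screen):
    """Image width in pixels for a given share of the assumed display width."""
    return round(display_width_px * fraction_of_screen)

def scan_dpi_for_width(original_width_in, target_width_px):
    """Scanning resolution that yields the target width without resizing."""
    return target_width_px / original_width_in

if __name__ == "__main__":
    width = target_width(640, 0.5)       # half of a 640-pixel-wide screen
    print(width)                         # 320
    print(scan_dpi_for_width(4, width))  # 80.0 dpi for a 4-inch-wide original
```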
When evaluating an image for scanning, ask yourself some questions:
The answers to these questions will help you select the proper destination file format, color depth, and dimensions during scanning and manipulation of the image.
We will now illustrate the above points with the interface from the scanner I used. Most scanning software offers similar options to the ones illustrated below, perhaps under slightly different labels. Check your scanner documentation for more information.
Click on the preview for a larger image that will load in a separate window. Close (or move) the new window to continue with the article. Please be patient; these are large images.
If you invoke your scanning software from within your image manipulation program, the scanning software will 'hand off' the scanned image to the image manipulation program after the final scan. You can then work with the image and save it when you have finished. If you start the scanning software up by itself, it will prompt you for the destination location, filename, and file format for saving the scan. You can open the file later if you wish to modify it.
If you scan an image in a bitmapped color-depth and pass it on to an image manipulation program, and then save it, you will notice that the GIF option is not active:
In Photoshop, you convert an image to indexed color by using the Image => Mode => Indexed Color menu option.
If you revisit the Save menu option after doing so, you will notice that you can now save the image as a GIF:
To resize an image in Photoshop, select the Image => Image Size menu option:
You can see what the pixel dimensions are, and you can enter new values to change the image size. The (1) chain links indicate that when you modify one dimension, the other will automatically be modified to preserve the proportions of the image.
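The calculation behind the linked dimensions is simple proportion. The Python sketch below shows it; the function name is hypothetical, and this is not a description of how Photoshop itself is implemented.

```python
def resize_to_width(width, height, new_width):
    """New (width, height) that keeps the original proportions."""
    new_height = round(height * new_width / width)
    return new_width, new_height

if __name__ == "__main__":
    # A 600 x 900 scan reduced to 400 pixels wide.
    print(resize_to_width(600, 900, 400))  # (400, 600)
```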
PROBLEMS MANIPULATING INDEXED IMAGES
Adobe Photoshop, now in version 5.0, is widely considered to be the premier image manipulation tool for both the Macintosh and the PC. It is expensive but very full-featured, and it permits many advanced operations on images. PaintShop Pro is a shareware program for the PC that represents a less costly alternative to Photoshop. I've never tried it, but while it doesn't have all of Photoshop's features, it does have a reputation for being an excellent value for the price.
In the Unix world, there is now an application that offers many of Photoshop's features: the GIMP (GNU Image Manipulation Program). It is freely available, but not yet as well documented as Photoshop. That is changing, however. At home, I use the GIMP exclusively, running under Linux. XV is another classic shareware image tool for Unix. There is also a useful free suite of tools called ImageMagick, many of which have extensive command-line options, which are useful for scripted batch operations.
If you are doing image work, you will need a video card with a minimum of 4MB of video memory in order to see what the colors will really look like. If you work often with large images, then you should have at least 8MB of video memory. If you are using high-end graphics software like Photoshop or the GIMP, you should also have a minimum of 64MB of RAM.
When shopping for a scanner, the most important factors are optical resolution, dynamic range (also called optical density), and color fidelity. Look for a scanner that has at least 600 x 600-dpi optical resolution. Scanners often provide higher figures under the name 'interpolated resolution,' but the scanner doesn't actually sample the original at the interpolated rate in order to achieve the higher resolution. It samples at its highest optical resolution and then uses software to interpolate (i.e., make a best guess at) what the color values of the non-sampled pixels should be. Ignore interpolated resolution figures.
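To see why interpolated figures add no real detail, consider this Python sketch. It doubles the number of values in one row of grayscale samples by averaging neighbors. Simple averaging is my assumption for illustration; actual scanner software may use more elaborate methods, but the added values are estimates in every case.

```python
def interpolate_row(samples):
    """Insert one estimated value between each pair of measured samples."""
    if not samples:
        return []
    result = []
    for left, right in zip(samples, samples[1:]):
        result.append(left)
        result.append((left + right) // 2)  # a guess, not a measurement
    result.append(samples[-1])
    return result

if __name__ == "__main__":
    print(interpolate_row([0, 100, 200]))  # [0, 50, 100, 150, 200]
```

Every second value in the result was computed from its neighbors, so the row is longer but carries no information the scanner did not already have.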
Dynamic range refers to the range of shades in an image between pure white and pure black. It is measured on a scale of 0 (white) to 4 (black). A scanner's dynamic range is computed by subtracting its DMin value (at the white end of the scale, 0.1 or 0.2 for most scanners) from its DMax value (at the black end of the scale). Look for a dynamic range of at least 2.8 or a DMax value of 3.0 or better. These figures should be found in the technical specifications for the scanner.
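The dynamic range calculation and the rule of thumb above can be expressed as a brief Python sketch. The function names and the sample DMax and DMin values are illustrative, not taken from any particular scanner's specifications.

```python
def dynamic_range(dmax, dmin):
    """Dynamic range is DMax minus DMin (rounded to avoid float noise)."""
    return round(dmax - dmin, 2)

def meets_guideline(dmax, dmin):
    """True if the range is at least 2.8 or DMax is 3.0 or better."""
    return dynamic_range(dmax, dmin) >= 2.8 or dmax >= 3.0

if __name__ == "__main__":
    print(dynamic_range(3.2, 0.1))    # 3.1
    print(meets_guideline(3.2, 0.1))  # True
    print(meets_guideline(2.5, 0.2))  # False
```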
Assessing the color fidelity of a scanner is difficult. One way would be to scan color and grayscale bar charts and view the resulting images on a computer with high-quality, calibrated video hardware.
You may also wish to investigate the availability of accessories, such as transparency and slide adapters and automatic document feeders.
Now that you've learned the basic concepts associated with digital images, I think you can see that with a little thought and practice, you can make the most important decisions in scanning correctly. Computer graphics are potentially complex, but you can build on the fundamentals presented here and on the supplemental websites in a stepwise fashion to learn new techniques and manipulations. Learn new features and tools as you need them, one or two at a time. Most of all, don't be afraid to experiment - just remember to save an unmodified, high-quality TIFF of the original scan.
For more information on:
SCANNING
PHOTOSHOP
GRAPHICS
All scanning for the example images was done with an Epson Perfection 636U scanner attached to a Macintosh G3, using the Epson TWAIN scanning software.
Charles Bearden is electronic resources librarian at Rice University, Houston, Texas.