In the last post, I described the three major components of the project. This post describes my general plan for constructing the first component: The Media Archive.
1.a) Scan Photos
The first step in the whole project is to scan family photos into a digital format. My wife and I have a few prints from when we were growing up, but they are newer than most of those held by other members of the family. I have decided to begin with photo sets passed down from my grandparents. Thanks to my aunt Linda, who lent me her collection, I have been able to begin experimenting with scanners and software.
I'll save the technical details of the scanning tools, settings, and procedures for a future post (you can't wait, can you?). The general idea is to capture the images at a very high resolution and color depth with all of the "extras" provided by the scanner software (e.g., sharpening, dust & scratch removal) turned off.
The resulting digital image is saved in a file format that uses either no compression or lossless compression. Many popular graphical file formats (including JPEG) use lossy compression which sacrifices some of the information contained in the file in order to greatly reduce the file size. While most of the time the loss of information goes unnoticed (even with more than 90% of the information removed), deleting information runs counter to the entire purpose of the archive: to preserve as much information as possible.
All of this is done so that I obtain digital versions that are as close to the original print as possible.
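To make the lossless versus lossy distinction concrete, here is a minimal sketch in Python. It is not the scanning workflow itself: `zlib` stands in for any lossless codec, the `quantize` function is a deliberately crude, hypothetical stand-in for lossy compression, and the `pixels` bytes are made-up sample data rather than a real scan.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result is byte-for-byte identical."""
    compressed = zlib.compress(data, 9)
    return zlib.decompress(compressed)


def quantize(data: bytes, step: int = 16) -> bytes:
    """Crude stand-in for lossy compression (an assumption for illustration):
    round every byte down to a multiple of `step`, discarding fine detail."""
    return bytes((b // step) * step for b in data)


# Made-up "pixel" data: every possible 8-bit value, repeated a few times.
pixels = bytes(range(256)) * 4

restored = lossless_round_trip(pixels)
lossy = quantize(pixels)

print(restored == pixels)  # True: nothing was lost
print(lossy == pixels)     # False: detail was thrown away for good
```

The lossy version may look fine to the eye, but the original values cannot be recovered from it, which is exactly what an archive wants to avoid.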
1.b) Store Master Images
Because of all of the choices made in the scanning step, the resulting files are so large as to be completely impractical for direct use (especially for web-based purposes). Instead, the raw scans are kept unmodified as master versions. Only copies of the master files are "touched up" and scaled down to practical sizes. This way, all the information that was captured during the scan is always preserved in the master.
Besides, storage is cheap! Today, you can get hard drives at about $0.08 per gigabyte and it will only get cheaper. Even if each image took up a whole gigabyte, you could store 2,500 images for only $200.
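The arithmetic is easy to check. The small Python sketch below reproduces the figure above; the `copies` parameter is my own addition (an assumption, not part of the estimate) to show how the cost grows if every image is stored more than once.

```python
def storage_cost(image_count: int, gb_per_image: float,
                 dollars_per_gb: float, copies: int = 1) -> float:
    """Cost in dollars of storing `copies` full copies of every image."""
    return image_count * gb_per_image * dollars_per_gb * copies


# The figures from the text: 2,500 images, 1 GB each, $0.08 per gigabyte.
single_copy = storage_cost(2500, 1.0, 0.08)
print(f"${single_copy:.2f}")  # $200.00

# Hypothetical: the same archive kept in three places.
three_copies = storage_cost(2500, 1.0, 0.08, copies=3)
print(f"${three_copies:.2f}")  # $600.00
```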
Of course, it's not that simple. Hard drives have moving parts which eventually wear out. A hard drive lures you into a false sense of security until you have all of your important information stored on it with no backups, and then it fails catastrophically. Automatic data redundancies and regular backups are critical components of the Media Archive. A future post will explain how I plan to store
1.c) Index Originals
Occasionally, it may be necessary to retrieve the master file for an image, perhaps to create a new print or a blown-up version. Similarly, the original print might be required. An image indexing system is required for either of these situations.
Each image created from the archive should be linked back to the master image and the original print. The name of the file could be used to associate an image with its master, but file names are easily and often changed. Some image file formats contain fields for storing metadata which could be used to store the indexing information. However, with this method, some table would be required to look up the image file name of the master from the information contained in the metadata of an image.
Additionally, the original prints must be stored in a way that allows them to be retrieved through the index. An index number could be written in archival ink onto the back of each image as it is scanned. Better yet, index cards or table of contents pages could be used to identify the index for small groups of prints.
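As a rough sketch of what such a lookup table could look like, here is a small in-memory version in Python. Everything in it is hypothetical: the index numbers, file paths, and storage locations are invented for illustration, and a real index would need to be saved somewhere durable (a spreadsheet or a database, for example).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexEntry:
    index_id: str        # number stored in image metadata and on the print's index card
    master_file: str     # hypothetical path to the master scan
    print_location: str  # hypothetical note on where the original print is kept


class ArchiveIndex:
    """Maps an index number to the master file and the original print."""

    def __init__(self) -> None:
        self._entries: Dict[str, IndexEntry] = {}

    def add(self, entry: IndexEntry) -> None:
        # Refuse duplicates so one number never points at two different prints.
        if entry.index_id in self._entries:
            raise ValueError(f"index {entry.index_id} is already in use")
        self._entries[entry.index_id] = entry

    def lookup(self, index_id: str) -> Optional[IndexEntry]:
        """Return the entry for an index number, or None if it is unknown."""
        return self._entries.get(index_id)


index = ArchiveIndex()
index.add(IndexEntry("000123", "masters/000123.tif", "Box 2, envelope 7"))

found = index.lookup("000123")
print(found.master_file)     # masters/000123.tif
print(found.print_location)  # Box 2, envelope 7
```

Because the lookup key is the index number rather than the file name, a touched-up copy can be renamed freely as long as the number travels with it in the metadata.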
1.d) Touch up & Downscale
With the master copies and original prints indexed and safely stored, the next step is to create usable versions. Any filters to improve the image are applied at this step (e.g., dust & scratch removal, sharpening, further cropping). If extensive work is required, the product is first saved in the same format as the master and archived with it. This way the touch-ups do not need to be reapplied for any future versions made from that master.
There are many ways that the images can be scaled down to practical sizes. For example, black and white images can be converted to grayscale. The resolution can be reduced and the image stored in a file format that includes compression (even lossy compression if desired).
The downscaled images should then be made available in an accessible way. This might include digital photo frames, photo organization software, or online photo streams for sharing with the world.
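To get a feel for how much downscaling saves, here is a back-of-the-envelope calculation in Python. The numbers are assumptions chosen for illustration (a 4x6 inch print scanned at 1200 dpi in 16-bit-per-channel color, reduced to an 8-bit grayscale image 1600 pixels wide), not my actual scanner settings, and the sizes are uncompressed, before any file-format compression is applied.

```python
def uncompressed_bytes(width_px: int, height_px: int,
                       channels: int, bits_per_channel: int) -> int:
    """Raw size of an image with no compression at all."""
    return width_px * height_px * channels * bits_per_channel // 8


def scaled_dimensions(width_px: int, height_px: int, max_side_px: int):
    """Shrink so the longest side is at most max_side_px, keeping proportions."""
    longest = max(width_px, height_px)
    if longest <= max_side_px:
        return width_px, height_px
    return (max(1, round(width_px * max_side_px / longest)),
            max(1, round(height_px * max_side_px / longest)))


# Assumed master: 6 x 4 inches at 1200 dpi, 3 color channels, 16 bits each.
master_w, master_h = 6 * 1200, 4 * 1200
master_size = uncompressed_bytes(master_w, master_h, 3, 16)

# Assumed web copy: longest side 1600 pixels, 1 grayscale channel, 8 bits.
web_w, web_h = scaled_dimensions(master_w, master_h, 1600)
web_size = uncompressed_bytes(web_w, web_h, 1, 8)

print(master_size)   # 207360000 bytes, roughly 207 MB
print(web_w, web_h)  # 1600 1067
print(web_size)      # 1707200 bytes, roughly 1.7 MB
```

Under these assumptions the web copy is less than one percent of the size of the master, even before lossy compression is considered.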
The last step for a photo in the Media Archive is to be tagged with descriptive information. Generally, photo organization software and websites provide a slot for a text description of what is depicted.
However, most modern options provide a tag cloud feature. This allows photos to be assigned a set of descriptive phrases. This feature is used to identify individuals, places, or things in pictures and automatically associate that image with others tagged similarly.
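Behind the scenes, this kind of tagging amounts to a lookup from each tag to the set of photos that carry it. The Python sketch below is my own guess at the idea, not how any particular photo site implements it; the photo numbers and tags are invented.

```python
from collections import defaultdict
from typing import List


class TagIndex:
    """Associates photos with tags and finds photos that share a tag."""

    def __init__(self) -> None:
        self._photos_by_tag = defaultdict(set)

    @staticmethod
    def _normalize(tag: str) -> str:
        # Treat "Farm" and " farm " as the same tag.
        return tag.strip().lower()

    def tag(self, photo_id: str, *tags: str) -> None:
        for t in tags:
            self._photos_by_tag[self._normalize(t)].add(photo_id)

    def photos_with(self, tag: str) -> List[str]:
        """All photos carrying the given tag, in sorted order."""
        return sorted(self._photos_by_tag.get(self._normalize(tag), set()))

    def related(self, photo_id: str) -> List[str]:
        """Other photos that share at least one tag with photo_id."""
        related = set()
        for photos in self._photos_by_tag.values():
            if photo_id in photos:
                related |= photos
        related.discard(photo_id)
        return sorted(related)


tags = TagIndex()
tags.tag("000123", "Grandma", "Farm")
tags.tag("000124", "grandma", "Picnic")
tags.tag("000125", "Lake")

print(tags.photos_with("Grandma"))  # ['000123', '000124']
print(tags.related("000123"))       # ['000124']
print(tags.related("000125"))       # []
```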
I will not recognize the people and places depicted in all of the pictures. But by publishing them on the web, I can leverage the collective knowledge of the whole family in tagging media in the archive.
OK, I know that was a lot of information all at once, but it provides a general overview of the design of the Media Archive.