Digitization: It’s not easy, it’s not fast, and it ain’t cheap.

by on August 14, 2012

Everyone, at some point, gets the idea to digitize something. For most people, it’s old photos, but some people get the idea to scan documents or books. Their assumption is that scanning is fast, so the whole thing won’t be much of a problem.

When we start digitization projects, people are always shocked we don’t have things online within days. They will give us thousands of items and ask, “Why isn’t it up yet?”

What they don’t understand is that the scanning part is often the shortest part. There is a whole lifecycle to digitization that gets glossed over. People often don’t realize the level of post-processing needed for digitized items, or the level of organization needed for even a small number of files.

After that, there’s metadata, which is possibly the most important part, and it could very well take longer than the scanning itself. Bad metadata will destroy a digital project, making it practically useless online. Good metadata is an artful organization of a digital mess, and its quality is what lets the items live online and be functional.

After that, there’s a whole segment about putting the items in an organized system online, then the maintenance of the collection and possibly marketing it to different user groups. Systems cost money to purchase and maintain. Even free software has its costs in installation, setup, maintenance, and upgrades.

Then, there’s the question of putting things online as they are scanned, or all at once at the end. We will always tend toward putting stuff up all at once at the end because that’s the most time-efficient way of doing it. It could take 20 minutes per item to put everything in one by one (20,000 minutes for 1,000 items), or 1 minute per item to load it all at once (just 1,000 minutes). Given a limited set of resources, the batch load is the obvious answer.
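For anyone who wants to check the math, here is a quick sketch in Python. It assumes a collection of 1,000 items (the count that the 20,000-minute and 1,000-minute totals imply) and uses the rough per-item times above. The function and variable names are made up for illustration, and your own numbers will vary.

```python
# Rough comparison of one-by-one loading vs. batch loading.
# ITEMS and the per-item times are illustrative assumptions, not measurements.

def total_minutes(items: int, minutes_per_item: float) -> float:
    """Total loading time in minutes for a collection."""
    return items * minutes_per_item

ITEMS = 1_000  # assumed collection size

one_by_one = total_minutes(ITEMS, 20)  # 20 minutes per item, loaded individually
batch = total_minutes(ITEMS, 1)        # 1 minute per item, loaded all at once

saved_hours = (one_by_one - batch) / 60

print(f"One by one: {one_by_one:,.0f} minutes")
print(f"Batch load: {batch:,.0f} minutes")
print(f"Time saved: about {saved_hours:.0f} hours")
```

Under those assumptions, the batch load saves roughly 317 hours of staff time, which is about eight full-time work weeks.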

The takeaway is that digitization is not just about scanning. Scanning is the easy part to see, so it’s the part that people think is the major part. Scanning, however, is just the tip of the iceberg.



