The Dangers of Automation in Digital Projects

October 11, 2012

With digital projects, it’s easy to assume that everything can be automated. You’re dealing with data and XML, with computers and scanners, so the temptation is to write scripts that process everything in batches. To a certain extent, this works. For image processing, for example, batches are great.

The problem comes with metadata. We have a large collection of digitized books that has been moved from system to system automatically for years. Mostly, that’s because we put material in different places at different times, and then decided to bring the collection together a little at a time.

The end result of all this automated data manipulation and fudging to make things fit? As far as the metadata is concerned, the collection is a Frankenstein’s monster: the same data in different fields, multiple fields with duplicate data, and missing data. This causes a huge problem for metadata harvesters.
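To make those three problems concrete, here is a minimal sketch of the kind of check a script could run over one item record before a person ever looks at it. Everything specific in it is an assumption for illustration: the record shape (a dict of field name to list of values), the field names, and the choice of required fields are hypothetical, not our actual schema.

```python
from collections import defaultdict

# Hypothetical required fields, chosen only for illustration.
REQUIRED_FIELDS = ("dc.title", "dc.date.issued")


def audit_record(record):
    """Return a list of human-readable problems found in one item record.

    `record` maps a field name to a list of string values.
    """
    problems = []

    # Missing data: a required field is absent or holds only blank values.
    for field in REQUIRED_FIELDS:
        values = [v for v in record.get(field, []) if v.strip()]
        if not values:
            problems.append(f"missing {field}")

    # Index every normalized value by the fields it appears in.
    # Normalizing means collapsing whitespace and ignoring case.
    fields_by_value = defaultdict(list)
    for field, values in record.items():
        for value in values:
            key = " ".join(value.split()).lower()
            if key:
                fields_by_value[key].append(field)

    for value, fields in fields_by_value.items():
        distinct = sorted(set(fields))
        # Same data in different fields.
        if len(distinct) > 1:
            problems.append(f"same value in {', '.join(distinct)}: {value!r}")
        # The same value repeated inside a single field.
        for field in distinct:
            if fields.count(field) > 1:
                problems.append(f"duplicate value in {field}: {value!r}")
    return problems


if __name__ == "__main__":
    example = {
        "dc.title": ["Moby Dick"],
        "dc.description": ["moby  dick"],
        "dc.subject": ["Whaling", "whaling"],
    }
    for problem in audit_record(example):
        print(problem)
```

A report like this only flags candidates. Deciding which field the value really belongs in is still a job for a person, which is the whole point of this post.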

Why are metadata harvesters so important? We’ve realized that 88% of the people who visit our collections come from external systems that harvest our metadata. The top one is Google, and the next is our federated search. The other 12% use the native DSpace interface, and most of that 12% is us going in to add and work on collections. So most of our users rely on harvested metadata to find our material. If our metadata isn’t harvestable, is difficult to harvest, or produces results that confuse the patron, then the central means of discovery for our collections is useless.

We have two student assistants doing nothing but going through the records and making the data homogeneous. We can do this now because the collection is still a manageable size.

So be protective of your item-level metadata. Make sure things are consistent and follow the guidelines for OAI-PMH metadata harvesting.
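One rough way to see your collection the way a harvester does is to look at an OAI-PMH `ListRecords` response in the `oai_dc` format and count how many records actually use each Dublin Core element. The sketch below does that with Python's standard library. The namespaces come from the OAI-PMH and Dublin Core specifications, but the sample response is hand-written and trimmed for illustration (the `oai:example.org` identifiers are made up), and the function name `element_usage` is my own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DC_PREFIX = "{" + NS["dc"] + "}"

# Hand-written, trimmed sample response; not output from a real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>First book</dc:title>
          <dc:date>1901</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Second book</dc:title>
          <dc:description>1902</dc:description>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def element_usage(xml_text):
    """Count how many records in a ListRecords response use each dc element."""
    root = ET.fromstring(xml_text)
    records = root.findall(".//oai:record", NS)
    usage = Counter()
    for record in records:
        names = set()
        for elem in record.iterfind(".//oai_dc:dc/*", NS):
            # Only count Dublin Core elements that carry non-blank text.
            if elem.tag.startswith(DC_PREFIX) and (elem.text or "").strip():
                names.add(elem.tag[len(DC_PREFIX):])
        usage.update(names)
    return len(records), usage


if __name__ == "__main__":
    total, usage = element_usage(SAMPLE)
    # Elements used by only some records are a hint that the same kind of
    # data may be living in different fields.
    for name, count in sorted(usage.items()):
        if count < total:
            print(f"{name}: used in {count} of {total} records")
```

In the sample, the second record puts its year in `description` instead of `date`, which is exactly the kind of drift that batch migrations leave behind. An element that appears in only some records isn't always an error, so treat the counts as a list of places to look.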
