Digital humanities: humanities research in the digital age

Session 3 Metadata and search

This session is written by Anne Alexander from The University of Cambridge.

Transcript: Video 3

The key idea in this session is that data is constructed and not found: data is always a representation of the world, it isn't the world itself. We'll also look at the additional set of complications in how data represents the world that arises when humans create data for other humans to read using machines.

How, when, and by whom data is constructed is sometimes relatively easy to see. But at other times, this information needs to be uncovered or inferred. Traditional humanistic approaches developed by scholars working with material texts can be adapted to working with digital data.

In this session, you'll be applying a medieval interpretive practice, the accessus ad auctores, to digital texts. The accessus was a section inserted at the beginning of manuscripts in which copyists and authors answered the following questions:

Who is the author? What is the subject matter of the text? Why was the text written? How was the text composed? When was the text written or published? Where was the text written or published? And by which means was the text written or published?

Some of these questions are similar to those posed and answered by metadata, which is to say data about data. As we will see, metadata, like the data it describes, also involves multiple interpretive decisions and plays a major role in the visibility or invisibility of texts, images, and sounds. This is especially true when accessing data over the internet.

Metadata is enormously important to search engines, for example. These are very powerful tools. But like any other method for finding and sorting information, researchers should not simply accept their results as a given. One of the outcomes of this session should be a deeper critical awareness of how search engines work and what kinds of questions to ask of them.

End transcript: Video 3

After studying this session, you should have:

understood the difference between data and metadata
familiarised yourself with the basic elements of a search engine
reflected on some of the ways in which search technologies shape scholarship and research.

In the last session you learned about the processes that transform material texts, and recordings of images and sounds on physical media, into digital data. Johanna Drucker observes that there is a tension between this radical change from material to digital and the idea that ‘data’ is a set of observations, an empirical recording of the world as it is. She proposes reconceiving ‘data’ (something which is given) as ‘capta’ (something which is taken).

From this distinction, a world of differences arises. Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact.

(Drucker, 2011)

It is crucial to remember the constructed nature of ‘data’ produced by digitisation processes as we grapple with the next challenge: how do we find the needles in this digital haystack of information? Now that everything has been flattened out into a machine-readable series of 1s and 0s, how can humans make sense of it?

There are three circles, featuring coded text.

Figure 3

This is where metadata comes in. Metadata is data about data. It is the information attached to data to explain who made it, what format it has been captured in, where and when it was made, and a host of other things which the authors or curators of the data want to share. Metadata predates digital data: for example, a library catalogue records metadata about the material items in the collection. However, metadata becomes even more important for digital data than for its analogue counterpart. Have you ever lost a crucial file on your computer because you called it DRAFT_1.doc or something equally forgettable? At least with a book on your bookshelf your memory may be jogged by its colour or even the wear and tear along the spine.
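To make the distinction concrete, here is a minimal sketch in Python that keeps a piece of data and its metadata side by side. Everything in it is an assumption made for illustration: the `DigitalObject` class, the `describe` function, the field names and the values are invented, and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A hypothetical container pairing some data with data about that data."""
    data: bytes                                   # the content itself
    metadata: dict = field(default_factory=dict)  # information about the content


# Invented example values: a short text file and a description of it.
draft = DigitalObject(
    data="It was a dark and stormy night.".encode("utf-8"),
    metadata={
        "title": "Chapter one, first draft",
        "creator": "A. Researcher",
        "format": "text/plain; charset=utf-8",
        "created": "2020-01-15",
    },
)


def describe(obj):
    """Summarise an object using only its metadata, never its content."""
    return "; ".join(f"{key}: {value}" for key, value in obj.metadata.items())


print(describe(draft))
```

Notice that `describe` never reads the content at all. Anything that finds or sorts this file by its description depends entirely on what its creator chose to record in the metadata.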

Like data, metadata gives off an appearance of objectivity, of simply being an empirical recording of facts about the data. But it is just as much the result of subjective choices by humans, shaped by their spectrum of motives, beliefs and opinions, and by their training in different professions and disciplines. Even within the same profession or discipline, there can be wide variation in metadata standards, meaning that even versions of the same original text could end up being described in different ways.
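The effect of such variation can be sketched in a few lines of Python. The two records below are hypothetical catalogue entries for the same novel, written under different naming conventions; the functions `exact_search` and `loose_search` are invented for this illustration and are not how any real catalogue or search engine works.

```python
import string

# Two hypothetical catalogue records describing the same novel.
records = [
    {"creator": "Shelley, Mary Wollstonecraft",
     "title": "Frankenstein; or, The Modern Prometheus"},
    {"creator": "Mary Shelley",
     "title": "Frankenstein"},
]


def exact_search(records, name):
    """Return records whose creator field matches the query character for character."""
    return [r for r in records if r["creator"] == name]


def name_words(name):
    """Reduce a name to a set of lowercase words with punctuation removed."""
    cleaned = name.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())


def loose_search(records, name):
    """Return records whose creator field contains every word of the query."""
    wanted = name_words(name)
    return [r for r in records if wanted <= name_words(r["creator"])]


print(len(exact_search(records, "Mary Shelley")))  # finds one record
print(len(loose_search(records, "Mary Shelley")))  # finds both records
```

The looser search finds both records, but only because someone decided that word order and punctuation should not matter. That decision is itself an interpretive choice, and a different one would make different texts visible.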
