4.14 Technologies and explicit knowledge continued
Metadata is descriptive data about data. The term has also come to refer to a way of tagging documents (on the Web or in any other repository) with structured, descriptive information. For example, to describe a unit in B823, we would expect to have concepts such as title and author, but perhaps also prerequisite or core concepts. Translated into a metadata scheme, this might appear as follows (typically metadata fields use <angle brackets> to delimit each metadata tag).
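As a minimal sketch of such a scheme, the Python fragment below holds one unit's metadata as a list of tag and value pairs and prints it in angle-bracket notation. The tag names follow those mentioned in this section. The title and author values are placeholders, the other values are illustrative, and the exact layout of each line is an assumption rather than a fixed standard.

```python
# Hypothetical metadata record for a single course unit.
# Tag names follow the text; the title and author are placeholders
# and the remaining values are illustrative only.
unit_metadata = [
    ("TITLE", "Example unit title"),
    ("AUTHOR", "Example Author"),
    ("COURSE", "B823"),
    ("CORE-CONCEPT", "interpretation"),
    ("BUILDS-ON", "B823-Unit-3"),
    ("PREREQUISITE-FOR", "B823-COMPLETION"),
]


def render_tags(fields):
    """Render (tag, value) pairs in angle-bracket notation, one per line."""
    return "\n".join(f"<{tag}> {value}" for tag, value in fields)


print(render_tags(unit_metadata))
```

Keeping the record as an ordered list of pairs, rather than a dictionary, allows a tag such as CORE-CONCEPT to be repeated if a unit has several core concepts.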
Note that some of the metadata tags simply describe the content of the unit, while the last two actually describe particular kinds of relationships to other units. Unit 7 is not a PREREQUISITE-FOR B823-Unit-3; rather, it BUILDS-ON it. This is specialist knowledge supplied by the unit authors: it may not be stated explicitly in the unit's text. Metadata schemes are designed to be machine interpretable, like ontologies, but also to be accessible to people without extensive training. You can see that the scheme above looks more like a series of structured keywords. Often, the average member of staff would not even have to see this cryptic notation, since it would be entered and displayed using an online form.
Information such as this enables much more powerful searching than is otherwise possible, once a search engine has been given the description of the particular metadata scheme (that it uses ‘tags’ for title, course, prerequisites, etc.). The search engine can then provide a query form with fields for these tags, and users can search specifically for course units which, say, have ‘interpretation’ as a CORE-CONCEPT, or for all units that BUILD-ON B823-Unit-3 and are a PREREQUISITE-FOR B823-COMPLETION (that is, units carrying the BUILDS-ON and PREREQUISITE-FOR tags with those values). Such queries are impossible without the equivalent of structured metadata, yet much of the Web still lacks it: we simply have to search using keyword combinations, and then manually filter out the many irrelevant documents returned by the search engine. By thinking carefully about the metadata scheme's design (that is, by designing an ontology of the domain), we can transform a mass of documents into ‘knowledge-enriched’ materials whose content and interrelationships can be accessed without having to read every document.
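The kind of fielded query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalogue structure, the `search` function, the unit called B823-Unit-X and the ‘representation’ concept are all hypothetical, and a real search engine would index the tags rather than scan every record.

```python
# Hypothetical catalogue: each unit has an id and a mapping from tag to values.
# B823-Unit-X and the 'representation' concept are invented for illustration.
units = [
    {"id": "B823-Unit-3",
     "tags": {"CORE-CONCEPT": ["representation"],
              "PREREQUISITE-FOR": ["B823-COMPLETION"]}},
    {"id": "B823-Unit-7",
     "tags": {"CORE-CONCEPT": ["interpretation"],
              "BUILDS-ON": ["B823-Unit-3"],
              "PREREQUISITE-FOR": ["B823-COMPLETION"]}},
    {"id": "B823-Unit-X",
     "tags": {"CORE-CONCEPT": ["interpretation"],
              "BUILDS-ON": ["B823-Unit-3"]}},
]


def search(catalogue, criteria):
    """Return ids of units whose tags contain every (tag, value) in criteria."""
    return [
        unit["id"]
        for unit in catalogue
        if all(value in unit["tags"].get(tag, [])
               for tag, value in criteria.items())
    ]


# Units with 'interpretation' as a core concept.
print(search(units, {"CORE-CONCEPT": "interpretation"}))
# → ['B823-Unit-7', 'B823-Unit-X']

# Units that build on Unit 3 and are a prerequisite for completion.
print(search(units, {"BUILDS-ON": "B823-Unit-3",
                     "PREREQUISITE-FOR": "B823-COMPLETION"}))
# → ['B823-Unit-7']
```

A plain keyword search for ‘B823-Unit-3’ would return all three records, because it cannot tell whether the term names the unit itself or one of its relationships. The tag is what makes the second query precise.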
The terms which one chooses to encode in a metadata scheme are typically derived from an analysis of the field and of the kinds of search queries that one anticipates supporting. They may be derived from an organisational or community taxonomy (hierarchy) of terms, or a library indexing scheme. One can, however, go one step further and formally model the relationships between terms, so that a computer can reason about them in more sophisticated ways. If we declare that an <AUTHOR> IS-A <SCHOLAR> who PUBLISHES <ARTICLES> and IS-AFFILIATED-TO an <INSTITUTION>, then we are creating a richer model, which is called an ontology.
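To give a feel for what ‘reasoning’ over such a model might involve, the sketch below stores the statements as subject, relation and object triples and lets a term inherit facts through IS-A links. It assumes one reading of the sentence above, in which PUBLISHES and IS-AFFILIATED-TO are properties of SCHOLAR that AUTHOR inherits. The function names are hypothetical, and real ontology languages offer far richer inference than this.

```python
# The ontology as (subject, relation, object) triples.
# Assumption: PUBLISHES and IS-AFFILIATED-TO are attached to SCHOLAR,
# so AUTHOR acquires them only by inheritance through IS-A.
triples = {
    ("AUTHOR", "IS-A", "SCHOLAR"),
    ("SCHOLAR", "PUBLISHES", "ARTICLES"),
    ("SCHOLAR", "IS-AFFILIATED-TO", "INSTITUTION"),
}


def ancestors(term, facts):
    """Return every class reachable from term by following IS-A links."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in facts:
            if subject == current and relation == "IS-A" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def holds(subject, relation, obj, facts):
    """True if the fact is stated directly or inherited via IS-A."""
    classes = [subject] + ancestors(subject, facts)
    return any((cls, relation, obj) in facts for cls in classes)


# Never stated directly, but inferred because an AUTHOR IS-A SCHOLAR.
print(holds("AUTHOR", "PUBLISHES", "ARTICLES", triples))       # → True
print(holds("INSTITUTION", "PUBLISHES", "ARTICLES", triples))  # → False
```

A flat list of tags could not answer the first query, since nothing in it says that an author publishes articles. The answer comes from the modelled relationship between the terms.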