Many companies realise that information is a key asset, but the amount of information commonly generated and stored can make it very difficult to get a grip on it all. So how can you manage this knowledge, enabling people to cut through to the information that they need and do so quickly?
Knowledge Maps vs Search Engines
There are two main types of solutions to the problem of access to information:
- indexes or maps
- search engines
Search engines like Google do away with the need for indexes or maps in many situations, but maps still have their place.
The most basic form of knowledge map is a listing of resources, both people and documents, on paper and/or delivered as web pages. Such 'portal' sites proliferate on the web, compiled by enthusiasts, companies and search engines to point to key resources in one or more fields. Although technically simple to implement, such a resource requires a dedicated person or team to keep the categories and the descriptions of people and documents up to date.
Some companies with dedicated knowledge managers, librarians or information services may distribute a standard set of intranet and internet 'bookmarks', which makes available to staff a coherently organised set of information sites directly from their desktop. Of course, this can also happen on an informal level as colleagues swap new discoveries.
We can imagine the advantages of having a dynamically updated information system which would not need to wait for the following year’s release before it could be modified. However, this needs to be weighed against the advantages of paper. Paper is portable, it never crashes, it is easily annotated, and so forth.
A better strategy is a user-centred design approach, which typically consults representative user groups to uncover their most common information needs (how do they think about their world?). Users may be asked to build an actual map of their world, or to evaluate the current search engine to see how well it performs.
However, it is also known that users do not always know what they need, and moreover, new tools can change the way people behave by presenting opportunities they did not imagine. Think about how we rapidly retrieve known information sources on the internet with search engines such as Google. It is often quicker to type a few keywords into a search engine toolbar than to go to the trouble of bookmarking the page manually, or of retyping the full address if it is known.
The answer to this dilemma lies in planning time to deploy a series of prototypes and evaluate the emergent patterns of usage. These patterns cannot be foreseen; they are a function of the specific user group working under the demands of its unique context.
Human vs machine generated metadata
Metadata is data that is used to describe other data. Metadata about the contents of an information resource, or the way in which people behave, can come from people or computers. The contrast is between (human) declared structure or (machine) inferred structure. To make this clearer, consider some examples:
- Portals (topic-centred websites with categorised, validated links to relevant information). The categories can be either predefined by the portal’s designers, or automatically clustered by analysing the content of linked information.
- E-commerce website customer profiles. Customers can either select their shopping interests from a predefined list of topics, or the site can try to analyse their interests by tracking their purchases, and inferring what other kinds of products they might buy.
- Document/news classification. Users can either be asked to assign keywords that provide a machine-readable summary of the document or news item (such as terms selected from a controlled vocabulary), or the system can try to analyse the text and build an abstraction of its 'meaning'.
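The contrast in the last example can be sketched in a few lines of code. This is a deliberately naive illustration, not any real product's algorithm: the 'machine' side here simply picks the most frequent non-stopword terms, standing in for the far more sophisticated text analysis the article alludes to, while the 'human' side is a hand-picked list from a controlled vocabulary. The stopword list and sample text are invented for the example.

```python
# Human-declared vs machine-inferred document keywords (toy sketch).
from collections import Counter
import re

# A tiny stand-in for a real stopword list.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on", "that"}

def inferred_keywords(text, n=3):
    """Naive 'machine' metadata: the n most frequent non-stopword terms."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

# Human-declared metadata: terms chosen from a controlled vocabulary.
declared = ["knowledge management", "metadata"]

text = ("Metadata is data about data. Good metadata helps staff find "
        "documents; poor metadata hides documents from staff.")
print(inferred_keywords(text))  # 'metadata' tops the list by frequency
```

Even this toy version shows the trade-off: the inferred keywords need no human effort and track the text as it changes, but they capture surface frequency, not the document's purpose, which is exactly what a trained indexer supplies.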
The advantage of asking humans to classify and abstract is that, on a case-by-case basis, they may do a better job than a machine. However, someone has to define the categories, and staff have biases, an incomplete awareness of who may want to find the document, and a notoriously poor record in using complex categorisations. Metadata from trained librarians or information scientists (typically employed only in large organisations) is likely to be of high quality in terms of consistency and coverage. Metadata from other staff may still be useful, but varies in quality.
The advantage of having machines infer the meaning of texts or images is that they are not so vulnerable to these human traits, and can more easily keep up with the changing information space as new material is published. Tools concerned with text mining, information extraction, thesaurus generation/maintenance and ICT network usage analysis will continue to grow in sophistication.
Machine-generated analyses of raw texts, images, and user activity patterns, therefore, may be a promising way to support the construction of meta-knowledge that can answer questions such as, 'what do we know?', 'what’s out there?', and 'what do people do?', since they do not require people to modify (and perhaps change) their behaviour by explicitly categorising their work products or processes.
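A minimal sketch of the 'what do people do?' idea: patterns of use can be read straight from an activity log, with no one asked to declare their interests or categorise anything. The log format and names below are invented for illustration.

```python
# Inferring usage patterns from a raw access log (toy sketch).
from collections import Counter

# Hypothetical log: (user, document accessed).
access_log = [
    ("alice", "sales-forecast.xls"),
    ("bob", "sales-forecast.xls"),
    ("alice", "competitor-news.doc"),
    ("carol", "sales-forecast.xls"),
    ("bob", "competitor-news.doc"),
]

# Which resources matter emerges from behaviour, not declared interests.
popularity = Counter(doc for _, doc in access_log)
print(popularity.most_common(2))

# The same log also answers 'who are the active users?'
by_user = Counter(user for user, _ in access_log)
```

Because the analysis is a side effect of normal work, it stays current as behaviour changes, which is precisely the advantage the paragraph above claims for machine-generated meta-knowledge.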
However, machines are, of course, restricted in their ability to make sense of artefacts and processes, having an extremely small porthole onto the rich human world. If a machine can do 75% of the summarising, reference or classification work that a human can, this may or may not be adequate. It may be sufficient for classifying news stories about competitor products, but not for classifying intelligence about terrorist activities! This is the dilemma facing search engines – how to create the remaining 25%.
About this article
This article is based on extracts taken from the Open University Business School course Managing Knowledge.