4.3 Technologies and meta-knowledge continued
4.3.1 Mapping what we know
Knowledge maps are often one of the first knowledge management representations to emerge, in an effort to add value over simple corporate intranet search, which returns lists of ‘hits’ undifferentiated beyond a ranking by keyword match. Knowledge maps, like other forms of cartography, should communicate a ‘big picture’ by overlaying meaningful structure on to raw resources.
Box 4.3 Information cartography
A company called Dynamic Diagrams has many examples of techniques to manually map the structure of a website (or, by extension, knowledge resources in an organisation's intranet), as well as novel techniques for mapping documents or sites based on automatic analysis of their content.
Top down or bottom up?
Activity in this field sometimes comes under the banner of organisational knowledge taxonomies. Such taxonomies are typically defined by a specialist group in what we might call a ‘top-down’ manner: defining a set of categories which must then be used in a systematic manner to classify material. However, a cautionary note is sounded by Davenport for those who believe they can unilaterally define a useful ‘master taxonomy’ for all staff:
It is tempting when managing knowledge to create a hierarchical model or architecture for knowledge, similar to the Encyclopaedia Britannica's Propaedia, that would govern the collection and categorisation of knowledge. But most organisations are better off letting the knowledge market work, and simply providing and mapping the knowledge that its consumers seem to want. The dispersion of knowledge as described in a map may be illogical, but is still more helpful to a user than a hypothetical knowledge model that is best understood by its creators, and rarely fully implemented. Mapping organisational knowledge is the single activity most likely to yield better access.
Knowledge managers can learn from the experience of data managers, whose complex models of how data would be structured in the future were seldom realised. Firms rarely created maps of the data, so they never had any guides to where the information was in the present.
In contrast, a better strategy is a user-centred design approach, which would typically consult representative user groups to uncover their most common information needs (how do they think about their world?), possibly ask users to build an actual map of their world view using a card sorting exercise, and normally evaluate the current search engine to see how well it performs. However, it is also well known in the field of user-centred software requirements analysis that end-users do not always know what they want (echoing Polanyi, 1966: ‘we can know more than we can tell’), and, moreover, new tools can change the way people behave by presenting opportunities they did not imagine. Think about how we rapidly recover known information sources on the internet with search engines – it is often quicker to type a few keywords into a search engine toolbar than to take the trouble of manually bookmarking the page, or retyping the full address if known. The answer to this dilemma lies in planning time for deploying a series of prototypes and evaluating the emergent patterns of usage – patterns which neither users nor designers could foresee in advance, but which are a function of the specific user group working under the demands of their unique contexts.
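The card sorting exercise mentioned above is usually analysed by counting how often participants place the same pair of topic cards in the same pile. As a minimal sketch (the topic cards and sort results here are invented for illustration), frequently co-occurring pairs suggest categories that users find natural:

```python
from collections import Counter
from itertools import combinations

# Hypothetical card-sorting results: each participant groups topic
# cards into whatever piles make sense to them.
sorts = [
    [{"pensions", "payroll"}, {"travel", "expenses"}],
    [{"pensions", "payroll", "expenses"}, {"travel"}],
]

def co_occurrence(sorts):
    """Count how often each pair of cards lands in the same pile;
    high counts hint at categories users find natural."""
    pairs = Counter()
    for participant in sorts:
        for pile in participant:
            for a, b in combinations(sorted(pile), 2):
                pairs[(a, b)] += 1
    return pairs

# 'pensions' and 'payroll' were grouped together by both participants.
print(co_occurrence(sorts)[("payroll", "pensions")])  # → 2
```

A real study would aggregate many more participants and feed the matrix into a clustering step, but even this raw count makes the users' implicit categories visible to the designer.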
Human- versus machine-generated metadata?
Metadata about the contents of an information resource, or the way in which people behave, can come from people or computers. The contrast is between (human) declared structure and (machine) inferred structure. To make this clearer, consider some examples:
Portals (topic-centred websites with categorised, validated links to relevant information). The categories can be either predefined by the portal's designers, or automatically clustered by analysing the content of linked information.
E-commerce website customer profiles. Customers can either select their shopping interests from a predefined list of topics, or the site can try to analyse their interests by tracking their purchases, and inferring what other kinds of products they might buy.
Document/news classification. Users can either be asked to assign keywords that provide a machine-readable summary of the document or news item (such as terms selected from a controlled vocabulary), or the system can try to analyse the text and build an abstraction of its ‘meaning’.
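The contrast in the customer-profile example above can be sketched in a few lines. In this illustration (the catalogue and product IDs are invented), declared structure is simply whatever the customer ticks, while inferred structure is derived by counting categories in the purchase history:

```python
from collections import Counter

# Hypothetical product catalogue mapping product IDs to categories.
CATALOGUE = {
    "p1": "gardening", "p2": "gardening",
    "p3": "cookery",   "p4": "travel",
}

def declared_interests(selected_topics):
    """Declared structure: the customer picks topics from a predefined list."""
    return set(selected_topics)

def inferred_interests(purchase_history, top_n=2):
    """Inferred structure: rank categories by how often they appear
    in the customer's purchases."""
    counts = Counter(CATALOGUE[p] for p in purchase_history if p in CATALOGUE)
    return [category for category, _ in counts.most_common(top_n)]

print(inferred_interests(["p1", "p2", "p3"]))  # → ['gardening', 'cookery']
```

The declared profile is cheap and explicit but goes stale; the inferred profile tracks actual behaviour but can only see what the catalogue's category scheme allows it to see.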
The advantage of asking humans to classify and abstract is that, on a case-by-case basis, they may do a better job than a machine. However, someone has to define the categories, and staff have biases, an incomplete awareness of who may want to find the document, and are notoriously poor at using complex taxonomies. Metadata from trained librarians or information scientists (typically employed only in large organisations) is likely to be high quality in terms of consistency and coverage. Metadata from other staff may still be useful, but vary in quality.
The advantage of having machines infer the ‘semantics’ (meaning) of texts or images is that they are not so vulnerable to these human traits, and can more easily keep up with the changing information space as new material is published. Tools concerned with text mining, information extraction, thesaurus generation/maintenance and ICT network usage analysis will continue to grow in sophistication. Machine-generated analyses of raw texts, images and user activity patterns, therefore, may be a promising way to support the construction of meta-knowledge that can answer questions such as, ‘what do we know?’, ‘what's out there?’, and ‘what do people do?’, since they do not require people to modify (and perhaps change) their behaviour by explicitly categorising their work products or processes. However, machines are, of course, restricted in their ability to make sense of artefacts and processes, having an extremely small porthole on to the rich human world. If a machine can do 75 per cent of the summarising or classification work that a human can, this may or may not be adequate (compare classifying news stories about competitor products with classifying intelligence about terrorist activities).
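To make the idea of machine-inferred ‘semantics’ concrete, here is a minimal sketch of the kind of analysis a text-mining tool performs: represent each document as a bag of word counts and assign it the label of the most similar exemplar. The exemplar texts and labels below are invented; real tools are far more sophisticated, but the principle is the same:

```python
import math
import re
from collections import Counter

def vectorise(text):
    """Bag-of-words term counts: a crude stand-in for real text mining."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classify(document, labelled_examples):
    """Assign the label of the most similar exemplar document."""
    doc = vectorise(document)
    return max(labelled_examples,
               key=lambda label: cosine(doc, vectorise(labelled_examples[label])))

examples = {
    "competitor products": "rival launch product price feature release",
    "staff news": "appointment promotion retirement staff welcome",
}
print(classify("the rival's new product release is priced aggressively",
               examples))  # → competitor products
```

Such a classifier needs no one to tag each incoming item by hand, which is exactly its appeal; its weakness, as the chapter notes, is that a word-overlap ‘porthole’ may be adequate for routing news stories but not for high-stakes intelligence work.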
The answer, typically, lies in carefully designed hybrid systems that release machines and people each to do the things they do best, to an appropriate performance threshold. ICT can analyse a network and cluster candidate metadata terms for a corporate portal designer to edit. ICT can also suggest alternative synonyms to an untrained user entering keywords, in order to match them to a corporate taxonomy. Users should be able to suggest new terms that they find meaningful in their context, but it normally needs a taxonomy/ICT expert to add them to the official taxonomy.
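The second hybrid above, in which the machine proposes taxonomy matches for a user's free-text keyword and a person makes the final choice, can be sketched with the standard library's fuzzy matcher (the taxonomy fragment here is invented):

```python
import difflib

# Hypothetical fragment of a corporate taxonomy.
TAXONOMY = [
    "customer relations",
    "customer retention",
    "recruitment",
    "risk management",
]

def suggest_terms(user_keyword, taxonomy=TAXONOMY, n=3, cutoff=0.5):
    """Machine side of the hybrid: propose close taxonomy matches for the
    user's keyword. A person reviews the suggestions, and only a taxonomy
    expert adds genuinely new terms to the official scheme."""
    return difflib.get_close_matches(user_keyword, taxonomy, n=n, cutoff=cutoff)

# A misspelt keyword still surfaces the intended official term.
print(suggest_terms("customer retension"))
```

The division of labour is the point: the machine handles the tedious matching at scale, while humans retain control of what counts as an official term.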