4.18 Technologies and explicit knowledge continued
4.18.1 Data mining
Data mining refers to techniques for analysing databases or information systems in order to identify hidden but significant patterns that cannot be detected by standard querying of the database.
Moxon defines data mining as follows:
Data mining is a set of techniques used in an automated approach to exhaustively explore and bring to the surface complex relationships in very large datasets … most likely implemented in relational database management technology. However, these techniques can be, have been, and will be applied to other data representations, including spatial data domains, text-based domains, and multimedia (image) domains.
Data mining … uses discovery-based approaches in which pattern-matching and other algorithms are employed to determine the key relationships in the data. Data mining algorithms can look at numerous multidimensional data relationships concurrently, highlighting those that are dominant or exceptional.
Data mining techniques seek patterns such as associations, sequences and clusters in databases. Ideally, these patterns should have predictive power, so that they can inform planning and decision making. To call the output from such tools ‘knowledge discovery’ is something of an overstatement. We have already seen that what counts as useful knowledge depends on a human's interpretation of the information presented: are the associations, sequences and clusters meaningful and significant with respect to the task at hand?
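To make the idea of an ‘association’ concrete, the following is a minimal sketch in Python of how a tool might surface association rules from a set of transactions. It is an illustration only, not the method of any particular product: the shopping-basket data, the function name pair_rules and the two thresholds are all hypothetical. The measures it uses are the standard ones for association rules: support is the fraction of transactions that contain both items, and confidence is the fraction of transactions containing the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is the contents of one shopping basket.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) between pairs of items.

    The thresholds are illustrative assumptions, not recommended values.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so that each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair occurs too rarely to be of interest
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in pair_rules(TRANSACTIONS):
        print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data the sketch reports two rules, bread -> butter and butter -> bread. Whether either rule amounts to useful knowledge is exactly the question raised above: the algorithm highlights the pattern, but a person still has to judge whether it is meaningful for the task at hand.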