3.3 Web crawlers and metadata
For web search engines there is another, equally important stage: building the index by crawling the web. Programs known as crawlers (or spiders) retrieve information from items published online in order to create the index that users of the search engine will query.
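As a rough sketch (not the code of any real crawler), the extraction step can be illustrated with Python's standard html.parser: from a fetched page, the crawler pulls out pieces such as the title and the description metadata. The page contents here are invented for the example.

```python
from html.parser import HTMLParser

class MetadataExtractor(HTMLParser):
    """Collects the <title> text and <meta name="description"> content."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") == "description":
                self.description = attributes.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# An invented page standing in for one a crawler has fetched:
page = ('<html><head><title>Otters in the wild</title>'
        '<meta name="description" content="Field notes on European otters.">'
        '</head><body>...</body></html>')

extractor = MetadataExtractor()
extractor.feed(page)
print(extractor.title)        # Otters in the wild
print(extractor.description)  # Field notes on European otters.
```

In practice a crawler would also follow the links it finds, queueing further pages to visit; here only the metadata extraction is shown.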
Metadata lies at the heart of search engines and of building the index. It is a shortcut to the contents which allows the search engine to present likely matches to users’ queries in the split-second timings we are used to.
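One way to picture that shortcut is an inverted index: a mapping from each word to the documents containing it, built once at indexing time, so that answering a query needs only a dictionary lookup rather than a scan of every page. A minimal sketch, with invented document texts:

```python
from collections import defaultdict

# Toy collection standing in for crawled pages (texts are invented).
documents = {
    1: "metadata at the heart of search",
    2: "building the index by crawling the web",
    3: "search engines and the web index",
}

# Build the inverted index: word -> set of document ids containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        index[word].add(doc_id)

# A query is now a fast lookup plus a set intersection,
# not a scan of every document in the collection.
def search(*terms):
    results = set(documents)
    for term in terms:
        results &= index.get(term, set())
    return sorted(results)

print(search("index"))          # [2, 3]
print(search("search", "web"))  # [3]
```

The expensive work of reading every document happens once, when the index is built; each query then runs in a fraction of that time, which is what makes split-second responses possible.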
Web search businesses such as Google have become powerful and wealthy companies by developing techniques which appear miraculously to remove the difficulties of making sense of the vast amounts of published information we encounter. They return search results effortlessly, in a simple listing that takes us straight to what we want to know. How do they do it? Much of the ‘magic’ relies on techniques of filtering and ranking: in other words, revealing some of the content while concealing the rest, and ordering the results in a way which the search engine designers believe will be useful to us.
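The filtering-and-ranking idea can be sketched with a toy scoring rule: count how often the query terms occur in each document, conceal the documents that match no term, and order the rest by score. Real engines combine far more elaborate signals (link structure, freshness, personalisation), so this is only an illustration with invented pages.

```python
# Invented pages and a simple term-frequency score.
documents = {
    "page-a": "web search engines crawl the web and rank results",
    "page-b": "metadata helps search engines build an index",
    "page-c": "a history of printed encyclopaedias",
}

def rank(query):
    terms = query.lower().split()
    scored = []
    for name, text in documents.items():
        words = text.split()
        score = sum(words.count(term) for term in terms)
        if score > 0:            # filtering: non-matching pages are concealed
            scored.append((score, name))
    # ranking: pages with more matching terms come first
    return [name for score, name in sorted(scored, reverse=True)]

print(rank("web search"))  # ['page-a', 'page-b']
```

Here "page-a" outranks "page-b" because it contains the query terms three times rather than once, and "page-c" is hidden entirely: the listing the user sees is the product of both decisions.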