Digital humanities: humanities research in the digital age

3.3 Web crawlers and metadata

For web search engines there is another, equally important stage: building the index by crawling the web. Programs known as crawlers (or spiders) retrieve information from items published online in order to create the index that users of the search engine will query.

Metadata lies at the heart of search engines and of index building. It acts as a shortcut to the contents, allowing the search engine to present likely matches to users’ queries in the split-second timings that we are used to.
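To make the idea of crawling and indexing more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how any real search engine works: the two pages in `PAGES` are invented and held in memory rather than fetched from the web, and the names `PageParser` and `crawl` are hypothetical. The crawler follows links from page to page, reads each page's metadata (its title and description), and records which words point to which addresses.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical stand-in for the web: address -> HTML.
# A real crawler would fetch these pages over the network.
PAGES = {
    "http://example.org/a": (
        "<html><head><title>Digital Humanities</title>"
        '<meta name="description" content="Research methods for humanities">'
        '</head><body><a href="http://example.org/b">next</a></body></html>'
    ),
    "http://example.org/b": (
        "<html><head><title>Web Crawlers</title>"
        '<meta name="description" content="How crawlers index the web">'
        '</head><body><a href="http://example.org/a">back</a></body></html>'
    ),
}


class PageParser(HTMLParser):
    """Collects outgoing links and the words found in a page's metadata."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") == "description":
            self.words += (attrs.get("content") or "").lower().split()
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.words += data.lower().split()


def crawl(start_url):
    """Visit every reachable page once and build a word -> addresses index."""
    index = defaultdict(set)
    to_visit, seen = [start_url], set()
    while to_visit:
        url = to_visit.pop()
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        parser = PageParser()
        parser.feed(PAGES[url])
        for word in parser.words:
            index[word].add(url)
        to_visit.extend(parser.links)  # follow links to discover more pages
    return index


index = crawl("http://example.org/a")
print(sorted(index["crawlers"]))
```

Looking up a word in `index` is fast because the work of reading the pages was done in advance, which is the sense in which metadata serves as a shortcut to the contents.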

Web search businesses such as Google have become powerful and wealthy companies by developing techniques which appear to miraculously remove the difficulties of making sense of the vast amounts of published information we encounter. They return search results seemingly without effort, in a simple listing that takes us straight to what we want to know. How do they do it? A lot of the ‘magic’ relies on techniques of filtering and ranking: in other words, revealing some of the content while concealing the rest, and ordering the results in a way which the search engine designers believe will be useful to us.
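Filtering and ranking can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the documents, their `score` values and the `search` function are invented, and the ordering rule (more matching words first, then the higher score) is a deliberately simple stand-in. Commercial search engines use far more elaborate rules, which are not described in this course.

```python
# Hypothetical index entries: each document has a set of terms
# and a made-up score standing in for the designers' notion of usefulness.
DOCUMENTS = [
    {"url": "http://example.org/a",
     "terms": {"digital", "humanities", "research"}, "score": 0.4},
    {"url": "http://example.org/b",
     "terms": {"web", "crawlers", "index"}, "score": 0.9},
    {"url": "http://example.org/c",
     "terms": {"digital", "index", "web"}, "score": 0.7},
]


def search(query, documents, limit=10):
    """Return the addresses of matching documents, best match first."""
    query_terms = set(query.lower().split())
    # Filtering: conceal every document that shares no term with the query.
    matches = [d for d in documents if query_terms & d["terms"]]
    # Ranking: more matching terms first, then the higher score.
    matches.sort(
        key=lambda d: (len(query_terms & d["terms"]), d["score"]),
        reverse=True,
    )
    return [d["url"] for d in matches[:limit]]


results = search("web index", DOCUMENTS)
print(results)
```

Even in this toy version, the choices built into the `key` function decide what the user sees first and what is never shown at all, which is why the designers' assumptions about usefulness matter.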