Information on the web

4.4 Quality, not quantity

4.4.1 Ranking

It is common for a web search to return hundreds, or even millions, of hits; certainly too many to check. But uncannily, the first few hits often contain just what you were looking for. How do search engines manage this seemingly miraculous feat?

The answer lies in techniques used to rank pages so that the 'best' are first. Each search engine uses different techniques, tweaks them continually, and guards the details jealously. But there are some general principles.

One approach is to use information that is on the original page. For example, greater weight can be attached to words that appear in the page title or in headings, or when a word occurs frequently on a single page.
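As a rough illustration of this on-page approach, the sketch below scores a page by counting how often each search word appears, with title and heading matches counting for more. The function names and the three weights are assumptions made for this example only; real search engines do not publish their values.

```python
import re

# Illustrative weights only: assumptions for this sketch, not real engine values.
TITLE_WEIGHT = 5.0
HEADING_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def on_page_score(query, title, headings, body):
    """Score one page for a query using only the text on the page itself."""
    title_words = tokenize(title)
    heading_words = [word for heading in headings for word in tokenize(heading)]
    body_words = tokenize(body)

    score = 0.0
    for term in tokenize(query):
        # Each occurrence adds to the score, so frequent words count for more.
        score += TITLE_WEIGHT * title_words.count(term)
        score += HEADING_WEIGHT * heading_words.count(term)
        score += BODY_WEIGHT * body_words.count(term)
    return score
```

With these weights, a page titled 'Tea guide' with the heading 'Brewing tea' and the body text 'tea tea leaves' scores 10.0 for the query 'tea': 5 for the title, 3 for the heading and 2 for the body.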

A different approach is to include some human activity, albeit indirectly. For example, Google weights a page more heavily if it finds that other pages link to it. The assumption is that people create links only to pages they have found useful. The idea can be applied in a circular manner: a link is worth more if it comes from a page that is itself highly ranked. Another method is to note whenever a page is chosen from the results list and weight that page more heavily; after all, if one person thought it useful, it is likely that someone else will too.
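The circular idea of link-based weighting can be sketched as a repeated calculation in which every page shares its current weight among the pages it links to. This is a simplified sketch in the spirit of that idea, not Google's actual method, whose details are not public. The function name `link_rank`, the `damping` value of 0.85 and the fixed number of `iterations` are assumptions made for the example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Estimate page weights from links.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to a weight; the weights sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally weighted.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline weight.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its weight on, split equally between its links,
                # so links from highly weighted pages are worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its weight over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

For example, if pages `a` and `b` both link to `c`, and `c` links back to `a`, then `c` ends up with the greatest weight, `a` comes second because its one link is from a highly weighted page, and `b`, which nobody links to, comes last.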

Most search engines also combine information from their directory and spidered index. Pages that appear in the directory can be given additional weight and so appear near the top of the hit list in a search. Google applies ranking to its directory, so not only are sites in the directory manually chosen, but they are ranked in order using the techniques above.
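One simple way to give directory pages additional weight is to multiply their scores by a fixed factor before sorting the hit list. The sketch below assumes each page already has a score from some other ranking technique; the name `rank_hits` and the `DIRECTORY_BOOST` value are hypothetical choices for this example.

```python
# Hypothetical multiplier: an assumption for this sketch, not a published value.
DIRECTORY_BOOST = 2.0


def rank_hits(scores, directory_pages):
    """Return page addresses sorted best first.

    scores maps each page address to a score from the spidered index.
    directory_pages is the set of addresses that also appear in the directory.
    """
    boosted = {}
    for url, score in scores.items():
        # Pages chosen for the directory get extra weight.
        factor = DIRECTORY_BOOST if url in directory_pages else 1.0
        boosted[url] = score * factor
    return sorted(boosted, key=lambda url: boosted[url], reverse=True)
```

With scores of 3.0, 2.0 and 1.0 for pages `x`, `y` and `z`, and only `y` in the directory, the boosted score for `y` becomes 4.0 and it moves to the top of the list.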

Activity 28

Why do you think search engines are reluctant to reveal details of their ranking techniques?

4.4.2 Query rewriting

Some search sites provide additional features that help you to refine your search. For example, the search site can keep a log of all the searches that people make. When you enter a search, the search engine can then suggest similar searches that other people have made.
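A minimal sketch of suggesting searches from a log is shown below. It treats two searches as similar when they share words, and prefers searches that share more words and that appear more often in the log. This measure of similarity, and the name `suggest_queries`, are assumptions for the example; real search sites may work quite differently.

```python
from collections import Counter


def normalise(query):
    """Lower-case a query and collapse repeated spaces."""
    return " ".join(query.lower().split())


def suggest_queries(query, query_log, limit=3):
    """Suggest logged searches that share at least one word with the query."""
    current = normalise(query)
    words = set(current.split())

    # Count how often each distinct search appears in the log.
    counts = Counter(normalise(logged) for logged in query_log)

    candidates = []
    for logged, count in counts.items():
        if logged == current:
            continue  # do not suggest the search that was just entered
        overlap = len(words & set(logged.split()))
        if overlap:
            candidates.append((overlap, count, logged))

    # Most shared words first, then most frequent, then alphabetical.
    candidates.sort(key=lambda c: (-c[0], -c[1], c[2]))
    return [logged for overlap, count, logged in candidates[:limit]]
```

Given a log containing 'green tea', 'green tea benefits' (twice), 'black tea' and 'coffee beans', a search for 'green tea' would suggest 'green tea benefits' first and 'black tea' second, and would not suggest 'coffee beans'.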

Activity 29

Go to AltaVista and search for a topic that interests you. Look at the related queries. Would any of them help you to widen your search?

Another possibility is to extend your search from a page that has proved useful. For example, Google offers a 'similar page' search.

[www.google.com]
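The text does not say how a 'similar page' search works, so the sketch below uses a stand-in technique: it compares the sets of words on two pages and treats pages that share a large proportion of their words as similar. The function names and the sample data format (a dictionary mapping page addresses to page text) are assumptions for this example.

```python
import re


def word_set(text):
    """Return the set of distinct lower-case words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similar_pages(target_url, pages, limit=3):
    """List the pages whose words overlap most with the target page.

    pages maps each page address to the text of that page.
    """
    target = word_set(pages[target_url])
    scored = []
    for url, text in pages.items():
        if url == target_url:
            continue
        words = word_set(text)
        union = target | words
        # Shared words as a fraction of all words on either page.
        similarity = len(target & words) / len(union) if union else 0.0
        scored.append((similarity, url))

    # Most similar first; ties are broken alphabetically by address.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for similarity, url in scored[:limit] if similarity > 0]
```

Pages that share no words at all with the useful page are left out of the result rather than listed with a similarity of zero.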

Another approach is to make use of the directory's categories. For example, a hit found from a full-text search that also appears in the directory may be shown with a link to the appropriate directory category. By following this link you can see pages that researchers have put in the same category.

[www.yahoo.com]
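Linking hits to directory categories can be sketched as two simple lookups: one that attaches a category to each hit that appears in the directory, and one that lists the other pages filed under that category. The names `annotate_hits` and `pages_in_category`, and the representation of the directory as a dictionary from page address to category, are assumptions for this example.

```python
def annotate_hits(hits, directory):
    """Pair each hit with its directory category, or None if it has none.

    directory maps a page address to the category it was filed under.
    """
    return [(url, directory.get(url)) for url in hits]


def pages_in_category(category, directory):
    """Return, in alphabetical order, every page filed under the category."""
    return sorted(url for url, cat in directory.items() if cat == category)
```

Following the category link for a hit then amounts to calling `pages_in_category` with the category that `annotate_hits` returned for it.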

Search sites are continually looking for ways in which to improve the quality of their results, so we can expect new techniques to appear.

Activity 30

You have learnt some of the techniques used by search engines. How can you use that knowledge to help you find information you want?

T180_5
