
Confusing Terms In Statistics

Updated Monday 10th January 2005

Kevin McConway explains why, for a statistician, ‘reliable’ and ‘significant’ have specific meanings


Most professions have a tendency to use jargon that is impenetrable to outsiders. Statisticians are no exception to this general rule. If you look in a statistics textbook, you’ll probably be unlucky enough to find words like ‘kurtosis’ and ‘heteroscedasticity’ (unless it’s an American book, when they’ll spell it ‘heteroskedasticity’).

Don’t worry, I’m not going to explain what these mean; but statisticians display another kind of jargon use that can be even more confusing. We use everyday words, but give them special meanings that differ from the meaning in everyday use. Sometimes the difference in meaning is small, sometimes it is large. A full list of such words would be rather long.

It would include: bias, block, bootstrap, censored, contrast, deviance, deviation, distribution, error, expected, hazard, improper, influence, information, jack-knife, kernel, leverage, likelihood, mode, model, moment, moral, normal, pie, regression, scree, stress, tail, variance; and many others.

Actually, this is not as confusing as you might think. With some of the words, the technical statistical meaning is so close to the everyday meaning that no important confusion is likely to arise. Others are generally used in a technical context, so that it is clear that they are not being used in their everyday sense.

However, this is not always the case. I want to describe two words, each of which has a well-understood everyday meaning, and a technical statistical meaning that is rather different. Both of these words are used in contexts where it may not be clear whether they have the everyday meaning or the technical meaning. The two words are ‘significant’ and ‘reliable’.

Significant

First, ‘significant’. Its statistical meaning is a little complicated. Suppose I’ve invented a new pill that is supposed to cure headaches. I’ve tried giving it to a few people with headaches and most of them got better, but I know that headaches often go away on their own, and I know that there are already some pretty effective headache cures around.

So I decide to do an experiment. I get a group of volunteers who all have headaches. I choose half of them at random and give them my new pill. I give the other half a standard dose of aspirin. After an hour I ask all of them whether their headache has gone away, and I record the results. I find that more of the people who took my new pill got better than did the people who took aspirin.

Does that mean my pill works? Well, it might, or it might not. Perhaps, by chance, the group to whom I gave the new pill happened to include more people whose headache would have got better anyway, whatever they had taken. But I can do a calculation that will throw light on this possibility.

I can calculate what’s known as a P value, which is a kind of probability: roughly, the probability of seeing a difference at least as big as the one I observed if the new pill were really no better than aspirin. The smaller this P value is, the harder it is to explain my results by chance alone. (The exact connection between the P value and the likelihood that my results are due to chance is a little complicated, but let’s ignore that detail.)

If my P value is small enough, I conclude that my new pill probably does work better than aspirin. In the jargon, I would say that the difference between the two groups is statistically significant.
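To make the calculation concrete, here is a minimal sketch of one way a P value for the headache experiment could be worked out. The method is an assumption chosen for illustration (it is the one usually called Fisher’s exact test, in its one-sided form), and the trial figures are invented: 50 volunteers in each group, with 35 cured by the new pill and 25 by aspirin. The function name chance_p_value is likewise made up. The code counts, out of all the ways the volunteers could have been split at random, the proportion that would put at least as many of the people who got better into the new-pill group.

```python
from math import comb


def chance_p_value(cured_new, n_new, cured_std, n_std):
    """One-sided P value for 'new pill beats standard pill'.

    Treats the people who got better as fixed and asks how often a purely
    random split into two groups would put at least `cured_new` of them
    into the new-pill group (the hypergeometric tail, as in Fisher's
    exact test).
    """
    total = n_new + n_std
    total_cured = cured_new + cured_std
    all_splits = comb(total, n_new)  # every way of choosing the new-pill group
    most_possible = min(n_new, total_cured)
    extreme_splits = sum(
        comb(total_cured, k) * comb(total - total_cured, n_new - k)
        for k in range(cured_new, most_possible + 1)
    )
    return extreme_splits / all_splits


# Invented figures, for illustration only.
p_value = chance_p_value(cured_new=35, n_new=50, cured_std=25, n_std=50)
print(f"P value: {p_value:.3f}")
```

With these made-up numbers the P value comes out below 0.05, a commonly used cut-off, so the difference would be called statistically significant. With 30 cured in each group it would not.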

There’s already a potential cause for confusion here, in that it’s a small P value that gives a significant result (in the statistical sense), not a big P value as you might intuitively expect. Most statistics students get confused over this at some point.

But the important thing to remember is this: if I say a difference is statistically significant, all I mean is that ‘we can pretty well rule out the possibility that the result is due to chance alone.’

In its everyday use, ‘significant’ means ‘having a meaning or importance’. But a difference that is statistically significant might actually have very little importance in a practical sense.

Suppose I’d actually done my headache pill experiment on a huge group of people. The experiment might indicate that my new pill only cures, let’s say, one more headache in a thousand than aspirin does, but even a small difference like this may be statistically significant if the number of people in the experiment is large enough.

If my new pills cost, let’s say, a hundred times as much as aspirin, this very slightly increased performance may be of no practical significance at all, even if the difference is statistically significant. It could also happen that a result is not statistically significant but still has practical significance — we can’t rule out the possibility that the result is just due to chance, but it might indicate the need for a bigger experiment.
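The effect of sample size on a tiny difference, such as one more headache cured in a thousand, can be sketched in a few lines. Everything specific here is an assumption for illustration: the cure rates (601 versus 600 per thousand), the group sizes, the function name, and the use of a standard normal approximation for comparing two proportions rather than any method named in this article.

```python
from math import erfc, sqrt


def two_sided_p_value(rate_new, rate_std, n_per_group):
    """Approximate two-sided P value for a difference between two cure rates,
    using the normal approximation with a pooled standard error."""
    pooled = (rate_new + rate_std) / 2  # equal group sizes assumed
    std_error = sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(rate_new - rate_std) / std_error
    return erfc(z / sqrt(2))  # P(|Z| >= z) for a standard normal Z


# Invented cure rates: 601 versus 600 headaches cured per thousand people.
for n in (10_000, 1_000_000, 10_000_000):
    p = two_sided_p_value(0.601, 0.600, n)
    print(f"{n:>10,} per group: P value = {p:.6f}")
```

The difference between the pills is the same in every row. Only the number of volunteers changes, and with enough of them the P value becomes very small even though the practical benefit has not changed at all.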

Reliable? Can this cat trust the bathroom scales? [Image by alasam under CC-BY-NC-ND licence]

Reliable

Now let’s turn to ‘reliable’. Suppose I have a way of measuring something. Many measuring techniques won’t always give you exactly the same result if you repeat the measurement. But, in the statistical sense, a measuring method is said to be reliable if it tends to give similar numbers when you repeat the measurement.

At home I’ve got a very accurate balance for weighing things, and I also have a set of bathroom scales that is rather old. If I weigh an object on the balance, and then weigh it again, I might not get exactly the same result, but I know that the two results will differ by only a small amount.

However, if I weigh something twice on my old bathroom scales, the results will differ more, on average. So, in the statistical sense, the balance is more reliable than the bathroom scales.

However, that doesn’t necessarily mean that the balance will give more accurate results. On both weighing devices you can set the zero; in other words you can see what reading they give when nothing is on the weighing surface, and adjust this reading so it is zero.

Suppose I forget to do this with the balance, and it actually reads 200 grams with nothing on its weighing surface. Then I weigh the same thing several times. I would still get more or less the same result every time, but the recorded weights would all be about 200 g too big. Now suppose I do the same thing with the bathroom scales, but adjust the zero properly before I begin.

The results of repeated weighings will still differ more on the bathroom scales than on the balance, so the balance is still more reliable in the statistical sense. But, on average, the results from the bathroom scales might be nearer to the true weight of the object.

The clear message is that reliability, in the statistical sense, is not the only thing we should be concerned about. Validity, or lack of bias, is important too. Both terms refer to how close the average result of the measuring process is to the true value of what’s being measured.

With the zero set wrongly on the balance and right on the bathroom scales, measurements made on the balance are more reliable but less valid than those on the bathroom scales. These ideas of reliability and validity are used with social and psychological measurements rather more than with physical measurements, but the basic ideas are the same.
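The contrast between the two weighing devices can be imitated with a small simulation. This is only a sketch under stated assumptions: the 200 g zero error comes from the example above, but the true weight of 1000 g, the amount of random scatter for each device (1 g and 50 g) and the number of weighings are all invented, as are the names in the code.

```python
import random
import statistics

TRUE_WEIGHT_G = 1000.0  # assumed true weight of the object


def weigh(rng, zero_error_g, spread_g, times=1000):
    """Simulate repeated weighings: true weight + zero error + random noise."""
    return [TRUE_WEIGHT_G + zero_error_g + rng.gauss(0, spread_g)
            for _ in range(times)]


rng = random.Random(2005)  # fixed seed so the run is repeatable

# Balance: zero set wrongly (+200 g) but very little scatter.
balance = weigh(rng, zero_error_g=200.0, spread_g=1.0)
# Bathroom scales: zero set correctly but a lot of scatter.
scales = weigh(rng, zero_error_g=0.0, spread_g=50.0)

for name, readings in (("balance", balance), ("bathroom scales", scales)):
    spread = statistics.stdev(readings)               # small = reliable
    bias = statistics.mean(readings) - TRUE_WEIGHT_G  # near 0 = valid
    print(f"{name:>15}: spread {spread:6.1f} g, bias {bias:+7.1f} g")
```

The spread of the readings stands in for reliability and the gap between their average and the true weight stands in for validity. In this simulation the balance has the smaller spread but the larger bias, which is the ‘more reliable but less valid’ situation described above.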

The upshot is that a measurement which is reliable in the statistical sense might not be reliable in the everyday sense — you might not want to rely on it or trust it, if its validity is not great enough.

So, if you hear someone talking about a result being ‘significant’ or ‘reliable’, you must always try to clarify the sense in which the words are being used. Maybe they aren’t saying quite what it sounds as if they are saying!

 
