4.2 Question statistics
Facility index (F):
The mean score of students on the item.
|F||Interpretation|
|5 or less||Extremely difficult or something wrong with the question.|
|6 - 10||Very difficult.|
|11 - 20||Difficult.|
|20 - 34||Moderately difficult.|
|35 - 64||About right for the average student.|
|66 - 80||Fairly easy.|
|81 - 89||Easy.|
|90 - 94||Very easy.|
|95 - 100||Extremely easy.|
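As a rough illustration, the facility index can be computed as the mean score expressed as a percentage of the question maximum. This is a minimal sketch; the function name `facility_index` and the sample marks are hypothetical and are not taken from any iCMA software.

```python
def facility_index(scores, max_score):
    """Mean score on the question as a percentage of the question maximum."""
    if not scores:
        raise ValueError("need at least one score")
    return 100.0 * sum(scores) / (len(scores) * max_score)


# Hypothetical data: five students' marks on a question marked out of 2.
scores = [2, 1, 0, 2, 1]
print(facility_index(scores, max_score=2))  # 60.0
```

A value of 60 falls in the "About right for the average student" band of the table.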
Standard deviation (SD):
A measure of the spread of scores about the mean, and hence of the extent to which the question might discriminate. If F is very high or very low, it is impossible for the spread to be large. Note, however, that a good SD does not automatically ensure good discrimination. An SD value in the table of less than about a third of the question maximum (i.e. 33%) is not generally satisfactory.
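The sketch below shows one way to express the SD as a percentage of the question maximum, and why an extreme F limits the spread. It assumes the population standard deviation is wanted (the text does not say which form is used) and that scores lie between 0 and the question maximum. Under that assumption the SD, in percent, can never exceed the square root of F × (100 − F). The function names are hypothetical.

```python
import math
import statistics


def sd_percent(scores, max_score):
    """Population SD of the scores as a percentage of the question maximum."""
    return 100.0 * statistics.pstdev(scores) / max_score


def max_sd_percent(facility):
    """Upper bound on the SD (in %) for a facility index F (in %),
    assuming every score lies between 0% and 100% of the maximum."""
    return math.sqrt(facility * (100.0 - facility))


# Hypothetical data: five students' marks on a question marked out of 2.
scores = [2, 1, 0, 2, 1]
print(round(sd_percent(scores, max_score=2), 1))  # 37.4
print(round(max_sd_percent(95.0), 1))             # 21.8
```

With F at 95%, the SD cannot rise above roughly 22%, so it cannot reach the 33% guideline.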
Random guess score (RGS):
This is the mean score students would be expected to get for a random guess at the question. Random guess scores are only available for questions that use some form of multiple choice. All random guess scores are for deferred feedback only and assume the simplest situation, e.g. for multiple response questions, that students will be told how many answers are correct.
Values above 40% are unsatisfactory, which shows that True/False questions must be used sparingly in summative iCMAs.
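For the simplest case, a question where the student picks exactly one option and each option is equally likely to be guessed, the random guess score is the mean of the option grades. The sketch below rests on that assumption only. It does not cover multiple response questions, and the function name is hypothetical.

```python
def random_guess_score(option_grades):
    """Expected score (in %) for a uniform random pick of one option.

    option_grades holds the fraction of the mark awarded for each option,
    e.g. 1.0 for the correct option and 0.0 for a wrong one.
    """
    if not option_grades:
        raise ValueError("need at least one option")
    return 100.0 * sum(option_grades) / len(option_grades)


print(random_guess_score([1.0, 0.0]))            # True/False: 50.0
print(random_guess_score([1.0, 0.0, 0.0, 0.0]))  # one correct option in four: 25.0
```

A True/False question scores 50% on this measure, above the 40% threshold, whereas a four-option question scores 25%.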
Intended weight:
The question weight expressed as a percentage of the overall iCMA score.
Effective weight:
An estimate of the weight the question actually has in contributing to the overall spread of scores. The effective weights should add to 100%, but read on.
The intended weight and effective weight are meant to be compared. If the effective weight is greater than the intended weight, the question has a greater share in the spread of scores than may have been intended. If it is less than the intended weight, the question is not having as much effect in spreading out the scores as was intended.
The calculation of the effective weight relies on taking the square root of the covariance of the question scores with overall performance. If a question’s scores vary in the opposite way to the overall score, the covariance is negative and its square root cannot be taken. This would indicate a very odd question which is testing something different from the rest. The computer cannot calculate the effective weights of such questions, and warning message boxes are displayed.
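The calculation described above might be sketched as follows. The text only states that the square root of the covariance with overall performance is used and that the weights should add to 100%. Dividing each square root by the sum of the square roots is an assumption made here to satisfy that second condition. All names and the sample scores are hypothetical.

```python
import math


def covariance(xs, ys):
    """Population covariance of two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def effective_weights(score_rows):
    """score_rows[s][q] is the weighted score of student s on question q.

    Returns one effective weight (in %) per question. Raises ValueError
    when a question's covariance with the total is negative, which mirrors
    the warning described in the text.
    """
    totals = [sum(row) for row in score_rows]
    roots = []
    for q in range(len(score_rows[0])):
        column = [row[q] for row in score_rows]
        cov = covariance(column, totals)
        if cov < 0:
            raise ValueError(f"question {q}: negative covariance, no effective weight")
        roots.append(math.sqrt(cov))
    total_root = sum(roots)
    if total_root == 0:
        raise ValueError("no spread in the scores")
    # Assumed normalisation so that the weights add to 100%.
    return [100.0 * root / total_root for root in roots]


# Hypothetical data: four students, three questions.
rows = [[1, 1, 2], [0, 1, 1], [1, 0, 2], [0, 0, 0]]
print([round(w, 1) for w in effective_weights(rows)])
```

Comparing each returned value with the intended weight of the same question shows which questions contribute more or less to the spread than was planned.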
Discrimination index:
This is the correlation between the weighted scores on the question and those on the rest of the test. It indicates how effective the question is at sorting out able students from those who are less able. The results should be interpreted as follows:
|50 and above||Very good discrimination|
|30 - 50||Adequate discrimination|
|20 - 29||Weak discrimination|
|0 - 19||Very weak discrimination|
|Negative||Question probably invalid|
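A minimal sketch of this statistic follows, assuming it is the Pearson correlation, expressed as a percentage, between each student's score on the question and their score on the rest of the test (total minus the question). The band boundaries in `interpret` are read off the table above. The function names and sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation, or None when either list has no spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def discrimination_index(question_scores, total_scores):
    """Correlation (in %) between the question and the rest of the test."""
    rest = [t - q for q, t in zip(question_scores, total_scores)]
    r = pearson(question_scores, rest)
    return None if r is None else 100.0 * r


def interpret(di):
    """Map a discrimination index to the bands in the table."""
    if di < 0:
        return "Question probably invalid"
    if di < 20:
        return "Very weak discrimination"
    if di < 30:
        return "Weak discrimination"
    if di < 50:
        return "Adequate discrimination"
    return "Very good discrimination"


# Hypothetical data: one question's scores and the test totals for four students.
di = discrimination_index([1, 0, 1, 0], [4, 2, 3, 0])
print(round(di, 1), interpret(di))
```

Subtracting the question's own score from the total keeps the question from being correlated with itself, which would otherwise inflate the index.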
Discrimination efficiency:
This statistic attempts to estimate how good the discrimination index is relative to the difficulty of the question.
An item which is very easy or very difficult cannot discriminate between students of different ability, because most of them get the same score on that question. Maximum discrimination requires a facility index in the range 30% - 70% (although such a value is no guarantee of a high discrimination index).
The discrimination efficiency will very rarely approach 100%, but values in excess of 50% should be achievable. Lower values indicate that the question is not nearly as effective at discriminating between students of different ability as it might be and therefore is not a particularly good question.
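The text does not give a formula for the discrimination efficiency. One plausible way to estimate it, offered here purely as an assumption, is to divide the covariance between the question scores and the rest-of-test scores by the largest covariance those same scores could produce, which occurs when both lists are sorted into the same order. The names and data below are hypothetical.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def discrimination_efficiency(question_scores, rest_scores):
    """Observed covariance as a percentage of the maximum achievable.

    The maximum is taken to be the covariance obtained when the same two
    sets of scores are paired in sorted order (an assumed definition).
    Returns None when there is no spread to work with.
    """
    observed = covariance(question_scores, rest_scores)
    best = covariance(sorted(question_scores), sorted(rest_scores))
    if best == 0:
        return None
    return 100.0 * observed / best


# Hypothetical data: five students' question scores and rest-of-test scores.
de = discrimination_efficiency([1, 1, 0, 1, 0], [4, 3, 3, 1, 0])
print(round(de, 1))  # 41.2
```

On this reading, the value of about 41% is below the 50% that the text suggests should be achievable.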