2.2.1 Pattern match

The Pattern match question type is used to test if a short free-text student response matches a specified response pattern.

Pattern match is a more sophisticated alternative to the Short answer question type and offers:

  • the ability to cater for misspellings, with and without an English dictionary
  • specification of synonyms and alternative phrases
  • flexible word order
  • checks on the proximity of words.

For certain types of response it has been shown to provide an accuracy of marking that is on a par with, or better than, that provided by a cohort of human markers.

Pattern match works on the basis that you have a student response which you wish to match against any number of response matching patterns. Each pattern is compared in turn until a match is found and feedback and marks are assigned.

The key to using Pattern match is in asking questions that you have a reasonable hope of marking accurately. Hence writing the question stem is the most important part of writing these questions.

2.21 The Question text, Grade and General feedback for a Pattern match question.

Question name: A descriptive name is sensible. This name will not be shown to students.

Question text: You may use the full functionality of the editor to state the question.

Placing the response box

The response box may be placed within the question rubric by including a sequence of underscores e.g. _____. At runtime the response box will replace the underscores. The minimum number of underscores required to trigger this action is 5. The size of the input box may also be specified by __XxY__ e.g. __20x1__ will produce a box 20 columns wide and 1 row high. If no underscores are present the response box will be placed after the question.

Default mark: Decide on how to score your questions and be consistent.

General feedback: We recommend that all questions should have this box completed with the correct answer and a fully worked explanation. The contents of this box will be shown to all students irrespective of whether their response was correct or incorrect. For the Pattern match question type you cannot rely on using the machine generated 'Right answer' (from the iCMA definition form).

2.22 Options for entering answers

Case sensitivity: No or yes.

Allow use of subscript/superscript: No or Yes. Any subscripts entered by the student are contained in their response between the standard tags of and or and . For example:

  • The formula for water is H2O
  • The speed of light is approximately 3x108 m s-1

At run time keyboard users may move between normal, subscript and superscript by using the up-arrow and down-arrow keys. You may wish to include this information within your question. If you do please note that the up-arrow and down-arrow provided in the HTML editor’s ‘insert custom characters’ list are not spoken by a screen reader and you should also include the words ‘up-arrow’ and ‘down-arrow’.

If answer is more than 20 words: We strongly recommend that you limit responses to 20 words. Allowing unconstrained responses often results in responses that are both right and wrong – which are difficult to mark consistently one way or the other.

Check spelling of student: How many ways do you know of to spell ‘temperature’? We’ve seen 14! You will improve the marking accuracy by insisting on words that are in Moodle system dictionary.

Add these words to dictionary: When dealing with specialised scientific, technical and medical terms that are not in a standard dictionary it is most likely that you will have to add them by using this field. Enter your words leaving a space between them.

Convert the following characters to space: In Pattern match words are defined as sequences of characters between spaces. The exclamation mark and question mark are also taken to mark the end of a word. The period is a special case; as a full stop it is also a word delimiter but as the decimal point it is not. All other punctuation is considered to be part of the response but this option lets you remove it.

2.23 Fields for entering synonyms (not associated with the SI units question)

Words and synonyms: All words and synonyms are specified as they are to be applied by the response matching. They do not have to be full words but can be stems with a wildcard.

Synonyms may only be single words i.e. the ability to specify alternate phrases in synonym lists is not allowed.

From the example above any occurrence of the word oil in the response match will be replaced by oil|glycer*|paraf* before the match is carried out.

2.24 A response match

Answer: This example shows that Pattern match will support complex response matching.

Take the first answer field match_o(ampere candela kelvin mole) is the exact match for the four words with the additional feature that the matching option 'o' allows the words to be given in any order.

The answer field match_ow(ampere candela kelvin mole) requires the same four words, again in any order, but also allows other words.

The third answer field match_mow(ampere candela kelvin mole) allows for misspellings which are still in the dictionary e.g. mule instead of mole.

Please see the section on Pattern match syntax for a full description.

Grade: Between ‘none’ and 100%. At least one response must have a mark of 100%.

Feedback: Specific feedback that is provided to anyone whose response is matched by the response matching rule in Answer.

2.25 Feedback to all non-matched responses

The feedback for all non-matched responses should go into the ‘Any other answer’ field.

2.26 The multiple tries section

________________________________________________________________________

2.27

The basic unit of the student response that Pattern match operates on is the word, where a word is defined as a sequence of characters between spaces. The full stop (but not the decimal point), exclamation mark and question mark are also treated as ending a word.

Numbers are special instances of words and are matched by value and not by the form in which they are given. match_w(25 ms-1)will match the following correct responses; 25 ms-1, 2.5e1 ms-1, 2.5x10-1 ms-1

With the exception of numbers and the word terminators (? and !) Pattern match matches what it is given. Whether case does or does not matter is left to the author to decide as is the significance of punctuation such as ,;: and others.

The response is treated as a whole with the exception that words that are required to be in proximity must also be in the same sentence.

2.28

Pattern match syntax

The match syntax can be considered in three parts.

  1. the matching options e.g. mow
  2. the words to be matched e.g.tom dick harry together with the in-word special characters.
  3. and, or and not combinations of matches e.g. match_any()
2.29
2.1 The matching options
Matching optionSymbolDescription
allowExtraCharscExtra characters can be anywhere within the word. Authors are expected to omit allowExtraChars when using the misspelling options below.
allowAnyWordOrderoWhere multiple words are to be matched they can be in any order.
allowExtraWordswExtra words beyond those being searched for are accepted.
misspelling: allowReplaceCharmrWill match a word where one character is different to those specified in the pattern. The pattern word must be 4 characters or greater, excluding wildcards, for replacement to kick in. Authors are expected to omit allowExtraChars when using this option.
misspelling: allowTransposeTwoCharsmtWill match a word where two characters are transposed. The pattern word must be 4 characters or greater, excluding wildcards, for transposition to kick in. Authors are expected to omit allowExtraChars when using this option.
misspelling: allowExtraCharmxWill match a word where one character is extra to those specified in the pattern. The pattern word must be 3 characters or greater, excluding wildcards, for extra to kick in. Authors are expected to omit allowExtraChars when using this option.
misspelling: allowFewerCharmfWill match a word where one character is missing from those specified in the pattern. The pattern word must be 4 characters or greater, excluding wildcards, for fewer to kick in. Without this 'no' would be reduced to just matching 'n' or 'o'. Authors are expected to omit allowExtraChars when using this option.
misspellingmThis combines the four ways of misspelling a word described above i.e. m is equivalent to mxfrt. Authors are expected to omit allowExtraChars when using this option.
misspellingsm2Allows two misspellings, as defined by option 'm', in pattern words of 8 characters or more, excluding wildcards. Authors are expected to omit allowExtraChars when using this option. N.b. use this option sparely. It introduces a huge number of possible acceptable spellings all of which have to be checked and the time taken for the match to complete goes up and up.
allowProximityOf0p0No words, or full stops, are allowed in between any words specified in the proximity sequence.
allowProximityOf1p1One word is allowed in between any two words specified in the proximity sequence. The words must not span sentences.
allowProximityOf2p2(Default value) Two words are allowed in between any two words specified in the proximity sequence. The words must not span sentences.
allowProximityOf3p3Three words are allowed in between any two words specified in the proximity sequence. The words must not span sentences.
allowProximityOf4p4Four words are allowed in between any two words specified in the proximity sequence. The words must not span sentences.

The matching options are appended to the word match with an intervening underscore and may be combined. A typical match combines the options ‘mow’ to allow for

  1. misspellings
  2. any word order
  3. any extra words

and is written match_mow(words to be matched).

________________________________________________________________________

2.30

Within a word 'special characters' provide more localised control of the patterns.

2.2 Special characters
Special characterSymbolDescription
Word ANDspace'space' delimits words and acts as the logical AND.
Word OR|| between words indicates that either word will be matched. | delimits words and acts as the logical OR.
Proximity control_Words must be in the order given and with no more than n (where n is 0 - 4) intervening words. All words under the proximity control must be in the same sentence. _ delimits words and also acts as logical 'AND'. Other words included in the match that are not under proximity control must be outside the words under the proximity control in the response for them to be matched.For example: 'match(abcd_ffff ccc)'does match the response 'abcd ffff ccc' but does not match the response 'abcd ccc ffff'
Word groups[ ][ ] enables multiple words to be accepted as an alternative to other single words in OR lists. [ ] may not be nested. Single words may be OR'd inside [ ]. Where a word group is preceded or followed by the proximity control the word group is governed by the proximity control rule that the words must be in the order given.
Single character wildcard?Matches any single character.
Multiple character wildcard*Matches any sequence of characters including none.

It is possible to match some of the special characters by ‘escaping’ them with the ‘\’ character. So match(\|) will match ‘|’. Ditto for _, [, ], and *. And if you wish to match round brackets then match(\(\)) will match exactly ‘()’.

________________________________________________________________________

2.31
2.3 Combining matches
match_all()All matches contained within the brackets must be true.match_all(
  • match_mow(first)
  • match_mow(second)
)
match_any()Just one of the matches contained within the brackets must be true.match_any(
  • match_mow(first)
  • match_mow(second)
)
not()The match within the brackets must be false.not (
  • match_any(
    • match_mow(first)
    • match_mow(second)
  • )
)

match_all(), match_any()and not() may all be nested.

________________________________________________________________________

2.32
2.4 Pattern match examples
Student responseMatching optionsPattern matchmatch method return
tom dick harryemptytom dick harryTrue. This is the exact match.
thomasctomTrue. Extra characters are allowed anywhere within the word.
tom, dick and harrywdickTrue. Extra words are allowed anywhere within the sentence.
harry dick tomotom dick harryTrue. Any order of words is allowed.
rickmdickTrue. One character in the word can differ.
rick and harry and tommowtom dick harryTrue.
dick and harry and thomascowtom dick harryTrue.
arthur, harry and sidmowtom|dick|harryTrue. Any of tom or dick or harry will be matched.
tom, harry and sidmowtom|dick harry|sidTrue. The pattern requires either tom or dick AND harry or sid. Note that 'tom,' is only allowed because m allows the extra character, the comma, in 'tom,'.
tom was mesmerised by maudmow[tom maud]|[sid jane]True. The pattern requires either (tom and maud) or (sid and jane).
rickempty?ickTrue. The first character can be anything.
haroldemptyhar*True. Any sequence of characters can follow 'har'.
tom married maud, sid married jane.mowtom_maudTrue. Only one word is between tom and maud.
maud married tom, sid married jane.mowtom_maudFalse. The proximity control also specifies word order and over-rides the allowAnyWordOrder matching option.
tom married maud, sid married jane.mowtom_janeFalse. Only two words are allowed between tom and jane.
tom married maudmowtom|thomas marr* maudTrue.
maud marries thomasmowtom|thomas marr* maudTrue.
tom is to marry maudmowtom|thomas marr* maudTrue.
tempraturm2owtemperatureTrue. Two characters are missing.
temporaturm2owtemperatureTrue. Two characters are incorrect; one has been replaced and one is missing.

________________________________________________________________________

2.33

How can you possibly guess the multiplicity of phrases that your varied student cohort will use to answer a question? Of course you can’t, but you can record everything and over time you will build a bank of student responses on which you can base your response matching. And gradually you might be surprised at how well your response matching copes.

Before we describe the response matching it’s worth stressing:

  1. The starting point is asking a question that you believe you will be able to mark accurately.
  2. How you phrase the question can have a significant impact.
  3. Pattern match works best when you are asking for a single explanation that you will mark as simply right or wrong. You will find that dealing with multiple parts in a question or apportioning partial marks is a much harder task that will quickly turn into a research project.
  4. A bank of marked responses from real students is an essential starting point for developing matching patterns. Consider using the question in a human marked essay to get your first bank of student responses. Alternatively first use the question in a deferred feedback test where marks are allocated once all responses are received.
  5. This question type possibly more than all others demands that you monitor student responses and amend your response matching as required. There will come a point at which you believe that the response matching is ‘good enough’ and might wonder ‘how many responses might I need to check?’ Unfortunately there is no easy answer to this question. Sometimes 200 responses will be sufficient. At other times you may have to go much further than this. If you do bear in mind that
    1. Asking students to construct their response, as opposed to choosing it from a list, asks more of the student i.e. it is worth your while to do this.
    2. Human markers following a mark scheme are fallible too. Ditto.
  6. Stemming is the accepted method of catering for different word endings e.g. mov* will cater for ‘moved’ and ‘moves’ and ‘moving’ etc.
  7. Students do not type responses that are deliberately wrong. Why should they? This is in contrast to academics who try to defeat the system. Consequently you will not find students entering the right response but preceding it with ‘it is not..’ And consequently you do not have to match deliberate errors.
  8. The proximity control enables you to link one word to another.
  9. It is often the case that marking obvious wrong responses early in your matching scheme will improve your overall accuracy.
  10. Dealing with responses that contain both right and wrong answers requires you to take a view. You will either mark all such responses right or wrong. The computer will carry out your instructions consistently c.f. asking a group of human markers to apply your mark scheme consistently.
  11. You should aim for an accuracy of >95% (in our trials our human markers were in the range 92%-96%). But, of course, aim as high as is reasonable given the usual diminishing returns on continuing efforts.

History

The underlying structure of the response matching described here was developed in the Computer Based Learning Unit of Leeds University in the 1970s and was incorporated into Leeds Author Language. The basic unit of the word, the matching options of allowAnyChars, allowAnyWords, allowAnyOrder and the word OR feature all date back to Leeds Author Language.

In 1976 the CALCHEM project which was hosted by the Computer Based Learning Unit, the Chemistry Department at Leeds University and the Computer Centre of Sheffield Polytechnic (now Sheffield Hallam University) produced a portable version of Leeds Author Language.

A portable version for microcomputers was developed in 1982 by the Open University, the Midland Bank (as it then was; now Midland is part of HSBC) and Imperial College. The single and multiple character wildcards were added at this time.

The misspelling, proximity and Word groups in 'or' lists additions were added as part of the Open University COLMSCT projects looking at free text response matching during 2006 - 2009.

References

Philip G. Butcher and Sally E. Jordan, A comparison of human and computer marking of short free-text student responses, Computers & Education 55 (2010) 489-499

2.2.2 Short answer