Language in professional life

8 Building your own corpus

In her interview, Maggie Charles described how she encourages each of her students to build their own small corpus of texts within their discipline area so that they can explore how language is used. This enables students to be researchers and investigate language in use for themselves. This section aims to show that building a corpus isn’t as difficult or daunting as it might sound. In fact, it’s possible to put together your own small corpus in under an hour. The hard part is deciding what you want to find out, and therefore what kinds of texts to use and what searches to carry out. The following activity shows you how to build a corpus of your own texts.

Activity 6 A corpus of your own writing

Timing: Allow about 1 hour 30 minutes

This activity shows how you can compile a corpus of your own texts. These could be essays if you are a student, but you could of course choose to build a corpus of your letters, emails, blog posts or any other form of writing you have available in electronic form. Using your own texts means you don’t need to worry about copyright permission.

As a guide, begin with five or six texts of at least 100 words each, though note that it isn’t any harder to build the corpus if the texts are longer.

Here’s a step-by-step guide to creating and searching your own corpus:

  1. First you need to open each text and save it as a ‘plain text’ file. This is simply a format that corpus software can read. To do this on a PC using MS Word, click ‘File’, then ‘Save as’, and in the drop-down list next to ‘Save as type’ choose the option ‘plain text’ or ‘text’ (the file name will end in the extension *.txt).
  2. Save all the plain text documents in one folder and label this ‘my corpus’ (or something meaningful to you). Make sure you know how to navigate to the folder. You have now successfully built a corpus.
  3. You need software to search your corpus. You could use AntConc for this as it’s free and there is lots of helpful guidance on Laurence Anthony’s website. You could use your corpus to find out what words and phrases you often use.
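
If you are comfortable with a little programming, the idea behind step 3 can also be sketched in a few lines of Python. This is only an illustration of what a word-frequency search does, not a substitute for AntConc and not how AntConc itself works. The function name `word_frequencies` and the rough definition of a ‘word’ are assumptions made for this example, and the sketch also assumes your plain text files were saved with UTF-8 encoding.

```python
from collections import Counter
from pathlib import Path
import re


def word_frequencies(corpus_dir):
    """Count how often each word occurs across all .txt files in a folder.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for path in sorted(Path(corpus_dir).glob("*.txt")):
        # Assumption: files are UTF-8. errors="replace" lets the script carry on
        # if a file uses another encoding, at the cost of some odd characters.
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        # Rough definition of a word: a run of letters a-z, optionally joined
        # by straight or curly apostrophes (so "don't" counts as one word).
        counts.update(re.findall(r"[a-z]+(?:['’][a-z]+)*", text))
    return counts
```

Calling `word_frequencies("my corpus").most_common(20)` would return the twenty most frequent words with their counts, assuming the folder from step 2 sits in the directory you run the script from. The simple pattern ignores accented letters and digits, so a dedicated tool is the better choice for anything beyond a first look.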

If you’d like to go further in using your corpus, you could follow up with some of the resources in the ‘Further exploration’ section.