Imagine you are a doctor faced with a mystery illness.
Your patient’s symptoms are getting worse, and none of the usual treatments are working. Normally, you would identify the culprit by growing the pathogen in the lab and running standard tests. But not all pathogens can be grown this way, including the one causing this illness. You are running out of options.
This was the case for Ellie Irwin, who suffered from persistent inflammation and blurred vision in her right eye. Here is how a developing area of life sciences, called bioinformatics, helped save her sight using a technique called metagenomics, in which the genomes of more than one species are analysed at the same time.
What is bioinformatics?
Bioinformatics is the use of information technology to solve biological problems. One important branch focuses on analysing DNA sequences to understand which organisms are present in a sample and what they might be doing.
What is DNA and why does it matter?
DNA is the genetic material that acts like a blueprint for every living organism. Each species has its own characteristic DNA sequence, made up of four chemical “letters”: A, T, C and G.
Scientists can extract DNA from a sample such as blood, saliva or tissue and then “read” the sequence of these letters. This process is called DNA sequencing.

How long does DNA sequencing take?
Originally, DNA sequencing was expensive and slow: sequencing the first human genome took more than 10 years. Today, thanks to next-generation sequencing (NGS), it is possible to sequence an entire human genome in around a day. The fastest time on record, previously 5 hours and 2 minutes, was recently cut to 3 hours and 57 minutes, a new Guinness World Record™.
How does this help diagnose infections?
A patient’s sample contains DNA from the pathogen as well as from the patient. By comparing the DNA sequences in the sample with a database of known sequences, researchers can identify the exact species causing an infection. Infections can be caused by bacteria, viruses, protists or fungi.
The full set of DNA in an organism is called its genome, and this can be very large. To make identification faster, scientists often use a short stretch of DNA that is unique to each species. This is called a DNA barcode.
A useful way to think about this is to imagine items in a supermarket. Each item has a unique barcode that tells the till exactly what it is. In a similar way, each species has short stretches of DNA that work like a barcode. If we can read that DNA barcode, we can match it to a database and find out which species it belongs to.
This short region of DNA is often called a barcode region. It is usually very similar between individuals of the same species, but different enough between species to help identify them. Once this region has been sequenced, it is compared with a reference database of known barcodes which identify individual species. If there is a good match, we can say which species the DNA most likely came from.
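The matching step described above can be sketched in a few lines of Python. This is only an illustration: the three "species" and their barcode sequences are invented, the function names are hypothetical, and the 90% cut-off is an assumed value. Real tools also align sequences of different lengths, which this sketch does not attempt.

```python
# Toy reference database: species name -> barcode sequence.
# These sequences are invented for illustration; they are not real barcodes.
REFERENCE_BARCODES = {
    "Species A": "ATGCGTACGTTAGC",
    "Species B": "ATGCGTTCGATAGC",
    "Species C": "TTGCCTACGTAAGG",
}


def percent_identity(seq1, seq2):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq1) != len(seq2):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 100.0 * matches / len(seq1)


def identify(query, database, min_identity=90.0):
    """Return (species, identity) for the best match.

    The species is None when even the best match falls below min_identity
    (an assumed cut-off, chosen only for this example).
    """
    best_species, best_score = None, 0.0
    for species, barcode in database.items():
        score = percent_identity(query, barcode)
        if score > best_score:
            best_species, best_score = species, score
    if best_score < min_identity:
        return None, best_score
    return best_species, best_score


# A sequenced barcode that differs from Species A at a single letter.
sample = "ATGCGTACGTTAGG"
species, score = identify(sample, REFERENCE_BARCODES)
print(species, round(score, 1))
```

Here the sample matches Species A at 13 of its 14 letters, so it is reported as the most likely source. A sequence that resembles nothing in the database would come back with no species at all.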
How are bacterial species identified?
In bacteria, scientists use a gene called 16S rRNA (16S ribosomal RNA) as a species barcode. This gene is found in almost all bacteria. Parts of it are very similar across different bacteria, while other parts vary more. This mix of conserved and variable regions makes 16S rRNA an excellent barcode for identifying bacterial species.
The DNA sequences from this region are then compared with a database of known bacterial genomes. Tiny differences in the sequence act as a barcode that can be used to identify which bacteria are present in the patient sample.
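To see what "conserved" and "variable" mean in practice, the sketch below lines up three short sequences and marks each position as conserved (C) if every sequence has the same letter there, or variable (V) if they differ. The sequences are invented and already aligned; a real 16S rRNA gene is roughly 1,500 letters long, and the function name is hypothetical.

```python
# Toy, pre-aligned gene fragments. These are invented for illustration
# and are far shorter than a real 16S rRNA gene.
ALIGNED = {
    "Bacterium 1": "GGATCCTTACGA",
    "Bacterium 2": "GGATCCATGCGA",
    "Bacterium 3": "GGATCCTAGCGA",
}


def classify_columns(aligned):
    """Label each alignment position 'C' (conserved) or 'V' (variable)."""
    sequences = list(aligned.values())
    labels = []
    # zip(*sequences) walks through the alignment one position at a time.
    for column in zip(*sequences):
        labels.append("C" if len(set(column)) == 1 else "V")
    return "".join(labels)


for name, seq in ALIGNED.items():
    print(f"{name:12} {seq}")
print(f"{'Pattern':12} {classify_columns(ALIGNED)}")
```

In this toy alignment the first six and last three positions are identical in all three bacteria, while the three positions in between differ. Those differing positions are the ones that tell the bacteria apart.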

This approach, which looks at mixtures of DNA from many organisms at once, is known as metagenomics.
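A rough sketch of the idea: a sample produces many short DNA fragments ("reads") from different organisms, and each read is assigned to whichever reference sequence it shares the most short "words" of DNA (k-mers) with. Everything here is an assumption made for illustration: the reference sequences and reads are invented, the function names are hypothetical, and the word length and threshold are arbitrary. Real pipelines use far larger databases and more sophisticated matching.

```python
from collections import Counter

K = 4  # word (k-mer) length; kept small for this toy example

# Invented reference sequences standing in for whole genomes.
REFERENCES = {
    "patient": "ACGTTGCAAGCTTGCAACGT",
    "bacterium X": "GGCTAAGGCCTTAGGCCAAT",
}


def kmers(seq, k=K):
    """Return the set of all overlapping k-letter words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, references, min_shared=2):
    """Assign a read to the reference sharing the most k-mers with it."""
    read_kmers = kmers(read)
    best_label, best_shared = "unclassified", 0
    for label, ref in references.items():
        shared = len(read_kmers & kmers(ref))
        if shared > best_shared:
            best_label, best_shared = label, shared
    # Too few shared words means we cannot say where the read came from.
    return best_label if best_shared >= min_shared else "unclassified"


# A mixed "sample": reads from the patient, from the bacterium, and one unknown.
reads = ["TTGCAAGCTT", "AAGGCCTTAG", "GCAACGT", "TTTTTTTTTT"]
counts = Counter(classify_read(r, REFERENCES) for r in reads)
for label, n in counts.items():
    print(label, n)
```

Counting the reads per organism gives a simple picture of what is in the mixture. In a real sample most reads usually come from the patient, so finding the few that belong to a pathogen is the hard part.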
Metagenomics in action: solving a medical mystery
Metagenomics is especially useful when someone has clear signs of infection, but standard tests cannot find the cause. In these cases, metagenomics can act as a powerful catch-all test for bacterial DNA.
A BBC News story described how this approach was used in Ellie’s case. Ellie, a young doctor, had constant inflammation in her right eye that did not respond to normal treatment. Over time she developed a cataract and came close to losing her sight. Another doctor suggested sending a sample from her eye for metagenomic testing at Great Ormond Street Hospital, one of the few NHS centres equipped for this type of analysis.
By sequencing all the DNA in the eye sample and comparing it with a database, the team identified a rare strain of Leptospira, a bacterium usually found in parts of South America. When this result was combined with Ellie’s travel history, which included time in the Amazon region, doctors were able to prescribe the correct antibiotics. The infection cleared and her vision returned.

Why DNA databases matter
This kind of success is only possible because the Leptospira strain had already been detected in the wild, sequenced and added to a reference database. Metagenomics can only identify what is already in the database. If a completely new or very rare pathogen is present and its DNA is not yet recorded, the patient’s sequences will not find a good match and the infection may remain unexplained.
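The effect of a missing database entry can be shown with a small sketch. The same read is searched for twice: once before and once after a new reference sequence is added. The sequences, the function names and the 0.8 threshold are all invented for this example.

```python
def shared_fraction(read, reference, k=5):
    """Fraction of the read's k-letter words that also occur in the reference."""
    read_kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    if not read_kmers:  # read shorter than k letters
        return 0.0
    return len(read_kmers & ref_kmers) / len(read_kmers)


def best_match(read, database, threshold=0.8):
    """Return the best-matching database entry, or None if nothing is close."""
    scores = {name: shared_fraction(read, ref) for name, ref in database.items()}
    name = max(scores, key=scores.get)
    return name if scores[name] >= threshold else None


# Invented sequences: the database starts with a single known bacterium.
database = {"known bacterium": "ATGGCGTACCTGAAGTCCGA"}
read = "CCTAGGTTCAAGGT"  # a read from an organism not yet in the database

before = best_match(read, database)

# Someone sequences the organism and adds it to the database.
database["newly sequenced bacterium"] = "TTCCTAGGTTCAAGGTAC"
after = best_match(read, database)

print("before:", before)
print("after:", after)
```

Before the new entry is added the search returns nothing, even though the read is perfectly good data. Afterwards the same read is identified straight away.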
This is why collecting and sequencing DNA from as many species as possible, all over the world, is so important. Every new genome added to a database makes metagenomics a stronger tool for diagnosing infections and for understanding the living world.
Beyond hospitals: wider uses of metagenomics
Metagenomics is not only used in hospitals. It can also:
- monitor viruses in wastewater to track outbreaks
- survey animal and plant life in lakes, rivers and soils
- study the microbes that live on and in our bodies
Researchers have even used metagenomics to study the thanatomicrobiome, the community of microbes that live on a body after death.
If you would like to explore these ideas further, you can read the BBC article mentioned above and other OpenLearn resources on DNA, genomes and bioinformatics.