
The research and modelling of COVID-19

Updated Monday, 17th May 2021

Dr Kaustubh Adhikari, Lecturer in Statistics at The Open University, interviews Dr Sayantan Banerjee, a statistician from IIM-Indore.

Dr Sayantan Banerjee is currently working as an Assistant Professor in the Operations Management & Quantitative Techniques Area at the Indian Institute of Management, Indore (IIM-Indore).

He has a PhD in Statistics from North Carolina State University and worked as a post-doctoral fellow at the University of Texas MD Anderson Cancer Center before joining IIM-Indore.

His primary research interests include Bayesian inference, high-dimensional models and graphical models. 


Kaustubh: Can you tell us briefly how you are modelling the COVID-19 epidemic?

Sayantan: We use a network-based Susceptible-Infected-Recovered (Network SIR) model for the forecasting procedure. Such models are being widely used in forecasting for COVID-19 in all countries, including the UK, although the specific parameters would change depending on the country’s situation.

We first create a national-level network of the infected patients, with the patients as nodes, and two nodes being connected if their locations of detection are within 3 degrees of one another in both latitude and longitude. Then, for each state (an administrative division of India, similar to a county in the UK), we extract the subset of this network falling in that state (determined by the location of detection of the patients), and that becomes the initial infected network of the state.
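To make the linking rule concrete, here is a minimal Python sketch of it. It is an illustration under stated assumptions, not the team's actual code: the function names, the patient identifiers and the coordinates are all hypothetical, and treating "within 3 degrees" as an inclusive bound is our own reading.

```python
from itertools import combinations


def build_proximity_network(patients, max_deg=3.0):
    """Link two patients when their detection locations differ by at most
    max_deg degrees in BOTH latitude and longitude.

    patients maps a patient id to a (latitude, longitude) pair.
    Returns an undirected network as a dict of neighbour sets.
    """
    adjacency = {pid: set() for pid in patients}
    for (a, (lat_a, lon_a)), (b, (lat_b, lon_b)) in combinations(patients.items(), 2):
        # Assumption: the 3-degree bound is inclusive.
        if abs(lat_a - lat_b) <= max_deg and abs(lon_a - lon_b) <= max_deg:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def state_subnetwork(adjacency, state_of, state):
    """Keep only the patients detected in `state` and the links among them."""
    members = {pid for pid in adjacency if state_of[pid] == state}
    return {pid: adjacency[pid] & members for pid in members}


# Hypothetical patients, for illustration only: id -> (latitude, longitude).
patients = {"P1": (22.7, 75.9), "P2": (23.3, 77.4), "P3": (10.0, 76.3)}
state_of = {"P1": "Madhya Pradesh", "P2": "Madhya Pradesh", "P3": "Kerala"}

network = build_proximity_network(patients)
mp_network = state_subnetwork(network, state_of, "Madhya Pradesh")
```

Comparing every pair of patients is fine for a small example, but a real national dataset would probably call for a spatial index or grid bucketing instead.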

The complete initial network is constructed by adding additional susceptible nodes so that the number of nodes matches the population of the state. We then simulate the growth of this network under specified parameters.
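The padding and simulation steps can be sketched in Python as follows. This is a toy version under our own assumptions, since the interview does not say how susceptible nodes are wired into the network or which parameter values are used: here each new susceptible node is attached to a few randomly chosen existing nodes, and the per-day infection and recovery probabilities (`beta`, `gamma`) are placeholder values. The population is also tiny compared with a real state.

```python
import random


def add_susceptibles(infected_net, population, links_per_node, rng):
    """Pad the infected network with susceptible nodes up to `population`.

    Assumption: each new node links to `links_per_node` random existing nodes.
    """
    net = {node: set(nbrs) for node, nbrs in infected_net.items()}
    status = {node: "I" for node in net}
    for k in range(population - len(net)):
        node = f"S{k}"
        existing = list(net)
        targets = rng.sample(existing, min(links_per_node, len(existing)))
        net[node] = set(targets)
        for t in targets:
            net[t].add(node)
        status[node] = "S"
    return net, status


def sir_step(net, status, beta, gamma, rng):
    """One day of a discrete-time network SIR process."""
    new_status = dict(status)
    for node, state in status.items():
        if state != "I":
            continue
        # Each infected node may infect each susceptible neighbour.
        for nbr in net[node]:
            if status[nbr] == "S" and rng.random() < beta:
                new_status[nbr] = "I"
        # The infected node itself may recover.
        if rng.random() < gamma:
            new_status[node] = "R"
    return new_status


rng = random.Random(0)  # fixed seed so the run is repeatable
initial_infected = {"P1": {"P2"}, "P2": {"P1"}}  # hypothetical state network
net, status = add_susceptibles(initial_infected, population=200,
                               links_per_node=3, rng=rng)

infected_counts = []
for day in range(30):
    status = sir_step(net, status, beta=0.1, gamma=0.1, rng=rng)
    infected_counts.append(sum(s == "I" for s in status.values()))
```

Because the process is random, a forecast would normally average over many simulated runs rather than rely on a single one.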

We also look at the healthcare capacity in India across different states and analyse the overall availability of hospital beds in India. The levels of healthcare resources required will vary based on the different clinical outcomes of the infection, and the existing availability of healthcare infrastructure.

We capture these aspects using a compartmental epidemiological model based on the classical Susceptible-Exposed-Infected-Recovered (SEIR) model, which describes the spread and clinical progression of COVID-19. The infection levels are broadly categorised as mild, severe and critical, and we focus on the medium intensity spread scenario with the reproduction number (R0, pronounced as R-nought) equal to 2 under intervention.
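For readers who want to see what such a compartmental model looks like, below is a minimal sketch of the classical SEIR equations in Python, integrated with simple Euler steps. Only R0 = 2 comes from the interview. The 5-day incubation and infectious periods, the population size and the function name are assumptions chosen for illustration, and the split of infections into mild, severe and critical is left out.

```python
def simulate_seir(population, initial_infected, r0,
                  incubation_days, infectious_days, days, dt=0.1):
    """Classical SEIR model, integrated with fixed Euler steps of size dt.

    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    sigma = 1.0 / incubation_days   # rate of moving from exposed to infectious
    gamma = 1.0 / infectious_days   # rate of recovery
    beta = r0 * gamma               # transmission rate implied by R0

    s = float(population - initial_infected)
    e, i, r = 0.0, float(initial_infected), 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)

    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Assumed inputs: 1 million people, 10 initial cases, 5-day periods.
history = simulate_seir(population=1_000_000, initial_infected=10, r0=2.0,
                        incubation_days=5, infectious_days=5, days=300)

# Day on which the number of currently infectious people is highest.
peak_day = max(range(len(history)), key=lambda d: history[d][2])
peak_infectious = history[peak_day][2]
```

Multiplying the infectious count by assumed shares of severe and critical cases would give a rough estimate of bed demand to compare against capacity, which is the kind of comparison described above.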

Kaustubh: What are the main observations and recommendations from your model?

Sayantan: From the model forecasts, we can make predictions about how the COVID-19 situation is likely to play out in India. As before, forecasts are likely to be similar in other countries as well, while the time of the peak and its extent will depend on each country’s situation.

We expect that most of the states will see a significant rise in the number of cases, starting around mid-May. Clear hotspots are emerging in the first cluster of states, with a second cluster emerging close behind. While a number of these initial hotspots could be traced to international travel and ports of entry, some newly formed hotspots have emerged which can potentially be attributed to hubs of contact networks formed through travel and public gatherings.

Over the last few days, however, the state of Kerala has been reporting a significant drop in infections along with a higher recovery rate, giving initial signals of success in bringing the reproduction number well below the national average.

Our results indicate that there is an immediate need for the administrators to mobilise resources and infrastructure in hotspot areas and acquire the appropriate number of hospital beds (permanent or makeshift), ventilators, personal protective equipment, and the accompanying personnel to support the huge surge which lies ahead.

We will need to continue to leverage the massive mobile phone network to ensure that reminders about social distancing, hand-washing and hygiene are sent regularly. Next, we have to ensure that testing is strategic (it is not possible to test everyone), contact tracing is immediate (once a person tests positive, it should be made relatively easy to trace previous contacts and isolate them immediately), and quarantine is smart (the right level of quarantine based on the level of risk).

The time is right to take such decisions while the extended lockdown is in effect, and combined with other mitigation strategies, India can significantly flatten the curve.

Kaustubh: What are the various kinds of data that are used in such models?

Sayantan: We use publicly available patient-level data collated from trusted sources including the Ministry of Health and Family Welfare. The data includes patient details like age and gender as well as the geographic location, which helps in constructing the network.

We also obtain data about the hospital beds across the country from the Ministry. Additionally, we obtain the population for each state from the Open Government Data Platform of India.

Kaustubh: Is there something unique about the COVID-19 epidemic, in mathematical terms, which makes it so destructive and widespread compared to earlier viral epidemics such as SARS or Ebola?

Sayantan: Mathematically speaking, the transmission rate of COVID-19 seems to be higher than that of other viral epidemics. Possible explanations provided by biologists concern the ease of spread (through airborne droplets or surfaces) and the nature of the viral load in infected patients, which means that transmission starts even before symptoms develop. Recent reports by the U.S. Centers for Disease Control and Prevention (CDC) suggest that COVID-19 can be spread even by infected people showing no symptoms. That’s why COVID-19 has been so widespread, although it has a lower fatality rate than SARS.

Kaustubh: Many different teams of scientists are trying to mathematically model this epidemic. How do the various approaches differ? To what extent do their findings vary? 

Sayantan: There are indeed plenty of mathematical models around this epidemic, mostly based on standard compartmental models for such diseases. These include the SIR and SEIR models, to name a few. Some have also considered more detailed compartmental models that include various stages of the infection and the related transmission rates, along with rates for asymptomatic transmission.

Our method uses a network-based SIR approach so that the population density of a region or state is taken into account, thus enabling us to make state-wide predictions as well.

Some models that have included the role of interventions (like lockdown, quarantine, testing, isolation) explicitly in their modelling have come up with various scenarios under these interventions and their likely impact on the number of infections.

However, for all these models, we have to understand that the uncertainties are extremely high, primarily because we do not know the actual number of infected people, a problem compounded by the fact that this infection can spread via asymptomatic individuals as well.

When applied to India, most of the findings from these models predict a surge in cases in the coming days.

Kaustubh: How important is the reliability of data? There has been substantial criticism of the official reporting of numbers in various countries. Some countries have a severe lack of testing kits so deaths are being attributed to other reasons such as pneumonia. How do such issues with data affect modelling?

Sayantan: The quality and reliability of the available data play a huge and definitive role in any kind of modelling.

The response of various states of India has been different in this context, with some states not reporting the location of the patients (for example, West Bengal), thus making the job more challenging.

To construct the underlying network for the state, and enable contact tracing, it is of supreme importance to have the location information. This also helps in identifying hotspots and coming up with effective mitigation plans.

Kaustubh: You are working on modelling this epidemic in India. Are there particular challenges when working in a developing country?

Sayantan: Though India is a developing country, the sheer size of its population is both a bane and a boon.

While the high population density has the disadvantage of making contact tracing and isolation difficult, India is leveraging its young population and strong information network to fight the disease with grit. India has taken several proactive measures in a bid to mitigate the crisis, on both the healthcare and economic fronts. The local and central administrations are taking regular advice from experts in various fields as well.

The containment plan chalked out by the Ministry deserves much applause and will strive to ward off this crisis.

• Read our interview with science integrity expert Dr Elisabeth Bik, where she shares her experiences and concerns around the fake news and fake research around COVID-19.
• Watch a thorough discussion and Q&A session on COVID-19 with experts from the OU STEM faculty.
• If you are interested in the numbers around COVID-19, such as the meaning of the various terms, the modelling that's being used to obtain estimates, or the reliability of the statistics, then head over to ‘A statistician’s guide to coronavirus numbers’ by the Royal Statistical Society.





