How Data Is Informing The Development Of Solutions To The COVID-19 Pandemic


The fight to slow the coronavirus pandemic is underpinned by a range of scientific disciplines, including mathematics, biostatistics and data science, as well as immunology, epidemiology and a variety of others. It is this scientific rigour that helps our key decision makers determine the best course of action in such difficult and uncertain times.

We are all now acutely aware that data is being used in the debate and action on COVID-19. A range of data sets is available that track the number of cases, deaths and recoveries from the virus. One of the most popular is the Johns Hopkins University global coronavirus resource centre, where anyone can access historical data for locations around the globe. These data sets can be used in a number of ways to provide more information on the pandemic. For example, data can be used to predict the trajectory of the spread and mortality rates of COVID-19 across the globe. Simulation models built on this data can also factor in government interventions, such as social distancing and shutdowns, to predict how these measures affect the transmission of the virus over time. Maths educator Grant Sanderson (creator of 3Blue1Brown) has posted a great video showing how epidemics can be simulated, indicating how they may spread and what impact various measures may have on controlling the disease. Further, work done by the University of Sydney has modelled the impact of various intervention strategies, with varying levels of compliance, on the spread of disease. Notably, school closures were not found to bring decisive benefits.

Video: 3Blue1Brown, "Simulating an Epidemic", YouTube
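To make the idea of simulating an epidemic with and without interventions concrete, here is a minimal sketch of a textbook SIR (susceptible, infected, recovered) model. It is not the method used in the video or in the University of Sydney work, which use more detailed simulations. The function names and every parameter value (population size, transmission rate `beta`, recovery rate `gamma`, the day distancing starts and the size of the contact reduction) are assumptions chosen purely for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days,
                 distancing_start=None, contact_reduction=0.0):
    """Discrete-time SIR model with one-day steps.

    beta: average number of infection-causing contacts per infected person per day.
    gamma: fraction of infected people who recover each day.
    distancing_start: day on which distancing begins (None means no intervention).
    contact_reduction: fraction by which distancing cuts beta (0.5 means halved).
    All values passed in below are illustrative assumptions, not fitted estimates.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for day in range(days):
        effective_beta = beta
        if distancing_start is not None and day >= distancing_start:
            effective_beta = beta * (1.0 - contact_reduction)
        # New infections depend on how many infected meet how many susceptible.
        new_infections = min(effective_beta * s * i / population, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time."""
    return max(i for _, i, _ in history)


# Hypothetical scenario: one million people, ten initial cases.
baseline = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=300)
distanced = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=300,
                         distancing_start=30, contact_reduction=0.5)

print(f"Peak infected without distancing: {peak_infected(baseline):,.0f}")
print(f"Peak infected with distancing:    {peak_infected(distanced):,.0f}")
```

Even in this toy model, reducing contacts lowers the peak number of people infected at once, which is the "flattening the curve" effect. Real models add age structure, geography, compliance levels and uncertainty, so the numbers printed here should not be read as predictions.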

Predicting Spread and Risk

Computational simulations using existing data sets allow scientists to estimate what the exponential growth of the virus may look like. Because testing is limited and many cases go undetected, the total number of cases is difficult to know. Associate Professor Ben Phillips from the University of Melbourne is using data from the Johns Hopkins site to predict what that total is likely to be. This work estimates that the actual number of cases could be as much as four times higher than the number reported through testing. Understanding the full scale of the problem is important for informing our decision makers about the true extent and effect of the virus. It must be noted, however, that modelling such as this rests on a large number of assumptions. For example, the modelling used by A/Prof Phillips assumes widespread community transmission, so his prediction, made on 22 March, for the 1 April numbers in Australia is much higher than what is actually being reported today. This is likely due to the social distancing measures taken and the resulting reduction in community transmission.
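As a rough illustration of how a growth rate can be estimated from case counts and then projected forward, the sketch below fits a straight line to the logarithm of cumulative cases. This is a generic log-linear fit, not A/Prof Phillips's model. The case counts are synthetic (generated from a formula, not taken from the Johns Hopkins data), the 14-day horizon is arbitrary, and the multiplier of four is simply the upper bound quoted above.

```python
import math

# Synthetic cumulative counts generated from 100 * 1.2**day. NOT real data.
reported = [100 * 1.2 ** day for day in range(10)]


def fit_exponential_growth(counts):
    """Least-squares fit of log(count) against day index.

    Returns (initial_count, daily_growth_factor) for the model
    count(day) = initial_count * daily_growth_factor ** day.
    """
    n = len(counts)
    xs = range(n)
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), math.exp(slope)


initial, daily_factor = fit_exponential_growth(reported)
doubling_time = math.log(2) / math.log(daily_factor)

# Assumption: undetected cases inflate the true total by up to this factor.
UNDERCOUNT_MULTIPLIER = 4
projected_reported = initial * daily_factor ** 14
projected_upper = projected_reported * UNDERCOUNT_MULTIPLIER

print(f"Daily growth factor: {daily_factor:.2f}")
print(f"Doubling time: {doubling_time:.1f} days")
print(f"Projected reported cases on day 14: {projected_reported:,.0f}")
print(f"Upper estimate including undetected cases: {projected_upper:,.0f}")
```

A projection like this assumes the growth rate stays constant. As noted above, once social distancing reduces community transmission, an unchecked exponential projection will overshoot the numbers actually reported.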

AI is also being used to predict where an outbreak of the virus may occur. By analysing millions of text-based articles on the virus, from news reports to social media posts, data scientists can predict the locations of outbreaks. Canadian startup BlueDot has proprietary algorithms that allow it to track and anticipate infectious disease risk using over 100 different datasets, including global air travel and disease surveillance. However, in using AI across these datasets, it is important to account for variations in the data being used. This was shown to be a key downfall of Google's Flu Trends initiative, which stopped publishing estimates in 2015.

Data Science as a Treatment Tool

Data science is not only proving useful in modelling the trajectory of the virus and potential outbreaks; it is also fast becoming a valuable tool in developing treatments and cures. To assist in the fight against COVID-19, the White House has released a Call to Action to the “Nation’s artificial intelligence experts to develop new text and data mining techniques that can help the science community answer high-priority scientific questions related to COVID-19”. The data science community now has access to an open, machine-readable data set of scholarly literature about COVID-19. The hope is that data scientists can apply techniques such as artificial intelligence (AI), machine learning (ML) and natural language processing (NLP) to this data to generate insights on the virus and help scientists answer a range of challenging questions. Having access to these open data sets allows data scientists to apply innovative techniques to analyse data in a way that humans could never do, especially at the speed needed to halt this pandemic.

Data science is also helping to accelerate the development of treatments for COVID-19. The US company EVQLV, a startup creating “algorithms capable of computationally generating, screening and optimizing hundreds of millions of therapeutic antibodies”, is using ML to rapidly screen a vast array of therapeutic antibodies to determine which ones have a high probability of success. Using ML in this way significantly fast-tracks the drug selection process, meaning clinical trials on suitable drug candidates can start sooner.

On the Home Front

Back in Australia, our policy makers are considering how data can be used to assist in Australia’s fight against the virus. For the past three weeks, Australia’s Chief Scientist, Dr Alan Finkel, has been involved in weekly meetings with international science leaders to share ideas and determine areas of focus. Through these discussions, “[t]he dialogue has also provided Australia with the opportunity to connect to modellers in the US studying intensive care unit needs which will assist our planning for the coming weeks and months.” The focus this week has also echoed the US’s Call to Action for AI researchers. In response, the Australian Government has increased its spending on data analytics with Quantium Health from $1M to $5M.

Researchers and companies across Australia are now considering how they can use their data science expertise to help in the fight against COVID-19. Some are turning their attention to the health implications for populations, not only of contracting the virus but also of the measures put in place to curb its spread. Curtin University’s CIVIC study is collecting data to investigate the cardiovascular and health impact of COVID-19 exposure in the community, including the impact of social isolation.

In finding solutions to address the COVID-19 pandemic, we must ensure we come together as a science community and collaborate, bringing together our collective expertise.

If you have an idea of how you would like to contribute data science expertise to the fight against COVID-19, please get in touch with us here at the WA Data Science Innovation Hub and we’ll be happy to provide advice and connections in the WA community.

Stay safe, everyone, and keep your distance – the science tells us it will work!

Dr Liz Dallimore