Horizon CDT Research Highlights

Diagnosing Disease with Shopping Data

  Elizabeth Dolan (2019 cohort)

The aim:

To create a framework for “Personal Data Donation” by investigating the issues surrounding individuals “donating” personal transactional data to public health research projects.

The question:

How can personal transactional data be collected and analysed for the purposes of health research in a way that is acceptable to society, works for infectious and chronic disease, and can be successfully implemented in a clinical setting?


This PhD is connected to a wider project by partners ALSPAC at Bristol University (2020) and the Alan Turing Institute (2020): “donating personal transactional data for research: investigating the public acceptability of using commercial transactional data in public health research”.

Personal commercial transactional data is the information stored when an exchange occurs between an individual and a business, including customer shopping data. This research will connect loyalty card data (customer shopping information held by a retailer) to Covid-19 incidents and to information from women with ovarian cancer. The linked datasets will be used to investigate whether shopping data can help women with ovarian cancer get diagnosed earlier, and/or help inform public health decisions in a pandemic.

A series of studies will iteratively develop machine learning (ML) models (a method of programming computers to learn from data) whose predictions could support the earlier diagnosis of ovarian cancer and/or the understanding of influenza-like illness (ILI) outbreaks.

The methodology is mixed methods: both qualitative and quantitative data will be collected and analysed for integrated interpretation. The studies will inform the models' schema creation and feature engineering, and will be used to understand and validate the models' outputs and any interpretations made from them. The iterative design will allow the models to be adjusted for successful implementation in a clinical setting.

The survival rate for ovarian cancer is low. With no UK national screening programme, women are predominantly diagnosed in the late stages (Cancer Research UK 2020). The world is also currently experiencing a pandemic of Covid-19 (WHO 2020). Creating a framework tool from this research will help medical researchers assess, and access, the potential of using shopping data to investigate disease.


Alan Turing Institute. 2020 [Viewed 26 February 2020] Available from: https://www.turing.ac.uk/research/research-projects/donating-personal-transactional-data-research

ALSPAC (Avon Longitudinal Study of Parents and Children) at Bristol University. 2020 [Viewed 26 February 2020] Available from: www.bristol.ac.uk/alspac/

Cancer Research UK. 2020 [Viewed 26 February 2020] Available from: https://www.cancerresearchuk.org/about-cancer/ovarian-cancer

WHO (World Health Organization). Coronavirus disease (COVID-19) pandemic. [Online] 2020 [Viewed 16 October 2020] Available from: https://www.who.int/emergencies/diseases/novel-coronavirus-2019

This author is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (UKRI Grant No. EP/S023305/1).