Applying Natural Language Processing to real-world patient data

  • Research type

    Research Study

  • Full title

    Applying Natural Language Processing to real-world patient data

  • IRAS ID

    328378

  • Contact name

    Gareth Price

  • Contact email

    gareth.price@manchester.ac.uk

  • Sponsor organisation

    University of Manchester

  • Duration of Study in the UK

    3 years, 3 months, 31 days

  • Research summary

    Approximately 400,000 cancer cases are diagnosed annually, resulting in over 167,000 deaths. Socioeconomic factors strongly influence incidence and mortality, with around 19,000 extra cancer deaths attributed to deprivation. Clinical decisions for cancer treatment rely on evidence primarily derived from clinical trials. However, only a small fraction of patients participate in these studies, and many patient groups, such as the frail, those with multiple medical problems, and ethnic minorities, are under-represented. As a result, there are large sections of the population to whom the available evidence might not apply, perpetuating health inequalities. Routine ‘real-world’ patient data, collected as part of routine treatment, offer an opportunity to provide evidence where clinical trial data do not or will not exist.

    Artificial Intelligence approaches using real-world data can analyse patterns in cancer diagnosis and treatment, but they need the data to be structured (coded) before they can be processed. Modern Electronic Healthcare Records (EHRs) can collect data in the required format. However, historical data and many current sources of patient information (e.g. out-patient letters, radiology reports) often exist only as free-text medical notes, and therefore need to be coded, i.e. structured, first.

    The vision of this project is to develop the tools needed to harness the potential of these unstructured data to improve and personalise care for all cancer patients. In this non-interventional project, we will develop Natural Language Processing (NLP) technologies and apply them to the medical notes of Head and Neck and Lung Cancer patients treated at The Christie (2014-present) to recover structured data. We will then use these data to validate and improve models that predict cancer patients’ clinical outcomes. We will also assess whether patients’ experience of cancer treatment agrees with clinical assessments of their outcome, and whether it might provide early warning of evolving treatment-related toxicity in head and neck cancer.

  • REC name

    North West - Haydock Research Ethics Committee

  • REC reference

    24/NW/0152

  • Date of REC Opinion

    28 Jun 2024

  • REC opinion

    Further Information Favourable Opinion