• Research type

    Research Database



  • Contact name

    David Snead

  • Contact email

  • Research summary

    Pathology image data Lake for Analytics, Knowledge and Exploration (PathLAKE)

  • REC name

    South Central - Oxford C Research Ethics Committee

  • REC reference


  • Date of REC Opinion

    16 Oct 2019

  • REC opinion

    Further Information Favourable Opinion

  • Data collection arrangements

    Pathologists (doctors who diagnose disease by studying tissue samples) use a microscope to examine tissue samples collected from patients. This is called light microscopy. It enables them to make a diagnosis and to give clinicians (doctors) information on treatment and prognosis. Digital pathology is the process of scanning microscope slides into computer image files (whole slide images), which the pathologist can then examine on a computer screen.

    PathLAKE aims to build a research database or 'data lake' to store whole slide images of patients' tissue samples. These samples will have been taken for clinical purposes at the collaborating NHS hospitals. Pathologists will annotate the images to mark up the areas of interest and, where possible, the images will be linked to the patients' pseudonymised clinical data.

    At NHS sites where the NHS National Data Opt-Out programme is being implemented, all cases with whole slide images of tissue samples will be added to the data lake unless patients have specifically opted out. At sites where the National Data Opt-Out is unavailable, whole slide images and linked pseudonymised clinical data will be added to the data lake following local practice for regulatory approval.

  • Research programme

    To develop computer-assisted diagnosis, research teams need access to large numbers of whole slide images of pathology cases, along with clinical and pathology data, the clinical outcome and ground truth data. Access to high-quality data at such a scale is essential for the application of Artificial Intelligence (AI) in pathology. The PathLAKE data lake is set up to acquire this data and to provide pathologist expertise to deliver ground truth data, which is vital for training machine learning algorithms efficiently. Pseudonymised clinical metadata, with longitudinal updates from some NHS sites, will greatly add to the value of the database. High-quality databases of this type are not available in the UK. PathLAKE will become one of the largest repositories of annotated whole slide images in the world, if not the largest. Access will be provided to UK SMEs to help them develop innovative healthcare solutions, and also to non-commercial academic, NHS and research groups nationally and globally. Access to this unique resource will drive AI innovation in the UK and ensure that the UK is in a prime position to leverage the full value of NHS pathology data to drive economic growth in health-related AI.

  • Research database title

    Pathology image data Lake for Analytics, Knowledge and Exploration (PathLAKE)

  • Establishment organisation

    University Hospitals Coventry & Warwickshire NHS Trust

  • Establishment organisation address

    Clifford Bridge Road


    West Midlands

    CV2 2DX