Watch back – #EvaluateDigiHealth Webinar 2021: How to ensure quality evidence generation for data-driven technologies?

The exponential growth of digital health products and therapeutics is engendering novel ways of producing evidence using rich sources of data.

The third webinar in the #EvaluateDigiHealth 2021 series explored evidence generation for data-driven technologies in health care, discussing with leaders in this field how companies can work with health care providers in sustainable ways to evaluate and monitor digital health products, including those leveraging artificial intelligence.

Find out more about the series and sign up for upcoming webinars here.


Jean Ledger, Research and Evaluation Lead – Digital First Primary Care, NHS England and Research Fellow, UCL


  • Jessica Morley, Policy Lead Evidence-Based Medicine DataLab, University of Oxford
  • Kassandra Karpathakis, Head of AI Strategy at the NHS AI Lab, NHSX
  • Carmelo Velardo, Senior Research Fellow, Sensyne Health

We’ve summarised the key themes and questions from the discussion:

Using clinical data for real world evaluations

Jessica is part of the team that created OpenSAFELY, a new secure analytics platform for electronic health records in the NHS, built to deliver urgent results during the Covid-19 emergency. The platform covers 95% of all patient records because it accesses data from the TPP and EMIS systems, which are used by 95% of GP practices. The software is fully transparent, and anyone can use the information. This allows for real-time, real world evaluation. Data sets that need to be extracted often require many approvals and the physical transfer of data, which can take from six months to over two years, by which time the data is out of date. OpenSAFELY, however, has a maximum delay of one week because the code is sent to the data rather than the data being brought to the code.

Carmelo, who has a background in both academia and industry, highlighted that the ability to access data is quite different across the two sectors. The University of Oxford, for example, had a very strong relationship with the nearby Trusts in terms of data sharing. He shared three main lessons from working at a digital health company and accessing data:

  • Healthcare systems are quite willing to collaborate with industry and researchers but the benefits to the public and the system should be clear
  • Security of data is of paramount importance
  • The health sector is complex, especially when talking about evaluation. You need to have a deep understanding of the problems you are trying to solve and the environment

Sensyne does this through strategic research agreements with NHS trusts and partners. This involves establishing a clear agreement through which data can be shared safely and securely. All data is deidentified and anonymised to the highest standards.

Kassandra shared the work that NHSX is doing to promote practical approaches to real world evaluation. This includes the AI in Health and Care Award, which helps evaluate technologies that meet the needs of the NHS Long Term Plan.

On the outstanding question of whether the NICE Evidence Standards Framework applies to AI, Kassandra shared that an exercise has commenced to look at this. She and Jessica have been conducting an extensive literature review to understand what the standard practice for algorithm activity is and what a practical approach would be. They will have something to share from this exercise in the next few months.

Data quality

On the topic of data quality in AI and machine learning, Jessica pointed out that health data is in fact an administrative data set. It was created as detailed notes for a doctor and wasn’t designed for building these complex systems. However, the more the records are used, the better the quality gets. She shared an example of AI being used to look at vaccine uptake in certain populations, where the results revealed that multiple men in their 70s were listed as pregnant. This was due to similar codes being used incorrectly. Another example of difficulties with data quality was that people are often coded as pregnant in the system but not coded as having had the baby, which means that a woman’s records might say she has been pregnant for two or three years. Jessica concluded that problems arise when working with subsets of data, and that relying on synthetic data, a modelled version of the real data, is a way to evaluate an algorithm without using the data itself.

Carmelo noted that another important aspect of data quality is access to clinical expertise, whether through a company’s clinical team or through partnerships with academic institutions, for example. Collaborating with clinicians helps to identify incorrect codes such as those described above, and clinicians can investigate and debug a system.
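The coding errors described in this section (pregnancy codes on male records, and pregnancies that are never closed) can be illustrated with a simple plausibility check. The following is only a minimal sketch and not a method presented in the webinar or used by OpenSAFELY or Sensyne: the record fields, the `find_implausible` function and the 310-day threshold are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Assumed threshold for illustration only: an open pregnancy episode
# older than this is flagged for review.
MAX_PREGNANCY_DAYS = 310


@dataclass
class PatientRecord:
    """Hypothetical, simplified record; real GP records are far richer."""
    patient_id: str
    sex: str  # "M" or "F" in this sketch
    age: int
    pregnancy_start: Optional[date] = None
    pregnancy_end: Optional[date] = None


def find_implausible(records: List[PatientRecord], today: date) -> List[Tuple[str, str]]:
    """Return (patient_id, reason) pairs for records that look miscoded."""
    flags = []
    for r in records:
        if r.pregnancy_start is None:
            continue  # no pregnancy code recorded, nothing to check
        if r.sex == "M":
            flags.append((r.patient_id, "pregnancy code on male record"))
        elif r.pregnancy_end is None:
            open_days = (today - r.pregnancy_start).days
            if open_days > MAX_PREGNANCY_DAYS:
                flags.append((r.patient_id, "pregnancy episode never closed"))
    return flags


records = [
    PatientRecord("p1", "M", 74, pregnancy_start=date(2021, 1, 1)),
    PatientRecord("p2", "F", 30, pregnancy_start=date(2019, 1, 1)),
    PatientRecord("p3", "F", 28, pregnancy_start=date(2021, 3, 1)),
]
flagged = find_implausible(records, today=date(2021, 6, 1))
for patient_id, reason in flagged:
    print(patient_id, reason)
```

A check like this only surfaces candidates for review. Deciding whether a flagged code is actually wrong still relies on the kind of clinical expertise Carmelo describes.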

Advice for SMEs

Kassandra noted that a lot of people are missing the basics when it comes to evidence generation and advised that they should start at the beginning with the NICE Evidence Standards Framework. Then they can understand what they want to achieve and speak to academics. She also shared that NHSX has set up an AI Lab Virtual Hub as a community space for people to interact and share knowledge and ideas about AI technology in health and social care.

Jessica suggested that companies look to other companies that do evaluation well, for example Sleepio (who featured in #EvaluateDigiHealth 2021 webinar 1). She advised companies to think of evaluation as an extended version of user testing and to build it into their normal workflow. She also recommended thinking carefully about marketing claims, especially when claiming to be interventional.

Carmelo shared an example in the form of their GDm-Health app. He explained that expanding the app from a single research project was difficult, as they needed to move beyond the remit of a clinical trial. He flagged that help from AHSNs provided them with lots of information and an understanding of how the NHS works. He also highlighted accelerator programmes such as DigitalHealth.London’s Accelerator as beneficial for companies looking to scale.

AI for clinicians

The audience noted that while the NICE framework does not cover AI, it does say that tools aimed at doctors are the lowest risk, and questioned whether this is true for AI-assisted diagnosis or prognostic predictions.

Jessica felt that these weren’t the lowest risk tools available, despite the mitigating factor of a human being involved. She noted that clinicians could experience automation bias, where they no longer question computers. Fatigue is another factor, as these solutions are often deployed through pop-ups in clinical systems. She highlighted the importance of considering the interaction element when designing a solution.