The power of machine learning is yet to be fully explored within the life sciences industry, and there remains a question of what impact it could have in clinical trials. In this Industry Viewpoint, CTA staff writer Alexandra Bamgboye sits down with University College London’s Dr Parashkev Nachev, a professor from the school’s Institute of Neurology. Here, he explains how the industry could best leverage machine learning to its benefit.

Alexandra Bamgboye: When it comes to patient identification for clinical trials, what has the industry been doing wrong?

Parashkev Nachev: We have been trying to describe individuals based on only a small number of biological features. This has been driven partly by the statistical difficulty of handling many features, and partly by our intellectual attachment to simplicity. It does not work because most of biology is irreducibly complex, so greatly simplified descriptions have little individuating power. It is like expecting to tell people’s faces apart from pictures only a few pixels in size – it cannot possibly work well.

The current approach to dealing with complexity in clinical trials is to recruit samples large and diverse enough to make the complexity resemble random noise. But that does not solve the problem; it merely conceals it from view. The result may still be highly biased, and sensitivity for real therapeutic effects will be poor where response to treatment varies substantially across individuals, as it does in focal brain injury, for example.

AB: How can machine learning overcome this?

PN: Most people think of machine learning as a tool for doing mechanically what human beings find easy to do naturally, e.g. identifying photographs, transcribing spoken language and so forth. But this merely happens to be the currently dominant area of application. At its core, machine learning is really about capturing complexity in formal mathematical terms, thereby rendering it actionable in a principled, objectively grounded way.

In the context of clinical trials, it allows us to model complex relations between various pathological and physiological parameters and clinical endpoints, so that we can better isolate the specific effect of the intervention. For example, where there is a complex relation between the anatomical pattern of brain damage in stroke and clinical recovery, predicting the natural outcome in the absence of any treatment identifies patients who would have gotten better regardless of treatment and those too severely affected for any treatment to work. Without such prediction, and in inverse proportion to its accuracy, such cases will obscure the actual effect of treatment, making the trial much less sensitive. This is what our latest study demonstrates empirically.
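The sensitivity argument above can be illustrated with a small simulation. This is an editorial sketch, not code from the study Dr Nachev refers to. It assumes a continuous outcome score, a "natural" outcome that varies widely between patients, and a hypothetical predictive model whose error is controlled by `prediction_noise`. All names and numbers are invented for illustration.

```python
import random
import statistics


def standard_error(treated, control):
    """Standard error of the difference in mean outcome between two arms."""
    return (statistics.variance(treated) / len(treated)
            + statistics.variance(control) / len(control)) ** 0.5


def compare_sensitivity(n_patients=2000, true_effect=0.3,
                        prediction_noise=0.5, seed=1):
    """Simulate one randomised trial, analysed with and without a prognostic model.

    Assumed toy model (not taken from the interview):
      natural outcome ~ Normal(0, 3)      outcome with no treatment
      predicted = natural + model error   imperfect outcome prediction
      observed  = natural + treatment effect + Normal(0, 1) noise
    """
    rng = random.Random(seed)
    raw = {True: [], False: []}       # observed outcomes per arm
    adjusted = {True: [], False: []}  # observed minus predicted natural outcome
    for _ in range(n_patients):
        natural = rng.gauss(0.0, 3.0)
        predicted = natural + rng.gauss(0.0, prediction_noise)
        is_treated = rng.random() < 0.5
        outcome = natural + (true_effect if is_treated else 0.0) + rng.gauss(0.0, 1.0)
        raw[is_treated].append(outcome)
        adjusted[is_treated].append(outcome - predicted)
    return {
        "se_raw": standard_error(raw[True], raw[False]),
        "se_adjusted": standard_error(adjusted[True], adjusted[False]),
    }


if __name__ == "__main__":
    print("accurate model:", compare_sensitivity(prediction_noise=0.5))
    print("poor model:    ", compare_sensitivity(prediction_noise=3.0))
```

Under these assumptions, subtracting the predicted natural outcome removes most of the between-patient variation, so the standard error of the estimated treatment effect shrinks. The gain erodes as the prediction gets worse, which mirrors the "inverse proportion to its accuracy" point.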

Crucially, the loss in sensitivity need not be remediable by more or larger trials. Imagine an intervention that makes half of all patients much worse and the other half much better. The effect, on average, will be zero no matter how large the trial, and we shall only identify the half that benefits if we are able to identify its discriminating features. But these features need not be simple, and so may not be identifiable within a simple statistical framework at all, leaving us forever in the dark. This may be the reason so many drugs proven to work in animals have failed in humans: we may have erroneously inferred a physiological difference where the problem was actually statistical. We may be sitting on perhaps a dozen drugs that actually work!
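The thought experiment above, an intervention that helps one half of patients and harms the other, can be sketched as a simulation. This is an editorial illustration under assumed numbers. The hidden flag `is_responder` stands in for whatever complex discriminating features separate the two halves, and the function name and effect size are hypothetical.

```python
import random
import statistics


def simulate_split_response(n_patients=10_000, effect_size=1.0, seed=0):
    """Simulate a trial in which treatment helps half of patients and harms the rest."""
    rng = random.Random(seed)
    # outcomes[(is_responder, is_treated)] -> list of outcome scores
    outcomes = {(r, t): [] for r in (True, False) for t in (True, False)}
    for _ in range(n_patients):
        is_responder = rng.random() < 0.5  # hidden subgroup membership
        is_treated = rng.random() < 0.5    # randomised allocation
        outcome = rng.gauss(0.0, 1.0)      # untreated outcome
        if is_treated:
            outcome += effect_size if is_responder else -effect_size
        outcomes[(is_responder, is_treated)].append(outcome)

    def effect(groups):
        """Difference in mean outcome, treated minus control, within the given subgroups."""
        treated = [y for r in groups for y in outcomes[(r, True)]]
        control = [y for r in groups for y in outcomes[(r, False)]]
        return statistics.mean(treated) - statistics.mean(control)

    return {
        "overall": effect((True, False)),  # what a conventional analysis sees
        "benefit_half": effect((True,)),   # visible only if the subgroup is identified
        "harm_half": effect((False,)),
    }


if __name__ == "__main__":
    print(simulate_split_response())
```

In this toy setting the overall estimate stays close to zero however large `n_patients` is made, while the two subgroup estimates sit near plus and minus `effect_size`. Recovering them depends entirely on being able to identify the subgroup, which is the role the interview assigns to richer modelling.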

AB: Do you see a future where the industry starts to implement this approach into their trials?

PN: There are two main barriers here: one is practical, the other intellectual. In practice, although we are accustomed to conducting large-scale trials, we do not have good frameworks for phenotyping individuals with sufficient richness. In stroke, for example, some crude parameter of the brain lesion may be recorded, but not the rich anatomical pattern on which good outcome prediction critically depends.

So although we might have enough datapoints – the N, so to speak – we often do not have enough predictive features – the P – per case. Intellectually, we feel the need to keep the N/P ratio high, believing the underlying mechanisms to be fundamentally simple, and so requiring only a small number of features to describe. But this assumption of the essential parsimony of biological systems is groundless. Nothing compels biological systems to be either simple or intelligible; indeed they are under evolutionary pressure to be both diverse and opaque to potential competitors.

Neither obstacle is impossible to overcome, however, and I suspect industry will drive progress here in its pursuit of real-world impact that is increasingly hard to achieve. The U.K. could lead the world in this by exploiting the highly centralized nature of its health care system to assemble large-scale, richly phenotyped datasets. It is striking that so widely criticized a feature of the National Health Service – its top-down, Soviet-style architecture – may well turn out to be the driver of the next revolution in health care.