Multiple Sclerosis (MS) is a complex, progressive autoimmune disease that disrupts lives early and relentlessly, often beginning in young adulthood with symptoms such as vision loss, numbness and impaired coordination. Affecting nearly 2.8 million people globally[i], MS typically follows a relapsing-remitting course (RRMS) that can evolve into permanent disability over time. With 90% of untreated RRMS cases progressing to secondary progressive MS within two decades[ii], clinical trials hinge on tracking relapse and disability, most commonly through the Expanded Disability Status Scale (EDSS), a tool developed in 1983 that remains central yet controversial in its application. The scale is currently licensed through Neurostatus, a company also involved in rater training and in deploying electronic versions in collaboration with different providers.
The Expanded Disability Status Scale quantifies the disability level of MS patients by assessing signs and symptoms across eight functional systems, from vision to ambulation. It is not only the most commonly used MS scale today but also the longest-serving, having remained largely unchanged since its introduction, which provides significant advantages for long-term monitoring.[iii]
Administered by an examining physician, the EDSS is expected to remain in use in MS trials for years to come. But what challenges does it present for researchers today, and how can they be overcome?
Inter-rater variability
EDSS assessments are both subjective and highly complex, making inter-rater variability a key issue. In 2021, a study by Cohen et al. set out to evaluate this effect amongst junior neurologists (JNs) and MS neurologists (MSNs), who examined a group of 103 patients on the same day. Perfect agreement between the JN and MSN scores was achieved for 67% of patients, while “disagreement that could lead to a significant difference in terms of level of disability” occurred for 17% of patients.[iv]
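To make these figures concrete, the minimal sketch below computes both agreement measures for a set of hypothetical paired scores. The patient scores are invented, and the 1.0-step threshold for a “meaningful” disagreement is an illustrative assumption, not the definition used by Cohen et al.

```python
def percent_agreement(scores_a, scores_b):
    """Share of patients for whom two raters assigned the identical EDSS step."""
    return sum(a == b for a, b in zip(scores_a, scores_b)) / len(scores_a)

def meaningful_disagreement(scores_a, scores_b, threshold=1.0):
    """Share of patients whose scores differ by at least `threshold` steps.
    The 1.0-step default is illustrative, not the study's definition."""
    return sum(abs(a - b) >= threshold for a, b in zip(scores_a, scores_b)) / len(scores_a)

# Hypothetical paired EDSS steps from a junior neurologist (JN) and an
# MS neurologist (MSN) examining the same five patients on the same day.
jn = [2.0, 3.5, 4.0, 6.0, 1.5]
msn = [2.0, 2.5, 4.0, 6.5, 1.5]

print(f"Perfect agreement: {percent_agreement(jn, msn):.0%}")              # 60%
print(f"Meaningful disagreement: {meaningful_disagreement(jn, msn):.0%}")  # 20%
```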
The study also evaluated intra-rater reliability, finding 38 discrepancies within the repeat ratings of individual JNs and 14 within those of MSNs. It concluded that the use of less subjective and “easier-to-rate” scales should be encouraged to improve the consistency and accuracy of the data collected. This could be particularly valuable for multinational trials, where researchers face challenges in ensuring raters interpret and apply complex scales uniformly across languages and cultures.
Current approaches require researchers to enrol larger samples to reduce the ‘noise’ in endpoint data. There is also a heavy emphasis on rigorous rater training, which can take months to complete, slowing site activation timelines. Training must be reinforced throughout a rater’s career to combat intra-rater drift, where a single rater’s scoring may change over time.
EDSS education typically involves video-based modules followed by a centralised online interactive test and certification process. When it comes to the trial itself, however, many sites still rely on in-person assessments and paper-based scoring, even though electronic EDSS has been available since 2011.
Why it is time to move to electronic EDSS
Electronic EDSS (eEDSS) provides an “algorithm-based consistency check”, detecting combinations of scores that stray from the official Neurostatus rules.[v] Real-time feedback is shared with the rater, enabling them to re-assess the score.
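As a rough illustration of how such a check might work, the sketch below flags two conflicting score combinations. The rules are simplified paraphrases of Kurtzke’s EDSS definitions; the actual Neurostatus rule set is far more extensive and is not reproduced here.

```python
# Illustrative consistency check in the spirit of the eEDSS algorithm.
# The two rules below are simplified paraphrases of Kurtzke's EDSS
# definitions and ignore the exceptions the real rule set handles.

def check_consistency(edss_step, fs_scores):
    """Return human-readable flags for conflicting score combinations."""
    flags = []
    worst = max(fs_scores.values())
    if edss_step == 0.0 and worst > 0:
        flags.append("EDSS 0 implies a normal exam, but at least one "
                     "functional system is graded above 0.")
    if edss_step == 1.0 and worst > 1:
        flags.append("EDSS 1.0 implies minimal signs (grade 1), but a "
                     "functional system is graded higher.")
    return flags

# A combination the checker would flag, prompting the rater to re-assess.
for flag in check_consistency(1.0, {"visual": 2, "pyramidal": 1, "sensory": 0}):
    print("Inconsistency:", flag)
```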
A study by D’Souza et al. evaluated this tool across two multicentre Phase III trials. Of the more than 10,000 eEDSS assessments performed across the studies, 40.1% contained inconsistencies.[vi] Following the automated feedback, this fell to 22.1%. In total, the checking process resulted in 14.8% of the overall EDSS scores being changed, supporting the importance of programmed edit checks in increasing the reliability of EDSS scores.
In another analysis, involving more than 41,800 eEDSS assessments across 13 trials, 14% of submissions required expert review, 11% required more than one review, and the rater changed the EDSS score/step in 31% of these cases.[vii] Raters were more likely to alter the step when the original EDSS value was 3.5 or lower, highlighting the subjectivity of the functional system scores upon which low EDSS steps are based. While higher disability levels are primarily determined by ambulation, which can be examined more objectively, changes at the lower end of the scale are subtle and harder to detect consistently, making these assessments particularly prone to error.
Both studies underscore the critical importance of transitioning from paper-based scoring to electronic EDSS to catch inaccuracies in MS trial data. But is there potential for future technologies to take this a step further?
Does AI have a future in EDSS?
Fortrea, a leading global clinical research organisation (CRO), recently published a whitepaper exploring AI’s growing potential to solve challenges in neurology research. In one scenario, the whitepaper discussed the possibility of AI facilitating sophisticated, simulated training modules for EDSS assessments.
AI could be deployed to identify out-of-range scores, or scores that are inconsistent with the rater’s prior scoring of other areas, similar to current practice but faster, more accurate and with real-time feedback that explains each error, helping to support more reliable assessments. Looking further ahead, EDSS platforms could even use AI to analyse video or audio recordings of patients and check assessments, flagging any scoring discrepancies. While this could be useful for assessing ambulation, the primary measure of disability in the EDSS, AI’s ability to evaluate other functional systems, such as the sensory or pyramidal systems, is less clear, and further research and investigation is required.
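A hypothetical sketch of the simpler, rule-based end of this idea is shown below: flagging out-of-range values and abrupt changes between visits, each with an explanatory message. The 2.0-step jump threshold and the visit history are invented for illustration.

```python
# Valid EDSS steps run from 0 to 10 in 0.5 increments; the 2.0-step
# jump threshold between visits is an assumption for illustration.
VALID_STEPS = {i * 0.5 for i in range(21)}  # 0.0, 0.5, ..., 10.0

def flag_assessment(score, prior_scores):
    """Return explanatory flags for out-of-range or abruptly changed scores."""
    flags = []
    if score not in VALID_STEPS:
        flags.append(f"{score} is not a valid EDSS step (0-10 in 0.5 increments).")
    elif prior_scores and abs(score - prior_scores[-1]) >= 2.0:
        flags.append(f"Score jumped {abs(score - prior_scores[-1]):.1f} steps "
                     "since the last visit; please confirm before submitting.")
    return flags

# Example: a valid step that nevertheless jumps sharply from the prior visit.
for flag in flag_assessment(5.5, [3.0, 3.0, 3.5]):
    print("Flag:", flag)
```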
To learn more about AI’s emerging potential in neuroscience research, please download the whitepaper below.
[i] Walton C, King R, Rechtman L, Kaye W, Leray E, Marrie RA, Robertson N, La Rocca N, Uitdehaag B, van der Mei I, Wallin M, Helme A, Angood Napier C, Rijke N, Baneke P. Rising prevalence of multiple sclerosis worldwide: Insights from the Atlas of MS, third edition. Mult Scler. 2020 Dec;26(14):1816-1821. doi: 10.1177/1352458520970841. Epub 2020 Nov 11. PMID: 33174475; PMCID: PMC7720355.
[ii] https://www.sciencedirect.com/topics/neuroscience/multiple-sclerosis-clinical-trial
[iii] Çinar BP, Yorgun YG. What We Learned from The History of Multiple Sclerosis Measurement: Expanded Disability Status Scale. Noro Psikiyatr Ars. 2018;55(Suppl 1):S69-S75. doi: 10.29399/npa.23343. PMID: 30692861; PMCID: PMC6278618.
[iv] Cohen M, Bresch S, Thommel Rocchi O, Morain E, Benoit J, Levraut M, Fakir S, Landes C, Lebrun-Frénay C. Should we still only rely on EDSS to evaluate disability in multiple sclerosis patients? A study of inter and intra rater reliability. Mult Scler Relat Disord. 2021 Sep;54:103144. doi: 10.1016/j.msard.2021.103144. Epub 2021 Jul 9. PMID: 34274736.
[v] https://neurostatus-uhb.com/what-is-neurostatus-eedss/
[vi] D’Souza M, Heikkilä A, Lorscheider J, et al. Electronic Neurostatus-EDSS increases the quality of expanded disability status scale assessments: Experience from two phase 3 clinical trials. Mult Scler. 2019;26(8):993-996. doi: 10.1177/1352458519845108.
[vii] Cerdá Fuertes N, Khurana L, Tressel Gary S, Fricker E, McDowell B, Kappos L, D’Souza M. Neurostatus-eEDSS results in high consistency of Expanded Disability Status Scale assessments: Experience from 13 clinical trials. https://clario.com/wp-content/uploads/2023/09/Poster_ECTRIMS_eCOA_EDSS.pdf
