In my previous article, Randomized Clinical Trials, Big and Real World Data – 21st Century Studies with 21st Century Tools, I discussed the increasing role of real world data in supporting clinical development to optimize clinical trials, and specifically mentioned some of the programs that could support this, such as EMIF and EHR4CR.

The Innovative Medicines Initiative (IMI) is Europe’s largest public-private initiative, aiming to speed up the development of better and safer medicines for patients. It is a joint undertaking between the EU and the European Federation of Pharmaceutical Industries and Associations (EFPIA), and it supports collaborative research projects and builds networks of industrial and academic experts to boost pharmaceutical innovation in Europe.

IMI1 had a budget of EUR2 billion, half from the EU’s FP7 program, and half from EFPIA with in kind contributions. IMI2, with an increased budget of EUR3.3 billion, extends up until 2024.

There are two programs that have potential for a significant impact on clinical development and pharmaceutical research:

  • European Medical Information Framework (EMIF)
  • Electronic Health Records for Clinical Research (EHR4CR)

EMIF – An Ambitious Federated Data Access Network

Currently in year four of its IMI funding (2013-2017), this program has 58 partners (academic, EFPIA and Small/Medium Enterprises (SMEs)) across 14 countries and a EUR56 million budget. Janssen is the lead EFPIA partner, driving the Industry end of the partnership and strategy to realise this platform. EMIF is a complex project covering three domains:

  • EMIF-Platform: Tasked with developing the technology, governance and ethical framework to support the identification, access, evaluation of suitability and reuse of health data from a federated network of EHR and cohort data (protecting the local provenance of data sources)
  • EMIF-AD: A disease-specific research domain focusing on developing early predictive biomarkers for Alzheimer’s Disease, incorporating real world data and samples via EHR and cohort data
  • EMIF-Metabolic: A disease-specific research domain concentrating on diabetes mellitus and obesity and developing early predictive biomarkers, incorporating real world data and samples via EHR and cohort data

Both disease domains are incorporating the EMIF-Platform principle and technologies to support access to data and samples from disparate sources within the EU, while not centralizing them. The technology platform incorporates a data catalogue of available data custodians (sources) for researchers to evaluate their suitability for potential research.

As the platform portal is developed online it will incorporate a suite of multiple tools to facilitate faster identification and access, as well as providing a centralized approach to such aspects as contracting, protocol development and interaction with data custodians. One of the key tenets for EMIF is that of federated access and querying, while protecting the local provenance of the health custodian’s data. Data will be harmonised via common data models.
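The federated principle described above can be illustrated with a minimal Python sketch. It is not EMIF's actual implementation: the record fields, the Custodian class and the federated_count function are hypothetical names chosen for illustration, and the two-field record is only a stand-in for a real common data model. The point it shows is that the query travels to each custodian and only an aggregate count comes back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical harmonised record: each custodian is assumed to map its
# local schema onto these shared fields (a stand-in for a common data model).
@dataclass(frozen=True)
class PatientRecord:
    age: int
    diagnosis: str


@dataclass
class Custodian:
    """A data source that keeps its records local and returns only counts."""
    name: str
    records: List[PatientRecord]

    def count_matching(self, criteria: Callable[[PatientRecord], bool]) -> int:
        # The query runs where the data lives; only the aggregate leaves.
        return sum(1 for record in self.records if criteria(record))


def federated_count(
    custodians: List[Custodian],
    criteria: Callable[[PatientRecord], bool],
) -> Dict[str, int]:
    """Send the same criteria to every custodian and collect per-source counts."""
    return {c.name: c.count_matching(criteria) for c in custodians}


# Illustrative, made-up data for two hypothetical custodians.
custodian_a = Custodian("custodian_a", [
    PatientRecord(72, "alzheimers"),
    PatientRecord(58, "diabetes"),
    PatientRecord(66, "alzheimers"),
])
custodian_b = Custodian("custodian_b", [
    PatientRecord(80, "alzheimers"),
    PatientRecord(45, "obesity"),
])


def older_alzheimers(record: PatientRecord) -> bool:
    # Example feasibility criteria: Alzheimer's patients aged 65 or over.
    return record.age >= 65 and record.diagnosis == "alzheimers"


counts = federated_count([custodian_a, custodian_b], older_alzheimers)
total = sum(counts.values())
```

In this sketch the researcher sees how many patients at each source meet the criteria, but never the underlying records, which stay with the custodian.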

The industry consortium, led by Janssen, includes Amgen, Boehringer Ingelheim, GSK, Merck Serono, Novo Nordisk, Pfizer, Roche, Servier and UCB. All companies work to drive the development of a unique platform for an RWE-based research and development tool, and provide resources in kind to the IMI funding.

EMIF data custodians are diverse, including some EHR, hospital-based data, but with most in primary care, administrative data, regional record-linkage systems, registries and cohorts, and some with links to bio-banks. The cumulative number of subjects is in excess of 40 million, with approximately 20 million active patients.

Ultimately, data accessed and reused could support development as well as commercial interests, including (but not limited to):

  • Translational research, requiring well characterized patients with potential bio-banked samples, or an opportunity for recall
  • In silico modeling of study protocols, inclusion and exclusion criteria evaluation and subject identification
  • Phase IV studies inclusive of Post Authorization Safety Studies (PASS) and/or Post Authorization Efficacy Studies (PAES), and pharmacovigilance
  • Disease specific, cohort-driven research collaborations to support discovery, research and development

Currently, EMIF is working on its initial quantified business plan and sustainability model, and on a strategic data expansion plan with key stakeholders within and outside of the consortium, potentially collaborating with other programs and networks. In the very near future, it plans to give bona fide researchers and research organisations free, wider access to EMIF’s data catalogue, enabling scrutiny of the data sources incorporated.

More information is available via

EHR4CR – Ground-breaking Data Access Program

This project was funded from 2011-2014, with an unfunded extension year in 2015. Its original budget was EUR16 million, and it involves 34 partners (academic, EFPIA and SMEs). Its main focus is on providing tools and services for reusing electronic health record (EHR) data for clinical research, advancing medical research, the improvement of healthcare and the enhancement of patient safety.

Essentially EHR4CR is building a central platform with a federated link (protecting local provenance of data sources) to hospital sites within the EU, based on an initial 11 hospital sites in Germany, France, Poland, Switzerland, and the U.K.

IMI funding ended in 2016, and the project will be implementing its sustainability plan and business model to ensure the platform can expand to more hospitals and hospital data, effectively “putting in the digital pipes” to generate an EU network for data reuse. Clearly, this needs to be supported not only by a technology platform, but also by legal and ethical safeguards with a governance framework to support access to de-identified health data. The program was recently re-christened as the Champions Program, via a European Hospital Network.

AstraZeneca is the lead EFPIA partner, and the consortium includes Janssen, as well as Amgen, Bayer, GSK, Lilly, Merck, Novartis, Roche, and Sanofi Aventis. In collaboration, all companies drive the development of a unique platform for an EHR-derived, RWE-based research and development tool, and provide resources in kind to the IMI funding. Once EHR4CR is post-IMI, founding members will be well placed to be the first potential consumers of its outputs.

The Hospital Network is now being expanded Europe-wide and a feasibility evaluation into 2017 will provide guidance as to the success of scalability and application within clinical development optimization.

More information is available via

Large-scale Data Programs – So What?

Significant challenges remain in accessing health data generated in the real world. These include privacy, security and confidentiality concerns, which have been heightened post-Edward Snowden and by numerous data breaches; fragmentation of data source availability; the ability to collaborate; increasing legal framework complexities; and concerns about the use of such data related to the perception of the industry, to name but a few.

A recent report from the Salford Lung Study, supported by GlaxoSmithKline, has been groundbreaking. It presented initial results of a phase III study in a real world, primary care setting for COPD and asthma, comparing an investigational combination inhaler with standard, current clinical care. It has pointed to potential new approaches to clinical development, incorporating RCT and real world, naturalistic clinical settings, and more representative study populations. Future developments in this space, including the European Medicines Agency’s Adaptive Licensing program, or Medicines Adaptive Pathways to Patients (MAPPs), also point to changes in how we conduct clinical development, and reach the market with new therapeutic agents.

We need programmes such as EMIF and EHR4CR, outlined above, to succeed in demonstrating use cases of federated network approaches to ‘digital plumbing’ and aggregation into larger-scale population datasets that will be both wide and deep over time. Otherwise, we will increasingly be unable to answer the questions being generated by patients, governments, and academic institutions about the outcomes of the industry’s products. Optimizing clinical studies will require the ability to support site selection and patient enrolment on an almost real-time basis.

Do continue to monitor such developments and the potential impact on the clinical trial arena.


UPDATE (Jan. 19, 2017): The EMIF Data Catalogue is now accessible beyond the EMIF project:


*Nigel Hughes is the Scientific Director, Real World Evidence, Medical Affairs, Established Products, Statistics (RMEDS) at Janssen R&D