An introduction from the coordinator

Hanns Lochmüller, coordinator of RD-Connect and chair of the IRDiRC Interdisciplinary Science Committee, explains the principles behind this global project.

“Although individually uncommon, rare diseases are so numerous that they collectively affect as many as one person in every 17, or 30 million people in Europe. They span all areas of medicine and their impact on public health, society and national economies is substantial.

Their rarity and diversity pose specific challenges for healthcare provision and research, and for the development and marketing of treatments. Many patients with rare diseases lack a timely and accurate diagnosis, and even fewer receive tailored treatments that influence survival and quality of life.

80% of rare diseases have a genetic component, and the genomics revolution has brought the hope of gene-based treatments for many rare diseases a step closer. The first sequencing of a human genome, completed in 2003, required the work of hundreds of scientists for more than 10 years at a cost of over €3 billion. The same task is now feasible on a single sequencing instrument within days at a cost of less than €10,000, and this cost is continuing to decrease.

The newly emerging ‑omics technologies are generating data on a huge scale unprecedented in biomedical research. Despite the advances in computing technology, the processing and analysis of data, or even its transfer from one location to another, is not trivial and remains far from routine.

To date, several thousand complete human genomes have been sequenced. This has increased the volume of data by several orders of magnitude in recent years, and this rapid growth is expected to continue. The limiting factor is now our ability to analyse these vast quantities of data, rather than the capacity to produce them. Bioinformatics processing is the most expensive part of next-generation sequencing, with costs ranging from €10,000 to €100,000. As a consequence, new and innovative bioinformatics solutions are required.

What is also becoming increasingly evident, however, is that sequencing is only the first part of the story. It doesn’t replace clinical expertise – in fact, being able to combine genetic data with clinical data is more important than ever.

Additional complexity arises from the fact that the genome sequence of each individual has a few hundred thousand “private” variants that are not found in the general population. The majority of these changes are not directly disease-causing and are often classed as polymorphisms, but they may still be relevant for gene regulation and may modify phenotypes. Our current understanding of the underlying biology is often too limited for making appropriate predictions for an individual.

In order to advance knowledge, therefore, the combination and integration of genomics, transcriptomics, proteomics, metabolomics and detailed phenotype (phenomics) data across centres and across diseases is key. While competition between different research groups is a driving force to advance science, harmonisation and sharing of data are ultimately required to compare, combine and make best use of the results. This is especially true in rare diseases, where individuals with the conditions may be scattered across the world.

Transnational and trans-disease efforts are thus essential to make optimal use of resources. Patient registries, biobanks and bioinformatics analysis tools are the key infrastructure tools required for ‑omics research. More than 100 RD biobanks and 500 patient registries already exist in Europe alone, and progress towards infrastructure harmonisation has been made in several areas thanks to collaborative initiatives in specific disease groups (e.g. Huntington’s disease, cystic fibrosis and neuromuscular disease).

A continued bottleneck for cutting-edge research towards diagnosis and therapy development, however, is that at present these individual efforts continue to multiply while remaining largely “siloed”, with very little interoperability and almost no systematic connection of detailed clinical information (deep phenotyping) with genetic information, biomaterial availability or research/trial datasets.

To deliver concrete benefits to patients in terms of diagnosis and therapy development, the ability to link ‑omics data with clinical data and biomaterials of individual patients or well-defined patient cohorts is crucial. Outside the rare disease field, a number of major research infrastructures (IHEC, the International Human Epigenome Consortium; ICGC, the International Cancer Genome Consortium; and BBMRI, the Biobanking and Biomolecular Resources Research Infrastructure) have shown that robust tools for large-scale data and sample sharing across multiple research projects can succeed.

What RD-Connect must achieve, therefore, is both the uniting of the multiple existing infrastructures and the integration of the latest tools in order to create a robust and comprehensive combined biobanking, data analysis and patient registry platform for rare diseases that is used by researchers across the world.”