June 2017  ●   Issue 31


Data linkage plan


This issue of our newsletter is dedicated to the RD-Connect work on data linkage, led by the partners at the Leiden University Medical Center (LUMC) and University Medical Center Groningen (UMCG) in the Netherlands.


The Rare Disease Data Linkage Plan has taken off!

This year, the Rare Disease Data Linkage Plan was put into motion: selected muscular dystrophy patient registries and molecular data related to Rett syndrome are the first to be made available as FAIR (Findable, Accessible, Interoperable and Reusable) resources for humans and computers. Other rare diseases will soon follow, based on criteria that include stakeholder involvement. The plan is a blueprint for a FAIR support service for the rare disease domain.

The plan will FAIRify selected data resources, which will allow answering health-related or research questions requiring data from more than one data source, possibly across different countries. It also allows maximum flexibility in the type of questions that can be posed in the future. The plan involves disseminating the FAIRification know-how, applying and reusing computer-readable data models that use standard ontologies, and working with registry and FAIR software providers to make the FAIRification process easier for registry managers.

It is crucial for rare disease researchers to be able to combine data from sources across nations, regions and institutes, ranging from biobanks and patient registries to clinical and research databases. Maintaining a centralized warehouse at this scale, and with this kind of sensitive data, is neither feasible nor ethically or legally acceptable. Two great challenges are to provide solutions that can scale up to be adopted by thousands of resources ‘at the source’, and solutions that facilitate cross-resource data analytics at the level of the data itself. The latter mitigates the most costly bottleneck for this community: researchers spend too much time and effort on harmonizing ambiguous data, and previous harmonization efforts cannot be reused.

To address these challenges, a ‘rare disease data linkage plan’ was accepted by the RD-Connect board. Since the start of this year, progress has been boosted by an extra technical lead (David van Enckevort, UMCG), a dedicated FAIR data steward (Annika Jacobsen, LUMC), and additional Linked Data expertise (Andra Waagmeester, Micelio-Belgium, a regular trainer at Bring Your Own Data workshops). It is further kindly supported by ELIXIR-EXCELERATE and BBMRI partners, the DTL FAIR engineering team, the international FAIR skunk team, RD-Connect developers, registry software providers (partially), and stakeholders in selected disease areas. Patient organisations, such as Stichting Hevas in the Netherlands, and Ring-14 Onlus and PKS Onlus in Italy, plan to invest in the collaboration. ELIXIR supports a second ‘implementation study’ to address molecular data, such as pathways and variant data. Although we have increased our efforts in knowledge transfer (i.e. teaching rare disease community members how to make their registries FAIR), the RD-Connect data linkage team will also actively engage in making at least seven biobanks/registries FAIR.

Watch the video by the Dutch Techcentre for Life Sciences (DTL) to learn more about FAIR data and data stewardship.

FAIR Pilot for Duchenne muscular dystrophy registries

The initial step of the data linkage plan is to FAIRify a set of Duchenne muscular dystrophy (DMD) and other neuromuscular disease (NMD) patient registries. DMD is a life-limiting, X-linked recessive rare disease in which affected boys lose their ability to walk due to severe muscle wasting. There is no cure for DMD, but corticosteroid drugs, ventilatory support and cardiac medications can increase life expectancy.

The neuromuscular network TREAT-NMD helped to establish several national DMD patient registries, each containing a mandatory data set that covers important details about genotype, phenotype and treatment. These registries, connected through a federated model, provide the detailed information needed for planning clinical trials and assessing their feasibility. By making a set of these registries FAIR, we want to show that researchers can answer research questions in a fast, automated way, such as: “Can registry data help measure the effect of steroids on the age of loss of walking ability in DMD patients?”, “Does the precise disease-causing mutation influence the age when DMD patients lose walking ability?” and “How do requirements for assisted ventilation differ between different neuromuscular diseases at different ages?”. This requires selecting ontologies (hierarchically organised, standardised terminology lists) appropriate for these registries, including not only the Human Phenotype Ontology (HPO) and the Orphanet Rare Disease Ontology (ORDO), but also other clinical outcome and drug-related ontologies. Next, ontological models will be built to answer these specific cross-resource questions. Finally, the data linkage team will develop solutions for the software used by the existing registries to make their data FAIR and accessible for queries. The software and ontological models developed for the NMD use case will be shared with the community to be reused for similar cases. Once the DMD registries are FAIR, disease experts will be able to answer research questions with much less effort spent on aggregating and harmonising data.
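To illustrate the idea behind the first research question, here is a minimal Python sketch of how two registries with different local field names could be pooled once their fields are mapped to shared concepts. Everything in it is an assumption made for illustration: the field names, the `concept:` identifiers, and the records are invented toy values with no clinical meaning, and plain dictionaries stand in for the ontology-based models and FAIR tooling that the plan actually envisages.

```python
from statistics import median

# Hypothetical mappings from each registry's local field names to shared concepts.
REGISTRY_A_MAP = {
    "steroids": "concept:steroid_use",
    "age_lost_walking": "concept:age_loss_of_ambulation",
}
REGISTRY_B_MAP = {
    "corticosteroid_tx": "concept:steroid_use",
    "loa_age_years": "concept:age_loss_of_ambulation",
}

# Invented toy records; not real patient data.
REGISTRY_A = [
    {"steroids": True, "age_lost_walking": 12.0},
    {"steroids": False, "age_lost_walking": 9.5},
]
REGISTRY_B = [
    {"corticosteroid_tx": True, "loa_age_years": 13.0},
    {"corticosteroid_tx": False, "loa_age_years": 10.0},
]


def harmonise(records, mapping):
    """Rename local fields to shared concept identifiers, dropping unmapped fields."""
    return [
        {mapping[key]: value for key, value in record.items() if key in mapping}
        for record in records
    ]


def median_age_by_steroid_use(records):
    """Group harmonised records by steroid use and return the median age per group."""
    groups = {True: [], False: []}
    for record in records:
        groups[record["concept:steroid_use"]].append(
            record["concept:age_loss_of_ambulation"]
        )
    return {use: median(ages) for use, ages in groups.items() if ages}


pooled = harmonise(REGISTRY_A, REGISTRY_A_MAP) + harmonise(REGISTRY_B, REGISTRY_B_MAP)
result = median_age_by_steroid_use(pooled)
print(result)
```

The point of the sketch is that the analysis function never needs to know the local field names: once each registry publishes its mapping, the same question can be asked of any number of sources.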

Early conclusions

Dissemination of the know-how is key for scaling up data FAIRification, and even though it is time-consuming in the beginning, there is a lot of interest in FAIR training activities in the rare disease community. In addition, the use cases showed that ontology-based data models (‘semantic archetypes’) need to be shared and easily reused within the community. The ELIXIR-supported biosharing.org is a prime candidate for hosting them, while registry software provider Osse has indicated it could open up its case report data element repository to enable the sharing and reuse of FAIR form components. Meanwhile, as the FAIR software suite matures, it already makes it easy to establish ‘FAIR data points’ for a resource to address Findability and Accessibility. Findability is also addressed in partnership with ID-Cards and Orphanet.

Future prospects include the improvement of standard FAIRification procedures, tools and data analytics based on FAIR data. Furthermore, to continue the data linkage plan, a dedicated organisation needs to provide a service for the rare disease community, with support from at least patient organisations and the newly formed European Reference Networks. The so-called GO-FAIR implementation network seems a suitable framework. It is a mechanism to implement the ‘European Open Science Cloud’ by bringing together FAIR experts, data analytics experts, and domain experts with sufficient authority in a domain to set de facto standards. Training events, such as the annual summer school for rare disease registries in Rome, led by Istituto Superiore di Sanità (ISS), are part of the service. With the aid of DTL/ELIXIR-NL, work towards organising a GO-FAIR implementation network has started.

Finally, the success of the plan depends on in-kind and in-cash support from stakeholders. We therefore ask you to express your interest by e-mailing fair-rd-info@elixir-europe.org or the RD-Connect project manager Libby Wood (libby.wood@newcastle.ac.uk). We can provide details on what is required to prioritize a disease area or a collaboration with a tool provider. General enquiries can also be sent to these addresses.

Annika Jacobsen, Leiden University Medical Center,
David van Enckevort, University Medical Center Groningen,
Marco Roos, Leiden University Medical Center


Project news


Joanna Vella receives the JRC Malta Young Scientist Award

We would like to congratulate Joanna Vella, who won the JRC Malta Young Scientist Award in the rare diseases field! Joanna is an RD-Connect partner at the University of Malta and works at the Malta BioBank / BBMRI.mt, Centre of Molecular Medicine and Biobanking and the National Node in BBMRI-ERIC. As an awardee, she was invited to present her ongoing PhD research, Genomics in Rare Diseases, including the work at RD-Connect, to the JRC DG Vladimir Sucha and the JRC Board of Governors on 16 June at the Joint Research Centre in Ispra.


Vacancies at the ‘big data’ hub at the University Medical Center Groningen

The group of Morris Swertz at the Genomics Coordination Center is searching for bioinformaticians to join the ambitious ‘big data’ in biomedicine research & service hub of the University Medical Center Groningen (UMCG) in the Netherlands. The openings include:

• PhD student or postdoc in Data Science in Health Bioinformatics

• Data Manager

• ICT system administrator

All positions will contribute to FAIR data and Personalized Health using exciting cohort, biobank and genome data with, among others, UMCG, BBMRI, CORBEL, LifeCycle, RD-Connect and ELIXIR.


RD-Connect platform update


The platform now contains over 2200 cases and has a new feature update.

RD-Connect has released version 1.0 of the genomics platform.

New functionalities in the latest release:

• Users can now add extra HPO terms for Exomiser in addition to those fetched automatically from the PhenoTips entry.

• The gene name field now provides search results in real time as you type.

• Users can now select multiple OMIM terms and obtain the genes associated with all of them.

• Users can now select multiple HPO terms and obtain the genes associated with all of them.
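
For readers curious about what multi-term selection amounts to, here is a minimal Python sketch. It reads "associated with all of them" as the genes shared by every selected term, which is an assumption (the platform may instead return the union). The term and gene identifiers are hypothetical placeholders, not real HPO or OMIM annotations, and the function is not the platform's implementation.

```python
# Hypothetical term-to-gene annotations, for illustration only.
TERM_TO_GENES = {
    "TERM_1": {"GENE_A", "GENE_B", "GENE_C"},
    "TERM_2": {"GENE_B", "GENE_C", "GENE_D"},
    "TERM_3": {"GENE_C", "GENE_E"},
}


def genes_for_all_terms(terms, annotations):
    """Return the genes annotated with every one of the given terms."""
    # Unknown terms contribute an empty set, so the intersection becomes empty.
    gene_sets = [annotations.get(term, set()) for term in terms]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


shared_two = genes_for_all_terms(["TERM_1", "TERM_2"], TERM_TO_GENES)
shared_three = genes_for_all_terms(["TERM_1", "TERM_2", "TERM_3"], TERM_TO_GENES)
print(sorted(shared_two))
print(sorted(shared_three))
```

Adding more terms can only narrow the result, which is why selecting several phenotype terms is a quick way to shorten a candidate gene list.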

For the latest updates, follow the RD-Connect platform release notes.

In addition, we are testing a new functionality for easily uploading your data and for managing and sharing your own genomics samples with other groups. These functionalities are part of a broader plan to reduce the time needed to make data available in the platform.


Featured publications


Respiratory involvement in ambulant and non-ambulant patients with facioscapulohumeral muscular dystrophy

Moreira S, Wood L, Smith D, et al., (2017)

Journal of Neurology

Facioscapulohumeral dystrophy (FSHD) is the third most common muscular dystrophy, usually caused by a genetic abnormality (shortening) of the D4Z4 repeat sequence on chromosome 4. The shorter the D4Z4 repeat, the more severe the symptoms and the earlier their onset. This study, funded by RD-Connect, aimed to understand the occurrence and predictors of respiratory impairment, which is a rare symptom among FSHD patients, and to identify predisposing factors for respiratory failure. The researchers, including the RD-Connect partners at Newcastle University, compared records of patients with mild symptoms and those with moderate to severe symptoms. The analysis showed that severe respiratory impairment is more likely in early-onset patients with reduced exhaling capacity in spirometry, a shorter D4Z4 repeat and greater overall disease severity. Patients with severe respiratory involvement were also more likely to have sleep-disordered breathing. The decline of exhaling capacity over time made it possible to predict how the disease would progress. Respiratory involvement in both ambulant and non-ambulant FSHD patients turned out to be more frequent and severe than previously thought. The study concludes that asymptomatic patients are at risk of acute respiratory failure and sleep-disordered breathing, a condition that can impair respiratory function. Thus, FSHD patients should undergo annual screening of their respiratory status, which is particularly important in patients with severe disease symptoms, early onset and a short D4Z4 repeat.


Automated nanopublications generation from biomedical literature

Sernadela P & Oliveira JL, (2017)

Bioengineering (ENBENG)

The increasing amount of unstructured information generated in biomedical research is a current challenge for the scientific community. To help scientists distribute and access knowledge, software engineers are developing methods for information management based on machine-readable knowledge assertions. One example of such emerging strategies is nanopublications, which aim to overcome the inconsistency, ambiguity and redundancy of traditional publications. By representing relationships between research data better than traditional papers, they allow for more efficient knowledge exchange. However, the lack of extraction and publication methods makes them difficult to put into use. To address this issue, the RD-Connect partners from Aveiro University in Portugal propose an automated workflow for generating nanopublications from biomedical literature. The proposed method uses a tool for automated information extraction, which allows quick detection of relevant information in published documents. The detected information is then standardised through semantic web recommendations and is ready for further use. This study, funded by RD-Connect, can make it easier for biomedical researchers to find the information they need and, in consequence, advance research.


Design of the Familial Hypercholesterolaemia Australasia Network Registry: Creating Opportunities for Greater International Collaboration

Bellgard MI, et al., (2017)

Journal of Atherosclerosis and Thrombosis

Familial hypercholesterolaemia (FH) is the most common monogenic disorder impairing the metabolism of lipoproteins, such as cholesterol, which leads to serious consequences including early-onset coronary heart disease. High numbers of people are expected to be affected, but many remain undiagnosed. Patients with FH are often under-treated, but with early detection, cascade family testing and adequate treatment, their outcomes can improve. Patient registries are key tools for providing new information on FH and enhancing care worldwide. This study by the RD-Connect partners at Murdoch University in Australia presents the development and design of the FH Australasia Network Registry, which provides a standardized, high-quality and cost-effective system of care aimed at improving patient outcomes. The Registry was collaboratively developed by the Australian government, patient and clinical networks and research groups. To create the Registry, they used the Rare Disease Registry Framework (RDRF), an open-source, web-based tool created within RD-Connect by the Murdoch team. RDRF was selected because of its open-source standards, modular design, interoperability, scalability and security features, which meet the ever-changing clinical demands across regions. This work, supported by RD-Connect, presents a model for other countries, mapping out the critical features of an FH registry to meet their particular health system needs.


Summer holiday break


As the holiday season is upon us, the next issue of the RD-Connect newsletter will be released at the end of August.

We would like to thank our partners for their hard work throughout the year, and we wish all of you a joyful and relaxing summer break!


Follow RD-Connect on social media!

The newly launched RD-Connect Facebook page creates an opportunity for RD-Connect to better engage with patient communities and other rare disease stakeholders active on Facebook.

By following RD-Connect on Twitter (@ConnectRD), you can stay up to date with breaking news regarding RD-Connect and the rare disease research community, including conferences, workshops, events, calls and more.

