---

February 2017  ●   Issue 27

---

Today is Rare Disease Day!

Every year, the last day of February is Rare Disease Day. Its aim is to raise awareness about rare diseases and their impact on patients' lives. The campaign targets the public, policy makers, public authorities, industry representatives, researchers, health professionals and anyone who has an interest in rare diseases. This year's tagline - with research, possibilities are limitless - emphasises the importance of continued research around the world towards better diagnosis and therapies. RD-Connect colleagues are proud to participate in this research and here you can see us raising hands to show solidarity with rare disease patients around the world!

---

Bioinformatic tools developed by RD-Connect partners

---

One of the key areas of work for RD-Connect has been to develop and utilise the latest tools and technologies to improve data sharing and therefore research into rare diseases. In this issue, we feature some of the bioinformatics tools developed by partners involved in RD-Connect. You can read more in the tools section of the RD-Connect website. Many of these tools are also integrated into the online platform for analysis of genomic data.

Pathogenicity prediction systems

Aix-Marseille University, France 

The identification of human mutations is critical in translational medicine, as more than 8,000 rare human genetic diseases have been characterized. They constitute a primary target for drug development and, more specifically, genotype-based medicine. Recent high-throughput sequencing technologies have drastically changed human genetics: the bottleneck is no longer data production but the analysis and interpretation of the data. On average, an exome sequencing experiment produces around 70,000 Single Nucleotide Variations (SNVs), and Whole Genome Sequencing (WGS) produces more than 3 million, of which, in the case of a human genetic disorder, only one or two mutations are responsible for the symptoms. Most human pathogenic variations are missense mutations (>50%), while UTRs (untranslated regions) account for less than 1% of variants. In addition, pathogenic intronic mutations are known to disrupt splicing signals such as branch points and donor or acceptor sites. The challenge now therefore lies mostly in identifying pathogenic missense and synonymous mutations, mutations at intron/exon boundaries, and any other mutation affecting splicing.

The Aix-Marseille University team has developed two pathogenicity prediction systems to facilitate the identification of such mutations: the UMD-Predictor, for any substitution in human cDNA, and the Human Splicing Finder (HSF), for any mutation that could affect splicing signals, i.e. acceptor/donor splice sites, branch points, or auxiliary splicing signals such as Exonic Splicing Enhancers (ESE) and Exonic Splicing Silencers (ESS). The HSF now contains an expert system that helps the user interpret the data. It also contains new matrices to evaluate the impact of mutations on rare donor splice site motifs.

VarAFT: Variant Annotation and Filtration Tool

Identifying disease-causing mutations from high-throughput sequencing is a critical but complex process because of the large amount of data these technologies generate. To facilitate variant prioritization, the Aix-Marseille University partner has developed a new all-in-one system called VarAFT (Variant Annotation and Filtration Tool), which is freely available to not-for-profit organizations. As described in a recent paper from Salgado et al., it is one of the most complete systems and includes unique features to improve variant prioritization. In addition, it provides a full graphical interface, allowing researchers and geneticists to easily annotate, filter and perform coverage analysis on their NGS (next generation sequencing) data without any programming knowledge and with limited hardware requirements. The tool has received very positive feedback from users.

---

Linking molecular data and clinical phenotypes

Tools developed at the Leiden University Medical Center

The BioSemantics group in Leiden, the Netherlands, investigates and applies methods from the field of information science to help understand the mechanisms underlying human genetic diseases and shorten the path to new treatments for (rare) diseases. Their multidisciplinary ‘enhanced science’ (e-Science) approach makes optimal use of the expertise of biologists, bioinformaticians, and computer scientists.

The group is developing tools that can identify associations between molecular and clinical phenotypes, which might be crucial for interpreting the role of newly reported gene candidates in causing disease or contributing to pathophysiology. The team has created data resources and workflows that help to move from gene to disease using literature mining technology in a Literature-Wide Association Study (LWAS), and from disease to phenotypes and genes using the Monarch Initiative database.

In a series of case studies, the group’s recent publication demonstrates the advantage of LWAS in providing access to the relevant biomedical knowledge for any particular gene-disease combination, thus moving away from black box approaches. Experts can interpret potential biological mechanisms from the biomedical knowledge provided. The machine-readable format of the LWAS dataset is compliant with FAIR Data Publishing recommendations and is part of a software tool (knowledge.bio) that facilitates data integration, use and reuse. Integration of the gene-disease association data source into the RD-Connect platform is currently underway.

On the molecular level, the team has created workflows to analyse and integrate different types of -omics data (e.g. metabolomic, transcriptomic, proteomic) that will help identify biomarkers to monitor disease progression and severity, which might improve target identification and drug discovery. In a recent publication, the team showed that they can extract disease signatures common to blood and brain tissue from controls and Huntington’s disease patients, which could eventually be used as biomarkers to monitor the disease state in the brain or the efficacy of medication. The workflows supporting this type of analysis will be linked as downstream analysis options from within the RD-Connect platform, to facilitate biomarker discovery even when tissue samples are scarce. The BioSemantics group is developing more tools that will be integrated into the platform, and will link pathways from FAIR-compliant resources such as the WikiPathways example for Huntington’s disease (see publication).

---

Alamut® Functional Annotations

Interactive Biosoftware, Rouen, France 

Alamut® Functional Annotations (ALFA) is a gene regulation prediction software tool designed to identify genomic variations located in non-coding DNA that may be involved in gene regulation. ALFA compiles state-of-the-art annotations of DNA regulatory elements (such as transcription factor binding sites and microRNA target sites) for a better understanding of the biological processes underlying rare diseases. A new update of ALFA in the RD-Connect platform will be released in May.

Interactive Biosoftware develops software applications for human variation interpretation that allow geneticists and researchers to save time and improve variant assessment. The Alamut® Suite includes powerful bioinformatics applications for genetic analysis, covering variant annotation (Alamut® Batch), filtration (Alamut® Focus), and interpretation and reporting (Alamut® Visual).

---

Patient archive

Garvan Institute of Medical Research, Sydney, Australia


Patient Archive (PA) is a platform developed by the RD-Connect partners at the Garvan Institute of Medical Research, as part of the Monarch Initiative. It translates deep clinical phenotyping and genome-scale biology to patient-centered human disease pathogenesis. The detailed phenotype profile created in PA, combined with data, helps identify the causes of the disease and supports a clear diagnosis and prognosis. Unlike other existing phenotyping platforms, PA creates a patient clinical phenotype profile by automatically extracting Human Phenotype Ontology (HPO) concepts from free-text clinical records or the labels of uploaded images. The HPO terms are then used for patient matchmaking, disorder prediction or gene list generation. Clinical data exchange is protected with secure access control and is fully integrated into the Global Alliance for Genomics and Health MatchMaker Exchange program. Patient Archive helps to harmonize phenomic information for translational and clinical use.

---

Electronic Pharmacogenomics Assistant

University of Patras, Greece

The electronic Pharmacogenomics Assistant (ePGA) is an open source tool envisaged as a web-based system that offers two main services: (1) explore – search and browse through established pharmacogenomic gene-variant-drug-metabolizer status associations; (2) translate – infer metabolizing phenotypes from individual genotype profiles for all known pharmacogenes.

A machine-learning methodology (decision-tree induction) is used to induce generalized pharmacogenomic translation models from known haplotype tables. These models can infer the metabolizer status of individuals from their genotype profiles.
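To make the idea of decision-tree induction concrete, here is a minimal sketch in plain Python. It is not the ePGA implementation: the ID3-style splitting rule, the variant names (`var_A`, `var_B`) and the whole haplotype table are assumptions made up for illustration, and carry no pharmacogenomic meaning.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_feature(rows, labels, features):
    """Pick the feature whose split leaves the lowest weighted entropy."""
    def remainder(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(features, key=remainder)


def induce(rows, labels, features):
    """Return a label (leaf) or a (feature, {value: subtree}) tuple."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in {row[feature] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        branches[value] = induce(sub_rows, sub_labels, remaining)
    return (feature, branches)


def predict(tree, row):
    """Walk the tree; raises KeyError for a genotype never seen in training."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree


# Hypothetical haplotype table: genotype at two invented variants -> status.
rows = [
    {"var_A": "ref/ref", "var_B": "ref/ref"},
    {"var_A": "ref/alt", "var_B": "ref/ref"},
    {"var_A": "ref/ref", "var_B": "ref/alt"},
    {"var_A": "alt/alt", "var_B": "ref/ref"},
    {"var_A": "ref/alt", "var_B": "ref/alt"},
    {"var_A": "ref/ref", "var_B": "alt/alt"},
]
labels = ["normal", "intermediate", "intermediate", "poor", "poor", "poor"]

tree = induce(rows, labels, ["var_A", "var_B"])
status = predict(tree, {"var_A": "ref/alt", "var_B": "ref/alt"})
```

The induced tree reproduces every row of the toy table; a real translation model would be trained on curated haplotype tables and would need a policy for genotype combinations it has never seen.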

ePGA can be of benefit to health professionals, biomedical researchers and the general public, and may have a great impact in the clinic, even towards the use of a pharmacogenomics card/electronic health record. A new update of ePGA will be released in April. Read more.

---

Diseasecard: Rare Diseases Research Portal

University of Aveiro, Portugal

Diseasecard is a mature collaborative portal integrating and disseminating genetic and medical information on rare diseases. Its pioneering approach provides an overview of a comprehensive rare diseases network for researchers, clinicians and bioinformatics developers. Connecting over 20 different databases (including gene, protein, disease and drug databases), Diseasecard provides direct access to the most relevant scientific knowledge on a given disease through an interactive and easy-to-navigate web portal.

---

COEUS: Semantic Web Application Framework

University of Aveiro, Portugal

COEUS is a software application that speeds up the creation of semantic web-based knowledge management systems (systems using standardized data that can be processed by machines). It makes it easy to transform raw data into semantic data, e.g. to convert values (e.g. P51587, rs28897716, ...) into concepts (e.g. gene, variant, disease), and concepts into relations (e.g. variant is_associated_to disease). In a single package, COEUS provides the tools for data management, including advanced algorithms to integrate different data sources, such as spreadsheets and databases. Currently, COEUS is being explored to standardize and link several rare disease patient registries and biobanks, enabling distributed access.
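The value-to-concept-to-relation step can be pictured with a small sketch in plain Python. This is not the COEUS API: the `Triple` type, the `to_triples` helper, the prefixes and the `EXAMPLE_DISEASE` placeholder are all hypothetical, and which concept each raw value maps to is assumed here purely for illustration.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement, the basic unit of semantic data."""
    subject: str
    predicate: str
    obj: str


def to_triples(row: Dict[str, str]) -> List[Triple]:
    """Turn one raw record into typed concepts plus a relation between them."""
    gene = f"gene:{row['gene']}"
    variant = f"variant:{row['variant']}"
    disease = f"disease:{row['disease']}"
    return [
        # Values become concepts: each identifier is given a type.
        Triple(gene, "is_a", "concept:Gene"),
        Triple(variant, "is_a", "concept:Variant"),
        Triple(disease, "is_a", "concept:Disease"),
        # Concepts become relations.
        Triple(variant, "is_associated_to", disease),
    ]


# A raw spreadsheet-like row; the disease value is a placeholder, not real data.
raw_row = {"gene": "P51587", "variant": "rs28897716", "disease": "EXAMPLE_DISEASE"}
triples = to_triples(raw_row)
```

Once data from different spreadsheets and databases is in this shared statement form, the records can be linked and queried together.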

---

Scaleus: Semantic Web Services Integration for Biomedical Applications

University of Aveiro, Portugal

Scaleus is a data migration tool that can be used on top of traditional systems to enable semantic web features. This user-friendly tool helps users easily create new semantic web applications from scratch. Targeted at the biomedical domain, this web-based platform offers, in a single package, a high-performance database, data integration algorithms and optimized text searches over the indexed resources. Currently, this platform is being used to support semantic-based applications for cross-resource queries over traditional rare disease resources, including biobanks (biological sample data), patient registries, genomic data and public repositories of biological relations. Read the relevant publication.

---

Rare Disease Registry Framework

Centre for Comparative Genomics, Murdoch University, Australia

The Rare Disease Registry Framework (RDRF) is an open-source tool, which is unique in that it enables the dynamic creation of web-based patient registries with minimal software development. Utilising the RDRF, we have deployed national and international patient-driven and clinical registries including: the Myotubular and Centronuclear Myopathy Patient Registry, the Global Angelman Syndrome Registry, the Familial Hypercholesterolaemia Australasia Network Registry, and the Australian and New Zealand Neuromuscular Disorders Registries (Duchenne Muscular Dystrophy, Spinal Muscular Atrophy, and Myotonic Dystrophy Registries). The deployment of these registries involved the development of key new features, such as:

• patient registration and log-in, enabling patient-reported data entry combined with verification by clinicians;

• family linkage, to enable the tracking of cascade screening for familial hypercholesterolaemia;

• automatic notifications, enabling emails to be sent from the RDRF;

• ‘context’, which enables enhanced capture of longitudinal data through ‘multi-forms’.

New features and enhancements continue to be deployed within the RDRF, with all registries able to benefit from continued development.

---

Project News

---

New features and data in the Genomics Platform

Version 0.10 of the RD-Connect Genomics Platform user interface has been released hand-in-hand with a large new data release. The variant calling pipeline has been updated to Genome Analysis Toolkit (GATK) 3.6, and the annotation pipeline has also been updated to provide new annotations, including HGVS nomenclature. A variety of newly incorporated technical tweaks make the platform faster and more efficient, while new tools enable new types of analysis. Researchers can now filter results based on pre-calculated Runs of Homozygosity in exome data, to narrow down the regions of interest to variants falling only in such runs. This is particularly useful for consanguineous cases. Another useful new feature is the “share” button, which lets users save the URL with the results for future reference or share them with collaborators. Incorporated annotations from ClinVar will allow variants that are found in ClinVar to be rapidly identified among filtered results. Four new predefined lists of candidate genes have been added: the 59 medically actionable genes identified by the American College of Medical Genetics; 889 genes from the BabySeq project strongly indicated to cause highly penetrant childhood disorders and/or be actionable during childhood; 1158 genes for proteins that localise to the mitochondria (MitoCarta v2.0), thus representing good candidates for genes where mitochondrial dysfunction is implicated in disease; and the "Medically Interpretable Genome" of more than 5,000 genes that have been described in the context of human diseases.
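The Runs of Homozygosity filter amounts to keeping only the variants whose position falls inside a pre-calculated run. A minimal sketch of that interval check in plain Python follows; it is not the platform's code, and the data layout, function names and coordinates are all invented for illustration.

```python
from typing import Dict, List, Tuple

Variant = Tuple[str, int]                 # (chromosome, 1-based position)
Runs = Dict[str, List[Tuple[int, int]]]   # chromosome -> inclusive (start, end) runs


def in_run(variant: Variant, runs: Runs) -> bool:
    """True if the variant lies inside any run on its chromosome."""
    chrom, pos = variant
    return any(start <= pos <= end for start, end in runs.get(chrom, []))


def filter_to_runs(variants: List[Variant], runs: Runs) -> List[Variant]:
    """Keep only the variants that fall within a run of homozygosity."""
    return [v for v in variants if in_run(v, runs)]


# Made-up runs and variant positions.
runs = {"1": [(1_000_000, 2_500_000)], "7": [(500_000, 900_000)]}
variants = [("1", 1_200_000), ("1", 3_000_000), ("7", 900_000), ("X", 10_000)]

kept = filter_to_runs(variants, runs)
```

Here `kept` holds the two variants inside a run, on chromosomes 1 and 7; the others are dropped. A linear scan per variant is enough for a sketch, whereas an interval index would be the natural choice at exome scale.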

The RD-Connect Genomics Platform now includes data from 1,466 samples, including the first three datasets from the BBMRI-LPC exome sequencing project. It is expected that at least 7 more projects will be completely processed and their data made available within the next month, which will bring the number of samples in the platform to approximately 2,000. Similar progress is expected throughout 2017. The Platform is particularly rich in exomes of neuromuscular and other neurogenetic disorders, increasing the likelihood of identifying a "2nd family" and validating a new disease gene.

---

The 3rd IRDiRC Conference: working towards new rare disease research goals for the next decade

The 3rd International Rare Diseases Research Consortium (IRDiRC) Conference (8-9th February, Paris) hosted around 300 participants: researchers, academics, industry leaders, policy makers and patient advocates from around the world. The conference was a unique opportunity for the attendees to reflect on the progress made in the last decade and to look forward to the challenges ahead.

Three plenary sessions opening the conference presented the IRDiRC history, providing a global view of scientific achievements in rare diseases, such as reaching the IRDiRC goal of 200 new therapies, set for 2020, already in 2017. The speakers introduced the state of foundational, diagnostic, and therapeutic rare diseases research. In their talks, RD-Connect was repeatedly mentioned as a recognized resource for rare disease research.

The parallel sessions that followed further investigated the state of foundational, diagnostic, and therapeutic rare diseases research, as well as success stories in each space. The sessions highlighted the innovative contributions of young investigators and new approaches to rare diseases, and explored trends in regulation and access, patient advocacy, and industry. Patient engagement was repeatedly mentioned as crucial for initiating and catalysing work on rare diseases, from diagnosis to drug development and policy.

The closing plenary session outlined the prospects and the next set of goals for IRDiRC and the rare diseases research community, and discussed the challenges, such as the increasing costs of therapies for rare diseases. The following panel discussion, with active participation of the audience, further shaped the vision and objectives for the next decade.

---

Lucia Monaco receives the EURORDIS Scientific Award

Dr Lucia Monaco, Chief Scientific Officer at Fondazione Telethon and leader of RD-Connect's biobanking activities, has received the EURORDIS Scientific Award 2017 for cutting edge scientific developments and great impact on rare diseases.

Dr Lucia Monaco graduated in chemistry in Pavia, Italy, and was trained in biochemistry in Iowa City, USA, and in molecular biology at the European Molecular Biology Laboratory in Heidelberg, Germany. As the Chief Scientific Officer at Fondazione Telethon, she has supported international rare disease research, largely through her engagement in IRDiRC.

Dr Monaco has developed Fondazione Telethon’s Rare Disease Programme and shaped the key research infrastructure for rare diseases, especially those related to biobanking and data sharing via EuroBioBank and RD-Connect.

Her enthusiasm and dedication inspire scientists and clinicians to get involved in rare diseases research. RD-Connect congratulates Lucia on her achievements and the prestigious Award!

---

RD-Connect is on Facebook!

The freshly launched RD-Connect Facebook page will complement the RD-Connect profile on Twitter (@ConnectRD). It will create an opportunity for RD-Connect to better engage with patient communities and other rare disease stakeholders active on Facebook.

Follow us on Facebook!

---

EURORDIS & RD-Connect webinar on the General Data Protection Regulation

On 2nd February, EURORDIS hosted a webinar on the legal and ethical impacts of the General Data Protection Regulation (GDPR) on data sharing in rare disease research. In the webinar, Dr Petra Wilson, Director at the Digital Health and Care Institute, Scotland, explains the legal impacts of the new data protection regulation on sharing data across borders for virtual healthcare consultations in the future European Reference Networks. Next, Dr Deborah Mascalzoni, bioethicist at Uppsala University, Sweden, and member of RD-Connect, focuses on the ethical aspects of data sharing in European research projects for rare diseases.

Watch the webinar on the EURORDIS TV channel.

---

Interview with the RD-Connect Coordinator

Horizon, the EU Research & Innovation Magazine, has just published an interview with Prof. Hanns Lochmüller, the coordinator of RD-Connect. He discusses how research funders, by requiring data generated in research projects to be made accessible for reuse, can increase data sharing and so accelerate diagnosis and treatment discovery for rare diseases. Read the interview here.

---

Announcements

---

Rare Diseases Registries Workshop in Madrid

The Centre for Biomedical Network Research on Rare Diseases (CIBERER) at the Instituto de Salud Carlos III is organizing a Rare Diseases Registries Workshop on the 21st-22nd March 2017 in Madrid, Spain. The workshop will bring together key stakeholders in the development of rare disease registries, who will share the results from several EU Health Programme projects and Joint Actions funded between 2008 and 2015. The event is supported by the Consumer, Health, Agriculture and Food Executive Agency (CHAFEA).

For additional information and registration, visit the workshop’s website.

---

ELIXIR Rare Diseases Training capacity and needs survey

A survey prepared by ELIXIR Slovenia within ELIXIR-EXCELERATE aims to gather information on the training capacity and needs of the rare disease community. The survey is based on the ELIXIR Training platform general survey, with some questions specific to rare diseases. Extended deadline: 8th of March 2017.

Complete the survey here.

---

Vacancy: Full Professor to lead the Department of Bioinformatics

Radboud University Medical Center (Radboudumc) has a vacancy for a professor in Bioinformatics to lead its Bioinformatics Department, known as the Center for Molecular and Biomolecular Informatics (CMBI). The new professor will lead the scientific programs of the center, nationally and internationally. The CMBI currently encompasses research lines in Comparative Genomics, Bacterial Genomics, Computational Discovery and Design, and Protein Structure Bioinformatics. The successful applicant will also contribute to the teaching of bioinformatics in the faculties of medicine and science. The ideal candidate combines leadership skills with scientific expertise in any of the branches of bioinformatics relevant to biomedicine. Closing date: 1st of March.

For details, see the vacancy description.

---

Why did I get this email?

You received this email because you are associated with RD-Connect, EURenOmics or NeurOmics or because you signed up online. We will send you one email per month with news relevant to these projects and to IRDiRC. If you don't want to receive any further newsletters, you can unsubscribe below. If you're reading this online or if it was forwarded by a friend, you can sign up to future editions here.