Bioinformatic tools

To help researchers process, analyse and interpret data, RD-Connect has developed a number of bioinformatic tools, available both within the RD-Connect platform and as stand-alone tools.

RD-Connect members, often in collaboration with external partners, have developed several bioinformatic tools to help researchers analyse omics data and identify targets for potential therapies. These include variant analysis and annotation tools as well as therapeutic prediction tools and gene-drug interaction resources, many of which are integrated in the Genome-Phenome Analysis Platform.

The work on developing the bioinformatic tools within RD-Connect is led by Aix-Marseille University Medical School (AMU), France.

Bioinformatic tools developed within RD-Connect

The tool identifies genomic variations located in non-coding DNA that may be involved in gene regulation. ALFA compiles the current state of the art in the annotation of DNA regulatory elements (such as transcription factor binding sites and microRNA target sites) to support a better understanding of the biological processes underlying rare diseases.

Developed by: Interactive Biosoftware

COEUS is a software application that speeds up the creation of semantic web-based knowledge management systems (systems using standardized data that can be processed by machines). It makes it easy to transform raw data into semantic data, e.g. to convert values (e.g. rs28897716) into concepts (e.g. gene, variant, disease), and concepts into relations (e.g. variant is_associated_to disease). In a single package, COEUS provides the tools for data management, including advanced algorithms to integrate different data sources, such as spreadsheets and databases. Currently, COEUS is being explored to standardize and link several rare disease patient registries and biobanks, enabling distributed access.

Developed by: University of Aveiro
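
The value-to-concept-to-relation mapping described for COEUS can be illustrated with a minimal Python sketch. It does not use COEUS itself: the function name `row_to_triples`, the column names and the disease label are hypothetical, and only the `rs28897716` identifier and the `is_associated_to` relation come from the description above.

```python
from typing import Dict, List, Tuple

# A semantic statement as a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]


def row_to_triples(row: Dict[str, str]) -> List[Triple]:
    """Turn one raw record (e.g. a spreadsheet row) into triples.

    Hypothetical helper: the column names 'variant_id' and 'disease'
    are assumptions made for this illustration.
    """
    variant = f"variant:{row['variant_id']}"
    disease = f"disease:{row['disease']}"
    return [
        (variant, "is_a", "Variant"),            # value -> concept
        (disease, "is_a", "Disease"),            # value -> concept
        (variant, "is_associated_to", disease),  # concept -> relation
    ]


# Illustrative input; the disease label is a placeholder, not real data.
raw_row = {"variant_id": "rs28897716", "disease": "example_disease"}
triples = row_to_triples(raw_row)
for triple in triples:
    print(triple)
```

Keeping each statement as a plain triple is what lets records from different sources, such as spreadsheets and databases, be merged into one graph and queried together.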

The crosslinkWGCNA web tool provides a user-friendly interface to the popular Weighted Gene Co-expression Network Analysis (WGCNA) R package. Users are able to upload their data sets and subsequently apply the standard workflow steps using a project-based structure and an easy to navigate interface. Along the way, the tool provides multiple familiar visualizations such as the hierarchically clustered genes with color-coded clusters, a similar tree for the samples and sample traits, module-trait correlations, and more. In addition, the tool allows for integrating multiple WGCNA projects using a correlation-based approach. Both data and visualizations can be exported.

Developed by: Leiden University Medical Center
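
The weighted co-expression network behind the WGCNA workflow can be sketched in a few lines of plain Python. The example below computes an unsigned adjacency, where the absolute Pearson correlation between two genes is raised to a soft-thresholding power. It is not code from crosslinkWGCNA or the WGCNA R package; the function names, the expression values and the power of 6 are assumptions chosen for illustration.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def adjacency(expr: Dict[str, List[float]],
              beta: float = 6.0) -> Dict[Tuple[str, str], float]:
    """Unsigned soft-threshold adjacency: |cor(gene_i, gene_j)| ** beta."""
    genes = sorted(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


# Made-up expression values for three genes across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
network = adjacency(expression)
for pair, weight in network.items():
    print(pair, round(weight, 6))
```

Raising correlations to a power keeps strong co-expression links close to 1 while pushing weak ones towards 0, which is what makes the subsequent clustering of genes into modules less sensitive to noise.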

Diseasecard is a mature collaborative portal integrating and disseminating genetic and medical information on rare diseases. Its pioneering approach provides an overview of a comprehensive rare disease network for researchers, clinicians and bioinformatics developers. Connecting over 20 different databases (including gene, protein, disease and drug databases), Diseasecard provides direct access to the most relevant scientific knowledge on a given disease, through an interactive and easy to navigate web portal.

Developed by: University of Aveiro

This open source tool, envisaged as a web-based system, offers two main services: (1) explore – search and browse through established pharmacogenomic gene-variant-drug-metabolizer status associations; (2) translate – infer metabolizing phenotypes from individual genotype profiles for all known pharmacogenes. A machine-learning methodology (decision-tree induction) is used to induce generalized pharmacogenomic translation models from known haplotype tables; these models can infer the metabolizer status of individuals from their genotype profiles. ePGA can benefit health professionals, biomedical researchers and the general public, and may have a great impact in the clinic, even towards the use of a pharmacogenomics card or electronic health record.

Developed by: University of Patras

Publications: Lakiotaki et al., 2016
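
The idea behind the translate service, going from a genotype profile to a metabolizer status, can be shown with a deliberately simplified Python sketch. It uses a direct table lookup in place of the decision-tree models that ePGA induces, and the allele labels and status assignments are invented placeholders, not real pharmacogenomic data.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical haplotype table for a single, unnamed gene.
# Keys are unordered allele pairs (diplotypes); values are statuses.
# These assignments are placeholders made up for this example.
HAPLOTYPE_TABLE: Dict[FrozenSet[str], str] = {
    frozenset({"*A"}): "normal metabolizer",              # *A/*A
    frozenset({"*A", "*B"}): "intermediate metabolizer",  # *A/*B
    frozenset({"*B"}): "poor metabolizer",                # *B/*B
}


def infer_status(allele_1: str, allele_2: str) -> Optional[str]:
    """Look up the status for a diplotype; None if it is not in the table."""
    return HAPLOTYPE_TABLE.get(frozenset({allele_1, allele_2}))


print(infer_status("*B", "*A"))
print(infer_status("*A", "*C"))
```

A plain lookup can only answer for diplotypes that are already listed. Inducing a generalized model from the table, as described above, is what allows a status to be inferred for genotype profiles that the table does not cover explicitly.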

The tool helps to study intronic and exonic mutations affecting pre-mRNA splicing signals: acceptor and donor splice sites, branch points, and auxiliary splicing signals such as Exonic Splicing Enhancers (ESE) and Exonic Splicing Silencers (ESS). The tool contains an expert system that assists the user in interpreting data. It also contains new matrices to evaluate the impact of mutations on rare donor splice site motifs.

Developed by: Aix-Marseille University Medical School

Publication: Desmet FO et al., Nucleic Acids Res, 2009

This data resource and workflow helps to move from gene to disease using literature mining technology in a Literature-Wide Association Study (LWAS), and from disease to phenotypes and genes using the Monarch Initiative database. LWAS provides access to the relevant biomedical knowledge of any particular gene-disease combination, thus moving away from black box approaches. Experts can interpret potential biological mechanisms from the provided biomedical knowledge. The machine-readable format of the LWAS dataset is compliant with FAIR Data Publishing recommendations and is part of software facilitating data integration, use and reuse. Integration of the gene-disease association data source into the RD-Connect platform is currently underway.

Developed by: Leiden University Medical Center

Patient Archive (PA) translates deep clinical phenotyping and genome-scale biology to patient-centered human disease pathogenesis. The detailed phenotype profile created in PA, combined with data, helps to identify the causes of the disease and to make a clear diagnosis and prognosis. Unlike other platforms, PA creates a patient clinical phenotype profile by automatic extraction of Human Phenotype Ontology (HPO) concepts from free-text clinical records or the labels of uploaded images. The HPO terms are then used for patient matchmaking, disorder prediction or gene list generation. Clinical data exchange is protected with secure access control and fully integrated into the GA4GH MatchMaker Exchange program. Patient Archive helps to harmonize phenomic information for translational and clinical use. PA was developed as part of the Monarch Initiative.

Developed by: Garvan Institute of Medical Research
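
Matching patients by their HPO terms can be illustrated with a small Python sketch. How Patient Archive actually scores matches is not described here, so the example substitutes a simple Jaccard overlap between term sets; the function names, patient identifiers and term sets are all made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of HPO terms two profiles have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_matches(query: Set[str],
                 patients: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Rank stored profiles by overlap with the query, best match first."""
    scored = [(pid, jaccard(query, terms)) for pid, terms in patients.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative profiles written as HPO-style identifiers.
query_profile = {"HP:0001250", "HP:0001263", "HP:0000252"}
stored_profiles = {
    "patient_1": {"HP:0001250", "HP:0001263"},
    "patient_2": {"HP:0000252", "HP:0004322"},
    "patient_3": {"HP:0000518"},
}
ranking = rank_matches(query_profile, stored_profiles)
for patient_id, score in ranking:
    print(patient_id, round(score, 3))
```

Plain set overlap treats every term as unrelated to every other. A matcher that used the ontology hierarchy could also give credit for closely related terms, which this sketch does not attempt.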

The Rare Disease Registry Framework (RDRF) is an open-source tool, which is unique in that it enables the dynamic creation of web-based patient registries with minimal software development. Utilising the RDRF, we have deployed national and international patient-driven and clinical registries including: the Myotubular and Centronuclear Myopathy Patient Registry, the Global Angelman Syndrome Registry, the Familial Hypercholesterolaemia Australasia Network Registry, and the Australian and New Zealand Neuromuscular Disorders Registries (Duchenne Muscular Dystrophy, Spinal Muscular Atrophy, and Myotonic Dystrophy Registries). The deployment of these registries involved the development of key new features, such as:

  • patient registration and log-in, enabling patient-reported data entry combined with verification by clinicians;
  • family linkage, to enable the tracking of cascade screening for familial hypercholesterolaemia;
  • automatic notifications, enabling emails to be sent from the RDRF;
  • ‘context’, which enables enhanced capture of longitudinal data through ‘multi-forms’.

New features and enhancements continue to be deployed within the RDRF, with all registries able to benefit from continued development.

Developed by: Centre for Comparative Genomics, Murdoch University

Scaleus is a data migration tool that can be used on top of traditional systems to enable semantic web features. This user-friendly tool helps users easily create new semantic web applications from scratch. Targeted at the biomedical domain, this web-based platform offers, in a single package, a high-performance database, data integration algorithms and optimized text searches over the indexed resources. Currently, this platform is being used to support semantic-based applications for cross-resource queries over traditional rare disease resources, including biobanks (biological sample data), patient registries, genomic data and public repositories of biological relations.

Developed by: University of Aveiro

Approximately half of the gene lesions responsible for human inherited diseases are due to an amino acid substitution. Distinguishing neutral sequence variations from those responsible for the phenotype is of major interest in human genetics. UMD-Predictor helps to analyse any nucleotide substitution in human cDNA and to differentiate neutral substitutions from pathogenic ones. To identify potentially pathogenic variations, the tool uses a combinatorial approach that associates the following data: localization within the protein, conservation, biochemical properties of the mutant and wild-type residues, and the potential impact of the variation on mRNA.

Developed by: Aix-Marseille University Medical School

Publications: Frederic MY et al., Human Mutation, 2009; Salgado D et al., Human Mutation, 2016

Identification of disease-causing mutations from high-throughput sequencing is a critical but complex process due to the large amount of data generated by these technologies. VarAFT is an all-in-one system with features designed to improve variant prioritization, and it is freely available to not-for-profit organizations. In addition, it provides a full graphical interface, allowing researchers and geneticists to easily annotate, filter and perform coverage analysis on Next Generation Sequencing (NGS) data without any programming knowledge and with limited hardware requirements.

Developed by: Aix-Marseille University Medical School