Ontologies in rare disease registries

From a lay perspective, ontologies are essentially terminology lists in a hierarchical/tree structure, often with annotations and links out to other information.

Ontologies express concepts in standardized terms, so that data stored in different systems can be linked and compared. The Orphanet Rare Disease Ontology (ORDO) and the Human Phenotype Ontology (HPO) are considered the most relevant ontologies for rare disease research and, more specifically, for RD-Connect. ORDO is used for “naming” diseases, e.g. “autosomal recessive limb girdle muscular dystrophy type 1”, while HPO is used for describing the clinical phenotype observed in a patient, e.g. “muscle weakness”.
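To make the idea of a hierarchical term list concrete, here is a minimal sketch in Python. The term identifiers (`EX:0000` and so on), the labels and the hierarchy are invented for illustration and are not real HPO or ORDO content. The sketch shows how two records coded with different specific terms can still be linked through the broader terms they share.

```python
# Toy ontology: each term ID maps to a label and the IDs of its parent terms.
# All IDs and relationships here are invented placeholders, not real HPO/ORDO data.
TOY_ONTOLOGY = {
    "EX:0000": {"label": "phenotypic abnormality", "parents": []},
    "EX:0001": {"label": "abnormality of the musculature", "parents": ["EX:0000"]},
    "EX:0002": {"label": "muscle weakness", "parents": ["EX:0001"]},
    "EX:0003": {"label": "proximal muscle weakness", "parents": ["EX:0002"]},
    "EX:0004": {"label": "distal muscle weakness", "parents": ["EX:0002"]},
}


def ancestors(term_id, ontology=TOY_ONTOLOGY):
    """Return the set of all ancestor term IDs of term_id (excluding itself)."""
    found = set()
    stack = list(ontology[term_id]["parents"])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(ontology[current]["parents"])
    return found


def shared_ancestors(term_a, term_b, ontology=TOY_ONTOLOGY):
    """Return the terms both inputs fall under, counting each input as its own ancestor."""
    return (ancestors(term_a, ontology) | {term_a}) & (ancestors(term_b, ontology) | {term_b})


# Two registries record different specific terms for their patients.
common = shared_ancestors("EX:0003", "EX:0004")
for term_id in sorted(common):
    print(term_id, TOY_ONTOLOGY[term_id]["label"])
```

Because both specific terms sit under the same broader term, a query for the broader term would retrieve both records, which free-text notes could not guarantee.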

Why are ontologies important?

It is increasingly recognized that advances in sequencing technology do not replace the need for detailed clinical analysis of patients with rare diseases. On the contrary, deep phenotyping is more important than ever for interpreting whole exome and genome sequencing results. However, where clinical notes are held on paper in hospitals or as free text in electronic systems, computers cannot use them for analysis.

Phenotype ontologies aim to standardize the collection of phenotypic data to make them accessible to computer analysis. Phenotype definition is one of the most important and, at the same time, most difficult activities in clinical practice. Doctors’ observations of patients’ symptoms are often described imprecisely in medical publications. Accurate, standardized descriptions of phenotypic features, clinical course, laboratory data and molecular genetic findings are needed to make use of existing data as well as information yet to come. Phenotype ontologies make it possible to standardize signs, symptoms, classifications and complete clinical phenotypes. They are also helpful resources for checking associations between symptoms and laboratory data. In addition, phenotype ontologies enable interoperability between registries and other resources, such as biobanks or omics databases. For undiagnosed cases, ontologies enable computerized “patient matchmaking” between patients with similar symptoms, which can help research clinicians confirm a genetic diagnosis.
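As a rough illustration of “patient matchmaking”, the following Python sketch ranks patients by how much their coded phenotype terms overlap with those of a query patient. It uses simple Jaccard similarity over sets of term identifiers, which is an assumption made for brevity; real matchmaking tools typically use more sophisticated measures that also take the ontology hierarchy into account. The patient names and term IDs are invented placeholders.

```python
def jaccard(terms_a, terms_b):
    """Jaccard similarity of two collections of term IDs (0.0 if both are empty)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_matches(query_terms, patients):
    """Return (patient_id, score) pairs sorted from most to least similar."""
    scored = [(pid, jaccard(query_terms, terms)) for pid, terms in patients.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical registry content: patient ID -> set of coded phenotype terms.
patients = {
    "patient_A": {"EX:0002", "EX:0010", "EX:0011"},
    "patient_B": {"EX:0002", "EX:0010"},
    "patient_C": {"EX:0020"},
}

# Phenotype terms recorded for an undiagnosed query patient.
query = {"EX:0002", "EX:0010"}

ranking = rank_matches(query, patients)
for patient_id, score in ranking:
    print(f"{patient_id}: {score:.2f}")
```

This kind of comparison only works because every registry records the same symptom under the same identifier; with free-text notes, the sets could not be intersected reliably.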

Read more: EURORDIS: Does Your Rare Disease Have a Code?

The Human Phenotype Ontology (HPO) provides a structured and controlled vocabulary for the phenotypic features (e.g. disease symptoms) encountered in human hereditary and other diseases. The HPO itself does not describe individual disease entities but, rather, the phenotypic abnormalities associated with them.

HPO is currently being developed mainly using information from OMIM, as well as from other medical information sources such as ORPHANET or DECIPHER. The combination of HPO with the ORPHANET disease classification represents a promising resource for automated rare disease classification.

Several tools, such as Phenotips, PhenomeCentral and Phenomizer, have been developed to help clinicians use HPO for diagnosis and other purposes:

  • Phenotips is an open source software tool for collecting and analysing phenotypic information for patients with genetic disorders. Installed locally, it is useful for a clinician or a closely collaborating group managing its own database.
  • PhenomeCentral is a centralized repository for secure data sharing, aimed at clinicians and scientists working in the rare diseases community. It allows researchers to share complex phenotypes with other researchers worldwide and to contact one another by email.
  • Phenomizer is a web resource that produces a list of possible diagnoses ranked by probability, which physicians can use as part of the diagnostic workup.

The Orphanet Rare Disease Ontology (ORDO) is an open-access ontology developed from the Orphanet information system. It enables complex queries on rare disorders, their epidemiological data (age of onset, prevalence, mode of inheritance) and gene-disorder functional relationships.

Within RD-Connect, the ORDO is recommended as the primary “disease nomenclature” ontology for precise “naming” of a specific rare disease. ORDO also represents the relationship between the disorders and their genetic cause (if known), the mode of inheritance and associated epidemiological data (age of onset, age of death, prevalence).

The Evidence Code Ontology (ECO) is used to encode the origin of assertions made in the ORDO. The ORDO also provides disease cross-references to the International Classification of Diseases (ICD-10), SNOMED CT (SNOMED Clinical Terms), MeSH (Medical Subject Headings), MedDRA (Medical Dictionary for Regulatory Activities), OMIM (Online Mendelian Inheritance in Man) and UMLS (Unified Medical Language System). Genes are cross-referenced to HGNC (HUGO Gene Nomenclature Committee), UniProt (Universal Protein Resource), OMIM, Ensembl, Reactome and Genatlas.
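Cross-references of this kind can be thought of as a lookup table from codes in other systems to a single disease entry. The Python sketch below shows one possible way to model that, assuming a simple in-memory structure; the class name `DiseaseEntry`, the helper `build_reverse_index` and every identifier and code in the example are hypothetical placeholders, not real ORDO, ICD-10 or OMIM values, and this is not how ORDO itself is distributed.

```python
from dataclasses import dataclass, field


@dataclass
class DiseaseEntry:
    """One disease with its cross-references to other coding systems."""
    ordo_id: str
    label: str
    # Maps a coding system name (e.g. "ICD-10") to the list of codes in that system.
    xrefs: dict = field(default_factory=dict)


def build_reverse_index(entries):
    """Map each (coding system, code) pair back to the disease entry that lists it."""
    index = {}
    for entry in entries:
        for system, codes in entry.xrefs.items():
            for code in codes:
                index[(system, code)] = entry
    return index


# Placeholder entries; none of these identifiers or codes are real.
entries = [
    DiseaseEntry("ORDO:EXAMPLE1", "example disorder one",
                 {"ICD-10": ["X00.0"], "OMIM": ["000001"]}),
    DiseaseEntry("ORDO:EXAMPLE2", "example disorder two",
                 {"ICD-10": ["X00.1"]}),
]

index = build_reverse_index(entries)

# A registry that stores only an OMIM-style code can be mapped to the disease entry.
hit = index.get(("OMIM", "000001"))
print(hit.ordo_id if hit else "no match")
```

With such an index, records coded in different systems can be brought together under one disease name, which is the practical benefit of the cross-references described above.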

The ORDO and Human Phenotype Ontology (HPO) developers are working on integrating both ontologies and annotating Orphanet’s phenome types with appropriate HPO terms (see also Monarch Initiative). This will provide interoperability between projects such as RD-Connect and DECIPHER, which use the HPO, and will drive the revision of the phenome hierarchy once HPO terms have been integrated.
