Three minimal sequences found in Ebola virus genomes and absent from human DNA

Silva RM, Pratas D, Castro L, Pinho AJ, Ferreira PJ

Bioinformatics 31(15):2421-5, April 2015
DOI: 10.1093/bioinformatics/btv189


Motivation: Ebola virus causes high-mortality hemorrhagic fevers, with more than 25 000 cases and 10 000 deaths in the current outbreak. Only experimental therapies are available; thus, novel diagnostic tools and druggable targets are needed.

Results: Analysis of Ebola virus genomes from the current outbreak reveals the presence of short DNA sequences that appear nowhere in the human genome. We identify the shortest such sequences, which have lengths between 12 and 14. Only three absent sequences of length 12 exist, and they consistently appear at the same location on two of the Ebola virus proteins, in all Ebola virus genomes, but nowhere in the human genome. The alignment-free method used is able to identify pathogen-specific signatures for quick and precise action against infectious agents, of which the current Ebola virus outbreak provides a compelling example.
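
The idea of a sequence that is present in a pathogen genome but absent from a host genome can be illustrated with a small set-difference search over k-mers. The sketch below is not the authors' method or software; it is a naive illustration on toy strings. The function names (`kmers`, `shortest_absent_kmers`), the `max_k` cutoff, and the choice to check both strands of the host are assumptions made here for illustration, and an in-memory set would not scale to a full human genome.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k, both_strands=False):
    """Return the set of all length-k substrings of seq.

    If both_strands is True, the reverse complement of each k-mer is
    included as well (an assumption of this sketch, not a claim about
    the published analysis).
    """
    found = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    if both_strands:
        found |= {reverse_complement(word) for word in found}
    return found


def shortest_absent_kmers(pathogen, host, max_k=16):
    """Find the smallest k at which the pathogen has k-mers missing from the host.

    Returns (k, set_of_kmers), or (None, empty set) if no such k-mer
    exists up to max_k.
    """
    for k in range(1, max_k + 1):
        candidates = kmers(pathogen, k) - kmers(host, k, both_strands=True)
        if candidates:
            return k, candidates
    return None, set()


# Toy example: "GG" occurs in the pathogen string but on neither host strand.
k, words = shortest_absent_kmers("ACGGT", "ACGTACGT")
print(k, sorted(words))  # 2 ['GG']
```

On real data, the same comparison would be run with the Ebola virus genomes as `pathogen` and the human reference as `host`, using a memory-efficient k-mer index in place of Python sets.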
