TY - JOUR
T1 - A PubMed-wide associational study of infectious diseases
AU - Sintchenko, Vitali
AU - Anthony, Stephen
AU - Phan, Xuan Hieu
AU - Lin, Frank
AU - Coiera, Enrico W.
N1 - Copyright the Author(s) 2010. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2010/3
Y1 - 2010/3
N2 - Background: Computational discovery is playing an ever-greater role in supporting the processes of knowledge synthesis. A significant proportion of the more than 18 million manuscripts indexed in the PubMed database describe infectious disease syndromes and various infectious agents. This study is the first attempt to integrate online repositories of text-based publications and microbial genome databases in order to explore the dynamics of relationships between pathogens and infectious diseases. Methodology/Principal Findings: Herein we demonstrate how the knowledge space of infectious diseases can be computationally represented and quantified, and tracked over time. The knowledge space is explored by mapping of the infectious disease literature, looking at dynamics of literature deposition, zooming in from pathogen to genome level and searching for new associations. Syndromic signatures for different pathogens can be created to enable a new and clinically focussed reclassification of the microbial world. Examples of syndrome and pathogen networks illustrate how multilevel network representations of the relationships between infectious syndromes, pathogens and pathogen genomes can illuminate unexpected biological similarities in disease pathogenesis and epidemiology. Conclusions/Significance: This new approach based on text and data mining can support the discovery of previously hidden associations between diseases and microbial pathogens, clinically relevant reclassification of pathogenic microorganisms and accelerate the translational research enterprise.
AB - Background: Computational discovery is playing an ever-greater role in supporting the processes of knowledge synthesis. A significant proportion of the more than 18 million manuscripts indexed in the PubMed database describe infectious disease syndromes and various infectious agents. This study is the first attempt to integrate online repositories of text-based publications and microbial genome databases in order to explore the dynamics of relationships between pathogens and infectious diseases. Methodology/Principal Findings: Herein we demonstrate how the knowledge space of infectious diseases can be computationally represented and quantified, and tracked over time. The knowledge space is explored by mapping of the infectious disease literature, looking at dynamics of literature deposition, zooming in from pathogen to genome level and searching for new associations. Syndromic signatures for different pathogens can be created to enable a new and clinically focussed reclassification of the microbial world. Examples of syndrome and pathogen networks illustrate how multilevel network representations of the relationships between infectious syndromes, pathogens and pathogen genomes can illuminate unexpected biological similarities in disease pathogenesis and epidemiology. Conclusions/Significance: This new approach based on text and data mining can support the discovery of previously hidden associations between diseases and microbial pathogens, clinically relevant reclassification of pathogenic microorganisms and accelerate the translational research enterprise.
UR - http://www.scopus.com/inward/record.url?scp=77949766288&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/arc/LP0667531
U2 - 10.1371/journal.pone.0009535
DO - 10.1371/journal.pone.0009535
M3 - Article
C2 - 20224767
AN - SCOPUS:77949766288
SN - 1932-6203
VL - 5
SP - 1
EP - 12
JO - PLoS ONE
JF - PLoS ONE
IS - 3
M1 - e9535
ER -