Journal article, Studies in Health Technology and Informatics, 2010

Identification of relations between risk factors and their pathologies or health conditions by mining scientific literature.

Abstract

The discovery and prevention of risk factors is an active research field within the biomedical domain. Although abundant information on risk factors exists in bibliographical databases and on several websites, accessing it can be difficult. Methods from Natural Language Processing and Information Extraction can make it more accessible. Specifically, we present a procedure for analyzing massive amounts of scientific literature and detecting linguistically marked associations between pathologies and risk factors. This approach allowed us to extract over 22,000 risk factors and their associated pathologies. The evaluations showed that (1) over 88% of the risk factors extracted for coronary heart disease are correct, (2) the associated pathologies, when they could be compared to MeSH indexing, are correct in about 70% of cases, and (3) links between risk factors and their pathologies are seldom recorded in existing terminologies.
Embargoed file (will never be visible)

Dates and versions

pasteur-00606238, version 1 (05-07-2011)

Cite

Thierry Hamon, Martin Graña, Víctor Raggio, Natalia Grabar, Hugo Naya. Identification of relations between risk factors and their pathologies or health conditions by mining scientific literature. Studies in Health Technology and Informatics, 2010, 160 (Pt 2), pp.964-8. ⟨10.1111/j.1567-1364.2008.00361.x⟩. ⟨pasteur-00606238⟩