Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts
Karin M Verspoor, Go Eun Heo, Keun Young Kang, Min Song
BMC Medical Informatics and Decision Making | BIOMED CENTRAL LTD | Published: 2016
BACKGROUND: The Variome corpus, a small collection of published articles about inherited colorectal cancer, includes annotations of 11 entity types and 13 relation types related to the curation of the relationship between genetic variation and disease. Due to the richness of these annotations, the corpus provides a good testbed for evaluation of biomedical literature information extraction systems. METHODS: In this paper, we focus on assessing performance on extracting the relations in the corpus, using gold standard entities as a starting point, to establish a baseline for extraction of relations important for extraction of genetic variant information from the literature. We test the applic...
This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (NRF-2012M3C4A7033342). KMV was supported by the Australian Research Council under project DP150101550.