Conference Proceedings

Automatic classification of sentences to support Evidence Based Medicine

Su Nam Kim, David Martinez, Lawrence Cavedon, Lars Yencken

BMC Bioinformatics | BioMed Central Ltd | Published: 2011


AIM: Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels. METHOD: We constructed a corpus of 1,000 medical abstracts annotated by hand with specified medical categories (e.g. Intervention, Outcome). We explored the use of various features based on lexical, semantic, structural, and sequential information in the data, using Conditional Random Fields (CRF) for classification. RESULTS: For the classification tasks over all labels, our systems achieved micro-averaged F-scores of 80.9% and 66.9% over datasets of structured and unstructured abstracts respectively, using sequential features...


Funding Acknowledgements

Eric Huang built the Annotex tool that was used for the manual annotation. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.