A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

Karin Verspoor, Kevin Bretonnel Cohen, Arrick Lanfranchi, Colin Warner, Helen L Johnson, Christophe Roeder, Jinho D Choi, Christopher Funk, Yuriy Malenkiy, Miriam Eckert, Nianwen Xue, William A Baumgartner, Michael Bada, Martha Palmer, Lawrence E Hunter

BMC Bioinformatics | BMC | Published: 2012


This work was supported by NIH grants R01LM009254, R01GM083649, and R01LM008111 to Lawrence E. Hunter, and in part by NIH/NCRR Colorado CTSI Grant Number UL1 RR025780. We gratefully acknowledge the important work of our syntactic annotation team, supervised by Martha Palmer: Arrick Lanfranchi, Colin Warner, Amanda Howard, Tim O'Gorman, Kevin Gould, and Michael Regan. We also greatly appreciate the assistance of Bob Leaman, David McClosky, Spence Green, and Christopher Manning with questions that arose while working with their tools, and of David Weitzenkamp, who assisted with the statistical analyses. We also thank the anonymous reviewers for their meaningful feedback on the manuscript.