Anaphoric relations in the clinical narrative: corpus creation
Guergana K Savova, Wendy W Chapman, Jiaping Zheng, Rebecca S Crowley
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION | OXFORD UNIV PRESS | Published: 2011
OBJECTIVE: The long-term goal of this work is the automated discovery of anaphoric relations from the clinical narrative. The creation of a gold standard set from a cross-institutional corpus of clinical notes and high-level characteristics of that gold standard are described. METHODS: A standard methodology for annotation guideline development, gold standard annotations, and inter-annotator agreement (IAA) was used. RESULTS: The gold standard annotations resulted in 7214 markables, 5992 pairs, and 1304 chains. Each report averaged 40 anaphoric markables, 33 pairs, and seven chains. The overall IAA is high on the Mayo dataset (0.6607), and moderate on the University of Pittsburgh Medical Cen...
Awarded by NATIONAL CANCER INSTITUTE
Awarded by NATIONAL CENTER FOR ADVANCING TRANSLATIONAL SCIENCES
The work was funded by grant R01 CA127979.