Journal article

Anaphoric relations in the clinical narrative: corpus creation

Guergana K Savova, Wendy W Chapman, Jiaping Zheng, Rebecca S Crowley

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION | OXFORD UNIV PRESS | Published: 2011

Abstract

OBJECTIVE: The long-term goal of this work is the automated discovery of anaphoric relations from the clinical narrative. The creation of a gold standard set from a cross-institutional corpus of clinical notes and high-level characteristics of that gold standard are described. METHODS: A standard methodology for annotation guideline development, gold standard annotations, and inter-annotator agreement (IAA) was used. RESULTS: The gold standard annotations resulted in 7214 markables, 5992 pairs, and 1304 chains. Each report averaged 40 anaphoric markables, 33 pairs, and seven chains. The overall IAA is high on the Mayo dataset (0.6607), and moderate on the University of Pittsburgh Medical Cen..
