Conference Proceedings

Pattern learning through distant supervision for extraction of protein-residue associations in the biomedical literature

KE Ravikumar, H Liu, JD Cohn, ME Wall, K Verspoor

IEEE | Published : 2011


We propose a method for automatically extracting protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids with the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method uses linguistic patterns to identify amino acid residue mentions in text. Further, we apply an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new automatically generated data set of high confidence protein-resid..
