Coreference Resolution for Structured Drug Product Labels

Kilicoglu H, Demner-Fushman D
Proceedings of BioNLP 2014. pp. 44-53.

FDA drug package inserts provide comprehensive and authoritative information about drugs. The DailyMed database is a repository of structured product labels extracted from these package inserts. Most salient information about drugs remains in the free-text portions of these labels. Extracting information from these portions can improve the safety and quality of drug prescription. In this paper, we present a study that focuses on the resolution of coreferential information in drug labels contained in DailyMed. We generalized and expanded an existing rule-based coreference resolution module for this purpose. Enhancements include resolution of set/instance anaphora, recognition of appositive constructions, and wider use of UMLS semantic knowledge. The results underscore the importance of set/instance anaphora and appositive constructions in this type of text and point out shortcomings in the coreference annotation of the dataset.