A Biologically-Rich Approach to Identifying Pharmacogenomic Relations in Text
Brown Bag Lecture by Dr. Bastien Rance | 7/12/2011 11AM-12PM | 7th Floor Conference Room, Bldg 38A
Abstract: Objectives. In this talk I present our work on extracting pharmacogenomic relations from the biomedical literature. I define the notion of biologically-rich pharmacogenomic relation extraction and evaluate our approach against reference pharmacogenomic relations and against alternative approaches.
Methods. From a corpus of MEDLINE articles relevant to genetic variation, we identify co-occurrences between drug mentions extracted using MetaMap and RxNorm, and genetic variants extracted by EMU.
Our results are evaluated against similar reference relations curated manually in PharmGKB and against the results of an NLP-rich approach.
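As a rough illustration of the co-occurrence step, the Python sketch below pairs drug and variant mentions found in the same article. It assumes the mentions have already been extracted and stored per article. The names `drug_mentions`, `variant_mentions`, and `cooccurrences`, the article identifiers, and the toy data are all hypothetical. Article-level granularity is also an assumption, since the abstract does not say at which level co-occurrence is computed. This is not the MetaMap, RxNorm, or EMU interface.

```python
from itertools import product


def cooccurrences(drug_mentions, variant_mentions):
    """Pair every drug with every variant mentioned in the same article.

    Both arguments map an article ID to a set of mention strings
    (hypothetical, pre-extracted input). Articles lacking either kind
    of mention are skipped.
    """
    pairs = {}
    # Only articles present in both mappings can contain a co-occurrence.
    for article_id in drug_mentions.keys() & variant_mentions.keys():
        drugs = drug_mentions[article_id]
        variants = variant_mentions[article_id]
        if drugs and variants:
            pairs[article_id] = set(product(drugs, variants))
    return pairs


# Toy data for illustration only; not taken from the study.
drug_mentions = {"A1": {"warfarin"}, "A2": {"clopidogrel"}, "A3": set()}
variant_mentions = {"A1": {"rs9923231", "rs1057910"}, "A3": {"rs4244285"}}

result = cooccurrences(drug_mentions, variant_mentions)
```

In this toy input only article A1 mentions both a drug and a variant, so it is the only one that yields candidate pairs.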
Results. One crucial aspect of our strategy is the use of biological knowledge for identifying specific genetic variants in text, not simply gene mentions. On the 104 reference articles from PharmGKB, the recall of our biologically-rich approach is 33%, similar to that of the NLP-rich approach (35%). Applied to the whole MEDLINE dataset, the NLP-rich approach yielded 19,978 articles, while our approach identified 4,833 articles. The overlap between the two approaches is limited (224 articles).
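The recall and overlap figures above can be read as set operations over article identifiers. The sketch below assumes article-level evaluation, which the abstract does not state explicitly. The function name `recall`, the variable names, and the toy identifiers are hypothetical and do not reproduce the study's data.

```python
def recall(retrieved, reference):
    """Fraction of reference articles that an approach retrieved."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(retrieved & reference) / len(reference)


# Toy article IDs for illustration only; not the study's data.
reference = {"A1", "A2", "A3", "A4"}   # manually curated articles
bio_rich = {"A1", "A5"}                # found by the biologically-rich approach
nlp_rich = {"A2", "A5", "A6"}          # found by the NLP-rich approach

bio_recall = recall(bio_rich, reference)
nlp_recall = recall(nlp_rich, reference)
overlap = bio_rich & nlp_rich
combined_recall = recall(bio_rich | nlp_rich, reference)
```

When two approaches have similar recall but little overlap, as in this toy case, their union covers more of the reference set than either does alone.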
Conclusions. We show that biologically-rich and NLP-rich approaches are complementary. Rather than a solution for the automatic curation of pharmacogenomic knowledge, we see these high-throughput approaches as tools to assist biocurators in the identification of pharmacogenomic relations of interest from the published literature.
Bio: Dr. Bastien Rance earned his PhD in Computer Science from Paris-Sud University in September 2009. He graduated from Paris-Sud University with a Master's degree in Bioinformatics and Biostatistics in 2005. He worked for one year in the Clinical Research Unit of the European Hospital Georges Pompidou in Paris and started his postdoc in the Medical Ontology Research group in September 2010. His mentor is Dr. Olivier Bodenreider from the Cognitive Sciences branch of the Lister Hill National Center for Biomedical Communications.