Labeling Author Affiliations in Biomedical Articles Using Markov Model Classifiers.

Kim J, Hong S, Thoma GR
The 13th International Conference on Data Mining (DMIN2017), pp. 105-110, Las Vegas, USA, July 2017.

This paper proposes an automated labeling algorithm that extracts authors’ affiliation information (organization, city, country, etc.) from the citations in NLM’s MEDLINE® database. By comparing the number of publications from each organization or country, researchers and granting organizations can identify the most active research organizations or countries in specific fields. We are developing a system to collect and display such statistics from MEDLINE. Extracting the authors’ information from the affiliations in the citations is the key step in obtaining these statistics. The proposed labeling algorithm divides an affiliation into several pieces and assigns each piece one of seven labels describing the authors’ affiliation information. The algorithm adapts the Stanford CoreNLP tool, a Markov Model (MM), and the Viterbi algorithm. In experiments, the proposed algorithm achieves 95.90% accuracy.
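
As a rough illustration of how Viterbi decoding can assign labels to the pieces of an affiliation, the sketch below decodes a three-piece toy example. It is not the authors' implementation. The three label names (ORG, CITY, COUNTRY), every probability value, and the function name viterbi are hypothetical, and the sketch assumes each piece already has a per-label score. The abstract mentions seven labels but does not list them all, so only three are used here.

```python
import math


def viterbi(emissions, transitions, start):
    """Return the most probable label sequence for a list of pieces.

    emissions:   list of dicts, one per piece, mapping label -> score of the piece under that label
    transitions: dict mapping (prev_label, label) -> P(label | prev_label)
    start:       dict mapping label -> P(label at the first position)
    """
    labels = list(start)

    def log(p):
        # Work in log space to avoid underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # Best log score of any path ending in each label at the first piece.
    best = {lab: log(start[lab]) + log(emissions[0][lab]) for lab in labels}
    back = []  # one dict of back-pointers per later piece

    for emit in emissions[1:]:
        pointers, scores = {}, {}
        for lab in labels:
            prev = max(labels, key=lambda p: best[p] + log(transitions[(p, lab)]))
            pointers[lab] = prev
            scores[lab] = best[prev] + log(transitions[(prev, lab)]) + log(emit[lab])
        back.append(pointers)
        best = scores

    # Trace back from the best final label.
    path = [max(labels, key=lambda lab: best[lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical toy model (all numbers are made up for illustration).
START = {"ORG": 0.8, "CITY": 0.1, "COUNTRY": 0.1}
TRANSITIONS = {
    ("ORG", "ORG"): 0.3, ("ORG", "CITY"): 0.6, ("ORG", "COUNTRY"): 0.1,
    ("CITY", "ORG"): 0.05, ("CITY", "CITY"): 0.15, ("CITY", "COUNTRY"): 0.8,
    ("COUNTRY", "ORG"): 0.3, ("COUNTRY", "CITY"): 0.3, ("COUNTRY", "COUNTRY"): 0.4,
}
# One score dict per piece of a three-piece affiliation.
EMISSIONS = [
    {"ORG": 0.8, "CITY": 0.1, "COUNTRY": 0.1},
    {"ORG": 0.1, "CITY": 0.5, "COUNTRY": 0.4},
    {"ORG": 0.1, "CITY": 0.45, "COUNTRY": 0.45},  # ambiguous on its own
]

print(viterbi(EMISSIONS, TRANSITIONS, START))
```

In this toy model the last piece scores equally as CITY or COUNTRY on its own, and the transition probabilities resolve the tie: a city is most likely followed by a country, so the decoded sequence is ORG, CITY, COUNTRY.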