Localizing and Recognizing Labels for Multi-Panel Figures in Biomedical Journals.

Zou J, Antani SK, Thoma GR
Proceedings of International Conference on Document Analysis and Recognition, November 13, 2017

Multi-panel figures are common in biomedical journals. The subpanels are often of different types, e.g., x-ray, microscopy, or sketch. Visual information retrieval of such figures can benefit significantly from Panel Label Recognition techniques, which support indexing figures for search engines, tagging image content, and correlating panels with figure (sub)captions. Panel label recognition is challenging because labels vary widely in location, size, and contrast with the background. In this work, we propose a 3-stage recognition algorithm. The first stage is formulated as object detection: we extract Histograms of Oriented Gradient (HOG) features and train a linear Support Vector Machine (SVM) classifier. Label candidates are detected using sliding windows at different locations and scales. We also train a deep convolutional neural network (CNN) to remove false positives. The second stage is formulated as image classification: we train a 50-class RBF SVM classifier and estimate the posterior probabilities of each candidate label. The last stage is formulated as sequence classification: we apply a beam search algorithm to the posterior probabilities estimated in the second stage, together with a set of label sequence constraints, to select an optimal label sequence. The algorithm is trained on 9,642 figures. Evaluation on the remaining 1,000 figures shows that the proposed algorithm achieves good precision and recall.
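
The third stage can be illustrated with a minimal sketch of beam search over per-candidate posterior probabilities. This is not the authors' implementation: the function name `beam_search`, the reject symbol `"-"` (meaning "this candidate is not a label"), the beam width, and the single constraint used here (a label may appear at most once per figure) are all assumptions made for illustration. The abstract does not enumerate the actual label sequence constraints.

```python
import math


def beam_search(posteriors, beam_width=3, reject="-"):
    """Pick a label sequence from per-candidate posterior probabilities.

    posteriors: list of dicts mapping label -> probability, one per candidate.
    Returns (labels, log_prob) for the best sequence found.
    """
    # Each beam entry is (labels chosen so far, cumulative log-probability).
    beams = [((), 0.0)]
    for dist in posteriors:
        expanded = []
        for labels, score in beams:
            for label, p in dist.items():
                if p <= 0.0:
                    continue
                # Assumed constraint (hypothetical): a real label is used only once.
                if label != reject and label in labels:
                    continue
                expanded.append((labels + (label,), score + math.log(p)))
        # Keep only the highest-scoring partial sequences.
        expanded.sort(key=lambda b: b[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0] if beams else ((), float("-inf"))


# Toy posteriors for three candidates (made-up numbers, not from the paper).
posteriors = [
    {"A": 0.7, "B": 0.2, "-": 0.1},
    {"A": 0.6, "B": 0.3, "-": 0.1},
    {"C": 0.8, "-": 0.2},
]
labels, log_prob = beam_search(posteriors)
print(labels)  # ('A', 'B', 'C')
```

In this toy input, picking the most probable label for each candidate independently would give A, A, C, which repeats a label. The constrained search instead returns A, B, C, showing how sequence-level constraints can override a locally preferred but inconsistent choice.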