The Open-i project combines research in text processing, image analysis, and machine learning to create a system (also called Open-i) that enables about 10,000 users a day to retrieve relevant images and expanded citations from the open-access biomedical literature, as well as from clinical and historic image collections. Searches may use text queries as well as image queries. The images span a wide range of clinical imaging modalities, along with graphs, charts, photographs, and other illustrations. They are indexed by the text of their captions and of their mentions in the article, as well as by image features. This report presents the underlying research in natural language processing, biomedical image analysis, and informatics that led to the design, development, and practical implementation of this system.