With the rapid expansion and use of the Internet and Web technologies, the number of on-line medical journals is increasing. On-line journals pose new challenges in automated document analysis and content extraction, the creation of database citation records, data mining, and other document-related applications. New techniques are needed to capture, classify, analyze, extract, modify, and reformat Web-based document information for computer storage, access, and processing. At the National Library of Medicine (NLM) we are developing an automated system, temporarily code-named WebMARS (Web-based Medical Article Record System), to create citation records for the MEDLINE database. The system downloads and classifies Web articles, parses and labels the article contents, extracts and reformats the citation information from each article, presents the entire citation to operators for reconciliation (validation), and uploads the citation records to the MEDLINE database.
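The processing stages listed above (download, classify, parse and label, extract and reformat, reconcile, upload) can be illustrated with a minimal Python sketch. This is not the WebMARS implementation: every function name, the `Citation` fields, the use of `<meta>` tags as the labeling source, the "Surname Initials" author format, and the in-memory stand-ins for the Web and for the database are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Stand-in for the Web: a hypothetical URL mapped to a toy HTML page.
FAKE_WEB = {
    "http://example.org/article1": (
        "<html><head>"
        '<meta name="title" content="A Sample Study">'
        '<meta name="author" content="Doe, Jane">'
        '<meta name="author" content="Roe, Richard">'
        "</head><body>Article text.</body></html>"
    ),
}


@dataclass
class Citation:
    """Hypothetical citation record; real records carry many more fields."""
    title: str = ""
    authors: list = field(default_factory=list)
    validated: bool = False


class MetaLabeler(HTMLParser):
    """Labels content by the name of the <meta> tag that carries it (assumed rule)."""

    def __init__(self):
        super().__init__()
        self.labels = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if "name" in attr and "content" in attr:
            self.labels.setdefault(attr["name"], []).append(attr["content"])


def download(url):
    # A real system would fetch over HTTP; here we read the in-memory stand-in.
    return FAKE_WEB[url]


def classify(html):
    # Toy classifier: a page with a title meta tag is treated as an article.
    return "article" if 'name="title"' in html else "other"


def parse_and_label(html):
    labeler = MetaLabeler()
    labeler.feed(html)
    return labeler.labels


def reformat_author(name):
    # Assumed target format: "Doe, Jane" becomes "Doe J".
    last, _, first = name.partition(",")
    initials = "".join(word[0] for word in first.split())
    return f"{last.strip()} {initials}".strip()


def extract_and_reformat(labels):
    titles = labels.get("title", [""])
    authors = [reformat_author(a) for a in labels.get("author", [])]
    return Citation(title=titles[0], authors=authors)


def run_pipeline(url, database, approve):
    """Run all stages; `approve` plays the role of the human operator."""
    html = download(url)
    if classify(html) != "article":
        return None
    citation = extract_and_reformat(parse_and_label(html))
    citation.validated = bool(approve(citation))  # reconciliation step
    if citation.validated:
        database.append(citation)  # stand-in for the database upload
    return citation


if __name__ == "__main__":
    db = []
    record = run_pipeline("http://example.org/article1", db, lambda c: True)
    print(record, len(db))
```

Keeping each stage as a separate function mirrors the stage-by-stage description in the text, and passing the operator decision in as a callback keeps the human validation step explicit rather than hiding it inside the upload.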