Theses and Dissertations
Issuing Body
Mississippi State University
Advisor
Hodges, Julia
Date of Degree
12-13-2002
Original embargo terms
MSU Only Indefinitely
Document Type
Graduate Thesis - Campus Access Only
Major
Computer Science
Degree Name
Master of Science
College
College of Engineering
Department
Department of Computer Science
Abstract
This thesis describes WebDoc, an automated system that classifies Web documents according to the Library of Congress classification system. This work is an extension of an early version of the system that successfully generated indexes for journal articles. The unique features of Web documents, as well as how they will affect the design of a classification system, are discussed. We argue that full-text analysis of Web documents is inevitable, and contextual information must be used to assist the classification. The architecture of the WebDoc system is presented. We performed experiments on it with and without the assistance of contextual information. The results show that contextual information improved the system's performance significantly.
URI
https://hdl.handle.net/11668/21259
Recommended Citation
Tang, Bo, "WebDoc an Automated Web Document Indexing System" (2002). Theses and Dissertations. 5001.
https://scholarsjunction.msstate.edu/td/5001