Mining Free Text for Structure

Author(s): Vladimir A. Kulyukin (Utah State University, USA) and Robin Burke (DePaul University, USA)
Copyright: 2003
Pages: 23
Source title: Data Mining: Opportunities and Challenges
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-59140-051-6.ch012

Keywords: Data Mining / Data Mining and Databases / Information Science Reference / Library & Information Science

Abstract

Knowledge of the structural organization of information in documents can be of significant assistance to information systems that use documents as their knowledge bases. In particular, such knowledge is of use to information retrieval systems that retrieve documents in response to user queries. This chapter presents an approach to mining free-text documents for structure that is qualitative in nature. It complements the statistical and machine-learning approaches, insomuch as the structural organization of information in documents is discovered through mining free text for content markers left behind by document writers. The ultimate objective is to find scalable data mining (DM) solutions for free-text documents in exchange for modest knowledge-engineering requirements. The problem of mining free text for structure is addressed in the context of finding structural components of files of frequently asked questions (FAQs) associated with many USENET newsgroups. The chapter describes a system that mines FAQs for structural components. The chapter concludes with an outline of possible future trends in the structural mining of free text.
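
As a rough illustration of what mining for content markers can look like, the Python sketch below scans a FAQ-style text for question and answer markers and groups the lines between them into pairs. It is not the system described in the chapter. The marker patterns (`Q:`, `A:`, `Question 2.`, and similar), the function name `mine_qa_pairs`, and the sample text are all assumptions made for this example, and real USENET FAQs use a far wider range of conventions.

```python
import re

# Hypothetical marker patterns, assumed for illustration only.
# They accept forms such as "Q:", "Q2.", "Question 3)", "A:", "Answer 1."
QUESTION_MARKER = re.compile(r"^\s*Q(?:uestion)?\s*\d*\s*[:.)]\s*(.*)$")
ANSWER_MARKER = re.compile(r"^\s*A(?:nswer)?\s*\d*\s*[:.)]\s*(.*)$")


def _join(parts):
    """Join non-empty line fragments into one string."""
    return " ".join(part for part in parts if part)


def mine_qa_pairs(text):
    """Return a list of (question, answer) tuples found via line markers."""
    pairs = []
    question, answer = None, None  # lists of line fragments
    target = None                  # list that continuation lines go to

    for line in text.splitlines():
        q_match = QUESTION_MARKER.match(line)
        a_match = ANSWER_MARKER.match(line)
        if q_match:
            # A new question marker closes the previous pair, if any.
            if question is not None:
                pairs.append((_join(question), _join(answer)))
            question, answer = [q_match.group(1).strip()], []
            target = question
        elif a_match and question is not None:
            answer.append(a_match.group(1).strip())
            target = answer
        elif target is not None and line.strip():
            # Unmarked, non-blank line: continuation of the current component.
            target.append(line.strip())

    if question is not None:
        pairs.append((_join(question), _join(answer)))
    return pairs


sample_faq = (
    "Subject: sample FAQ\n"
    "\n"
    "Q: What is a FAQ?\n"
    "A: A list of frequently asked\n"
    "   questions and answers.\n"
    "\n"
    "Q2. Who maintains it?\n"
    "A2. Volunteers.\n"
)

for q, a in mine_qa_pairs(sample_faq):
    print(f"Q: {q}\n   A: {a}")
```

Text that appears before the first question marker, such as the `Subject:` line in the sample, is ignored. Fixed patterns like these trade recall for simplicity: any FAQ whose writer used a different marking convention would need additional patterns.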
