IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

Learning Information Extraction Rules for Web Data Mining

Learning Information Extraction Rules for Web Data Mining
View Sample PDF
Author(s): Chia-Hui Chang (National Central University, Taiwan)and Chun-Nan Hsu (Institute of Information Science, Academia Sinica, Taiwan)
Copyright: 2005
Pages: 6
Source title: Encyclopedia of Data Warehousing and Mining
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-59140-557-3.ch129

Purchase

View Learning Information Extraction Rules for Web Data Mining on the publisher's website for pricing and purchasing information.

Abstract

The explosive growth and popularity of the World Wide Web has resulted in a huge number of information sources on the Internet. However, due to the heterogeneity and the lack of structure of Web information sources, access to this huge collection of information has been limited to browsing and keyword searching. Sophisticated Web-mining applications, such as comparison shopping, require expensive maintenance costs to deal with different data formats. The problem in translating the contents of input documents into structured data is called information extraction (IE). Unlike information retrieval (IR), which concerns how to identify relevant documents from a document collection, IE produces structured data ready for post-processing, which is crucial to many applications of Web mining and search tools.

Related Content

Md Sakir Ahmed, Abhijit Bora. © 2024. 15 pages.
Lakshmi Haritha Medida, Kumar. © 2024. 18 pages.
Gypsy Nandi, Yadika Prasad. © 2024. 16 pages.
Saurav Bhattacharjee, Sabiha Raiyesha. © 2024. 14 pages.
Naren Kathirvel, Kathirvel Ayyaswamy, B. Santhoshi. © 2024. 26 pages.
K. Sudha, C. Balakrishnan, T. P. Anish, T. Nithya, B. Yamini, R. Siva Subramanian, M. Nalini. © 2024. 25 pages.
Sabiha Raiyesha, Papul Changmai. © 2024. 28 pages.
Body Bottom