IRMA-International.org: Creator of Knowledge
Information Resources Management Association

Web Information Extraction via Web Views

Author(s): Wee Keong Ng (Nanyang Technological University, Singapore), Zehua Liu (Nanyang Technological University, Singapore), Zhao Li (Nanyang Technological University, Singapore), and Ee Peng Lim (Nanyang Technological University, Singapore)
Copyright: 2008
Pages: 28
Source title: End-User Computing: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): Steve Clarke (University of Hull Business School, UK)
DOI: 10.4018/978-1-59904-945-8.ch019

Abstract

With the explosion of information on the Web, traditional browsing and keyword searching over web pages no longer satisfy the demanding needs of web surfers. Web information extraction has emerged as an important research area that aims to automatically extract information from target web pages and convert it into a structured format for further processing. The main issues involved in the extraction process include: (1) the definition of a suitable extraction language; (2) the definition of a data model representing the web information source; (3) the generation of the data model, given a target source; and (4) the extraction and presentation of information according to a given data model. In this chapter, we discuss the challenges these issues pose and the approaches that current research has taken to resolve them. We propose several classification schemes that classify existing information extraction approaches from different perspectives. Among the existing works, we focus on the Wiccap system, a software system that enables ordinary end-users to obtain information of interest in a simple and efficient manner by constructing personalized web views of information sources.
