Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

A Primer on Text-Data Analysis

A Primer on Text-Data Analysis
View Sample PDF
Author(s): Imad Rahal (College of Saint Benedict and Saint John’s University, USA), Baoying Wang (Waynesburg College, USA)and James Schnepf (College of Saint Benedict and Saint John’s University, USA)
Copyright: 2009
Pages: 8
Source title: Encyclopedia of Information Science and Technology, Second Edition
Source Author(s)/Editor(s): Mehdi Khosrow-Pour, D.B.A. (Information Resources Management Association, USA)
DOI: 10.4018/978-1-60566-026-4.ch496


View A Primer on Text-Data Analysis on the publisher's website for pricing and purchasing information.


Since the invention of the printing press, text has been the predominate mode for collecting, storing and disseminating a vast, rich range of information. With the unprecedented increase of electronic storage and dissemination, document collections have grown rapidly, increasing the need to manage and analyze this form of data in spite of its unstructured or semistructured form. Text-data analysis (Hearst, 1999) has emerged as an interdisciplinary research area forming a junction of a number of older fields like machine learning, natural language processing, and information retrieval (Grobelnik, Mladenic, & Milic-Frayling, 2000). It is sometimes viewed as an adapted form of a very similar research field that has also emerged recently, namely, data mining, which focuses primarily on structured data mostly represented in relational tables or multidimensional cubes. This article provides an overview of the various research directions in text-data analysis. After the “Introduction,” the “Background” section provides a description of a ubiquitous text-data representation model along with preprocessing steps employed for achieving better text-data representations and applications. The focal section, “Text-Data Analysis,” presents a detailed treatment of various text-data analysis subprocesses such as information extraction, information retrieval and information filtering, document clustering and document categorization. The article closes with a “Future Trends” section followed by a “Conclusion” section.

Related Content

Christine Kosmopoulos. © 2022. 22 pages.
Melkamu Beyene, Solomon Mekonnen Tekle, Daniel Gelaw Alemneh. © 2022. 21 pages.
Rajkumari Sofia Devi, Ch. Ibohal Singh. © 2022. 21 pages.
Ida Fajar Priyanto. © 2022. 16 pages.
Murtala Ismail Adakawa. © 2022. 27 pages.
Shimelis Getu Assefa. © 2022. 17 pages.
Angela Y. Ford, Daniel Gelaw Alemneh. © 2022. 22 pages.
Body Bottom