
Data Mining with Incomplete Data

Author(s): Hai Wang (Saint Mary’s University, Canada) and Shouhong Wang (University of Massachusetts Dartmouth, USA)
Copyright: 2009
Pages: 5
Source title: Encyclopedia of Data Warehousing and Mining, Second Edition
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-60566-010-3.ch082

Abstract

Surveys are a common data acquisition method for data mining (Brin, Rastogi & Shim, 2003). In practice, one can rarely find a survey data set that contains complete entries for every observation on all of the variables; surveys and questionnaires are often only partially completed by respondents. The possible reasons for incomplete data are numerous, including negligence, deliberate avoidance for privacy, ambiguity of the survey question, and aversion. When it is virtually impossible to return the survey or questionnaire to the data source for completion, the extent of the damage caused by missing data is unknown, yet it is one of the most important pieces of knowledge for data mining to discover. In fact, missing data is an important and much-debated issue in the knowledge engineering field (Tseng, Wang, & Lee, 2003). In mining a survey database with incomplete data, the patterns of the missing data, as well as the potential impact of the missing data on the mining results, constitute valuable knowledge. For instance, a data miner often wishes to know how reliable a data mining result is if only the complete data entries are used; when and why certain types of values are often missing; which variables are correlated in terms of having missing values at the same time; which reason for incomplete data is likely; and so on. These valuable pieces of knowledge can be discovered only after the missing part of the data set is fully explored.
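
Some of the questions listed in the abstract can be illustrated with a small sketch. The Python example below is not the authors' method; it is a minimal illustration, assuming a hypothetical survey stored as a list of dictionaries in which `None` marks an unanswered question. The variable names (`age`, `income`, `education`) and the helper functions are invented for the example. It computes the missing rate per variable, the fraction of fully complete entries (what remains if only complete entries are used), and how often each pair of variables is missing at the same time.

```python
from itertools import combinations

# Hypothetical survey records; None marks an unanswered question.
records = [
    {"age": 34, "income": None, "education": "BSc"},
    {"age": None, "income": None, "education": "MSc"},
    {"age": 51, "income": 72000, "education": None},
    {"age": 29, "income": 48000, "education": "BSc"},
    {"age": 45, "income": None, "education": None},
]


def missing_rate(records, variables):
    """Fraction of records in which each variable is missing."""
    n = len(records)
    return {v: sum(r[v] is None for r in records) / n for v in variables}


def complete_case_fraction(records, variables):
    """Fraction of records with no missing value in any variable."""
    complete = sum(all(r[v] is not None for v in variables) for r in records)
    return complete / len(records)


def co_missing_counts(records, variables):
    """Number of records in which both variables of a pair are missing."""
    return {
        (a, b): sum(r[a] is None and r[b] is None for r in records)
        for a, b in combinations(variables, 2)
    }


variables = ["age", "income", "education"]
rates = missing_rate(records, variables)
complete_fraction = complete_case_fraction(records, variables)
pairs = co_missing_counts(records, variables)

print(rates)
print(complete_fraction)
print(pairs)
```

A low complete-case fraction signals that a result mined from complete entries alone rests on a small and possibly unrepresentative subset, while high co-missing counts point to variables that respondents tend to skip together.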
