Information Resources Management Association

Realistic Data for Testing Rule Mining Algorithms

Author(s): Colin Cooper (King's College, UK) and Michele Zito (University of Liverpool, UK)
Copyright: 2009
Pages: 6
Source title: Encyclopedia of Data Warehousing and Mining, Second Edition
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-60566-010-3.ch252


Abstract

The association rule mining (ARM) problem is a well-established topic in the field of knowledge discovery in databases. The problem addressed by ARM is to identify a set of relations (associations) in a binary-valued attribute set that describe the likely coexistence of groups of attributes. To this end it is first necessary to identify sets of items that occur frequently, i.e. those subsets F of the available set of attributes I for which the support (the number of times F occurs in the dataset under consideration) exceeds some threshold value. Other criteria are then applied to these item-sets to generate a set of association rules, i.e. relations of the form A → B, where A and B represent disjoint subsets of a frequent item-set F such that A ∪ B = F. A vast array of algorithms and techniques has been developed to solve the ARM problem. The algorithms of Agrawal & Srikant (1994), Bayardo (1998), Brin et al. (1997), Han et al. (2000), and Toivonen (1996) are only some of the best-known heuristics. Interest in the class of so-called heavy-tailed statistical distributions has grown recently. Distributions of this kind have been used in the past to describe word frequencies in text (Zipf, 1949), the distribution of animal species (Yule, 1925), of income (Mandelbrot, 1960), scientific citation counts (Redner, 1998) and many other phenomena. They have been used recently to model various statistics of the web and other complex networks (Barabasi & Albert, 1999; Faloutsos et al., 1999; Steyvers & Tenenbaum, 2005).
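
The definitions of support, frequent item-set and association rule given in the abstract can be illustrated with a minimal Python sketch. It is not one of the cited algorithms: it enumerates item-sets by brute force, which is only practical for toy data. The abstract does not name the "other criteria" used to select rules, so the sketch assumes confidence (support of F divided by support of A), a common choice. The function names, the toy transactions and both thresholds are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force search for item-sets whose support exceeds `min_support`."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count > min_support:  # "exceeds some threshold value"
                frequent[candidate] = count
    return frequent


def association_rules(frequent, min_confidence):
    """Rules A -> B with A, B disjoint and non-empty, and A | B = F.

    Confidence is an assumed selection criterion, not taken from the source.
    """
    rules = []
    for f, f_count in frequent.items():
        for size in range(1, len(f)):
            for combo in combinations(sorted(f), size):
                a = frozenset(combo)
                b = f - a
                # Every subset of a frequent item-set is itself frequent,
                # so frequent[a] is always present.
                confidence = f_count / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, b, confidence))
    return rules


# Hypothetical toy dataset: each transaction is a set of attributes.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    freq = frequent_itemsets(TRANSACTIONS, min_support=1)
    for a, b, conf in association_rules(freq, min_confidence=0.6):
        print(sorted(a), "->", sorted(b), round(conf, 2))
```

Practical algorithms such as those cited in the abstract avoid this exhaustive enumeration; the sketch is only meant to make the definitions concrete.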
