The IRMA Community
Newsletters
Research IRM
Click a keyword to search titles using our InfoSci-OnDemand powered search:
|
Flexible Mining of Association Rules
Abstract
The discovery of association rules showing conditions of data co-occurrence has attracted the most attention in data mining. An example of an association rule is the rule “the customer who bought bread and butter also bought milk,” expressed by T(bread; butter)? T(milk). Let I ={x1,x2,…,xm} be a set of (data) items, called the domain; let D be a collection of records (transactions), where each record, T, has a unique identifier and contains a subset of items in I. We define itemset to be a set of items drawn from I and denote an itemset containing k items to be k-itemset. The support of itemset X, denoted by Ã(X/D), is the ratio of the number of records (in D) containing X to the total number of records in D. An association rule is an implication rule ?Y, where X; ? I and X ?Y=0. The confidence of ? Y is the ratio of s(?Y/D) to s(X/D), indicating that the percentage of those containing X also contain Y. Based on the user-specified minimum support (minsup) and confidence (minconf), the following statements are true: An itemset X is frequent if s(X/D)> minsup, and an association rule ? XY is strong i ?XY is frequent and ( / ) ( / ) X Y D X Y ? ¸ minconf. The problem of mining association rules is to find all strong association rules, which can be divided into two subproblems: 1. Find all the frequent itemsets. 2. Generate all strong rules from all frequent itemsets. Because the second subproblem is relatively straightforward ? we can solve it by extracting every subset from an itemset and examining the ratio of its support; most of the previous studies (Agrawal, Imielinski, & Swami, 1993; Agrawal, Mannila, Srikant, Toivonen, & Verkamo, 1996; Park, Chen, & Yu, 1995; Savasere, Omiecinski, & Navathe, 1995) emphasized on developing efficient algorithms for the first subproblem. This article introduces two important techniques for association rule mining: (a) finding N most frequent itemsets and (b) mining multiple-level association rules.
Related Content
Girija Ramdas, Irfan Naufal Umar, Nurullizam Jamiat, Nurul Azni Mhd Alkasirah.
© 2024.
18 pages.
|
Natalia Riapina.
© 2024.
29 pages.
|
Xinyu Chen, Wan Ahmad Jaafar Wan Yahaya.
© 2024.
21 pages.
|
Fatema Ahmed Wali, Zahra Tammam.
© 2024.
24 pages.
|
Su Jiayuan, Zhang Jingru.
© 2024.
26 pages.
|
Pua Shiau Chen.
© 2024.
21 pages.
|
Minh Tung Tran, Thu Trinh Thi, Lan Duong Hoai.
© 2024.
23 pages.
|
|
|