The IRMA Community
Newsletters
Research IRM
Click a keyword to search titles using our InfoSci-OnDemand powered search:
|
Constrained Data Mining
Abstract
Mining a large data set can be time consuming, and without constraints, the process could generate sets or rules that are invalid or redundant. Some methods, for example clustering, are effective, but can be extremely time consuming for large data sets. As the set grows in size, the processing time grows exponentially. In other situations, without guidance via constraints, the data mining process might find morsels that have no relevance to the topic or are trivial and hence worthless. The knowledge extracted must be comprehensible to experts in the field. (Pazzani, 1997) With time-ordered data, finding things that are in reverse chronological order might produce an impossible rule. Certain actions always precede others. Some things happen together while others are mutually exclusive. Sometimes there are maximum or minimum values that can not be violated. Must the observation fit all of the requirements or just most. And how many is “most?” Constraints attenuate the amount of output (Hipp & Guntzer, 2002). By doing a first-stage constrained mining, that is, going through the data and finding records that fulfill certain requirements before the next processing stage, time can be saved and the quality of the results improved. The second stage also might contain constraints to further refine the output. Constraints help to focus the search or mining process and attenuate the computational time. This has been empirically proven to improve cluster purity. (Wagstaff & Cardie, 2000)(Hipp & Guntzer, 2002) The theory behind these results is that the constraints help guide the clustering, showing where to connect, and which ones to avoid. The application of user-provided knowledge, in the form of constraints, reduces the hypothesis space and can reduce the processing time and improve the learning quality.
Related Content
Girija Ramdas, Irfan Naufal Umar, Nurullizam Jamiat, Nurul Azni Mhd Alkasirah.
© 2024.
18 pages.
|
Natalia Riapina.
© 2024.
29 pages.
|
Xinyu Chen, Wan Ahmad Jaafar Wan Yahaya.
© 2024.
21 pages.
|
Fatema Ahmed Wali, Zahra Tammam.
© 2024.
24 pages.
|
Su Jiayuan, Jingru Zhang.
© 2024.
26 pages.
|
Pua Shiau Chen.
© 2024.
21 pages.
|
Minh Tung Tran, Thu Trinh Thi, Lan Duong Hoai.
© 2024.
23 pages.
|
|
|