Information Resources Management Association

Clustering Similar Schema Elements Across Heterogeneous Databases: A First Step in Database Integration

Author(s): Huimin Zhao (University of Wisconsin-Milwaukee, USA) and Sudha Ram (University of Arizona, USA)
Copyright: 2006
Pages: 22
Source title: Advanced Topics in Database Research, Volume 5
Source Author(s)/Editor(s): Keng Siau (Singapore Management University, Singapore)
DOI: 10.4018/978-1-59140-935-9.ch013

Abstract

Interschema relationship identification (IRI), that is, determining the relationships among schema elements in heterogeneous data sources, is an important first step in integrating the data sources. This chapter proposes a cluster analysis-based approach to semi-automating the IRI process, which is typically very time-consuming and requires extensive human interaction. We apply multiple clustering techniques, including K-means, hierarchical clustering, and self-organizing map (SOM) neural networks, to identify similar schema elements from heterogeneous data sources, based on multiple types of features, such as naming similarity, document similarity, schema specification, data patterns, and usage patterns. We describe an SOM prototype we have developed that provides users with a visualization tool for displaying clustering results and for incremental evaluation of potentially similar elements. We also report on some empirical results demonstrating the utility of the proposed approach.
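
The idea of grouping similar schema elements can be illustrated with a minimal sketch. This is not the chapter's implementation: it uses only one feature type (naming similarity, approximated here with Python's `difflib.SequenceMatcher`) and a simple single-link hierarchical clustering with a cutoff. The function names, the `threshold` parameter, and the sample element names are all hypothetical.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two element names.

    Assumption: difflib's ratio stands in for a naming-similarity
    feature; the chapter's actual measure is not specified here.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_elements(names, threshold=0.7):
    """Single-link agglomerative clustering with a similarity cutoff.

    Two clusters are merged whenever any pair of names drawn from
    them has similarity >= threshold. Merging repeats until no
    further pair of clusters qualifies.
    """
    clusters = [[n] for n in names]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(
                    name_similarity(a, b) >= threshold
                    for a in clusters[i]
                    for b in clusters[j]
                ):
                    clusters[i].extend(clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical attribute names drawn from two made-up schemas.
elements = ["customer_name", "order_date", "cust_name", "ord_date", "zip"]
for group in cluster_elements(elements):
    print(group)
```

In a semi-automated setting such as the one the abstract describes, clusters like these would be candidates for human review rather than final matches, and additional feature types (schema specification, data patterns, usage patterns) would be combined with the name-based score.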
