Information Resources Management Association

An Automatic Blocking Keys Selection For Efficient Record Linkage

Author(s): Hamid Naceur Benkhlaed (EEDIS Laboratory, University of Djillali Liabes, Sidi Bel Abbes, Algeria), Djamal Berrabah (EEDIS Laboratory, University of Djillali Liabes, Sidi Bel Abbes, Algeria), Nassima Dif (EEDIS Laboratory, University of Djillali Liabes, Sidi Bel Abbes, Algeria) and Faouzi Boufares (LIPN Laboratory, Paris 13 University, France)
Copyright: 2021
Volume: 11
Issue: 1
Pages: 18
Source title: International Journal of Organizational and Collective Intelligence (IJOCI)
Editor(s)-in-Chief: Victor Chang (Aston University, UK), Peng Liu (University of Kent) and Muthu Ramachandran (AI Tech and Forti5 Tech UK, United Kingdom)
DOI: 10.4018/IJOCI.2021010104

Abstract

Record linkage (RL), also known as entity resolution, is one of the important processes in the data quality field: it detects records that refer to the same real-world entity in one or more datasets. The most critical step in the RL process is blocking, which reduces the quadratic complexity of the process by dividing the data into a set of blocks. In this way, matching is done only between records in the same block. However, selecting the best blocking keys to divide the data is a hard task, and in most cases it is done by a domain expert. In this paper, a novel unsupervised approach for automatic blocking key selection is proposed. This approach is based on the recently proposed meta-heuristic bald eagle search (BES) optimization algorithm, where the problem is treated as a feature selection case. The results obtained from experiments on real-world datasets showed the efficiency of the proposal: BES for feature selection outperformed existing approaches in the literature and returned the best blocking keys.
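
To make the role of blocking concrete, the following is a minimal Python sketch of how a blocking key cuts down the number of candidate pairs compared with comparing every record against every other. It is only an illustration under stated assumptions: the toy records, the field names, and the helpers `block_by`, `candidate_pairs`, `all_pairs`, and `city_key` are hypothetical and do not come from the paper, and the sketch does not implement the BES-based key selection the abstract describes.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the fields are illustrative, not from the paper.
RECORDS = [
    {"id": 1, "name": "John Smith", "city": "Paris"},
    {"id": 2, "name": "Jon Smith", "city": "Paris"},
    {"id": 3, "name": "Maria Lopez", "city": "Oran"},
    {"id": 4, "name": "Maria Lopes", "city": "Oran"},
    {"id": 5, "name": "Ali Benali", "city": "Oran"},
    {"id": 6, "name": "Lea Martin", "city": "Lyon"},
]


def block_by(records, key_fn):
    """Group records into blocks that share the same blocking key value."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key_fn(record)].append(record)
    return blocks


def candidate_pairs(blocks):
    """Return the id pairs to compare: only records within the same block."""
    pairs = set()
    for members in blocks.values():
        for a, b in combinations(members, 2):
            pairs.add((a["id"], b["id"]))
    return pairs


def all_pairs(records):
    """Return every id pair: the quadratic baseline without blocking."""
    return {(a["id"], b["id"]) for a, b in combinations(records, 2)}


def city_key(record):
    """Assumed blocking key for this sketch: the lowercased city field."""
    return record["city"].lower()


full = all_pairs(RECORDS)
blocked = candidate_pairs(block_by(RECORDS, city_key))
print(f"without blocking: {len(full)} comparisons")
print(f"with blocking on city: {len(blocked)} comparisons")
```

The choice of key function decides both how many comparisons remain and whether true duplicates end up in the same block, which is why key selection is usually left to a domain expert and why automating it is the problem the paper addresses.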
