IRMA-International.org: Creator of Knowledge
Information Resources Management Association

A Study on Web Searching: Overlap and Distance of the Search Engine Results

Author(s): Shanfeng Chu (City University of Hong Kong, Hong Kong), Xiaotie Deng (City University of Hong Kong, Hong Kong), Qizhi Fang (Qingdao Ocean University, China), and Weimin Zhang (Tsinghua University, China)
Copyright: 2008
Pages: 12
Source title: Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-59904-951-9.ch115

Abstract

Web search engines are among the most popular services for helping users find useful information on the Web. Although many studies have estimated the size and overlap of the general web search engines, those estimates may not benefit ordinary users, who care more about the overlap of the top N (N = 10, 20, or 50) search results for concrete queries than about the overlap of the engines' total index databases. In this study, we present experimental results comparing the overlap of the top N (N = 10, 20, or 50) search results from AlltheWeb, Google, AltaVista, and WiseNut for the 58 most popular queries, as well as the distance of the overlapped results. These 58 queries are chosen from the WordTracker service, which records the most popular queries submitted to well-known metasearch engines such as MetaCrawler and Dogpile. We divide the 58 queries into three categories for further investigation. Through in-depth study, we observe a number of interesting results: the overlap of the top N results retrieved by different search engines is very small; the search results for queries in different categories behave in dramatically different ways; Google, on average, has the highest overlap among the four search engines; and each search engine tends to adopt its own ranking algorithm independently.
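
To make the two quantities in the abstract concrete, the following is a minimal sketch of how top-N overlap and a rank distance between two result lists might be computed. The abstract does not give the chapter's exact definitions, so both measures here are assumptions: overlap is taken as the share of URLs common to both top-N lists, and distance as the mean absolute difference in rank of the shared URLs. The function names and the example URLs are hypothetical, and each list is assumed to contain no duplicate URLs.

```python
def shared_urls(results_a, results_b, n):
    """Return the URLs in A's top n that also appear in B's top n, in A's order."""
    top_b = set(results_b[:n])
    return [url for url in results_a[:n] if url in top_b]


def overlap_ratio(results_a, results_b, n):
    """Assumed overlap measure: number of shared top-n URLs divided by n."""
    return len(shared_urls(results_a, results_b, n)) / n


def mean_rank_distance(results_a, results_b, n):
    """Assumed distance measure: mean absolute rank difference of shared URLs.

    Returns None when the two top-n lists share no URL.
    """
    rank_b = {url: rank for rank, url in enumerate(results_b[:n])}
    diffs = [
        abs(rank - rank_b[url])
        for rank, url in enumerate(results_a[:n])
        if url in rank_b
    ]
    return sum(diffs) / len(diffs) if diffs else None


# Hypothetical ranked results from two engines for one query.
engine_a = ["a.example", "b.example", "c.example", "d.example"]
engine_b = ["c.example", "x.example", "a.example", "y.example"]

print(overlap_ratio(engine_a, engine_b, 4))       # share of common URLs
print(mean_rank_distance(engine_a, engine_b, 4))  # average rank displacement
```

Under these assumed definitions, a small overlap ratio means the engines return largely different top results, while a large rank distance means that even the shared results are ordered differently.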
