
Assessing Digital Video Data Similarity

Author(s): Waleed E. Farag (Indiana University of Pennsylvania, USA)
Copyright: 2009
Pages: 7
Source title: Encyclopedia of Multimedia Technology and Networking, Second Edition
Source Author(s)/Editor(s): Margherita Pagani (Bocconi University, Italy)
DOI: 10.4018/978-1-60566-014-1.ch012


Abstract

Multimedia applications are spreading at an ever-increasing rate, introducing a number of challenging problems to the research community. The most significant and influential among them is effective access to stored data. Despite the popularity of keyword-based search in alphanumeric databases, that technique is inadequate for multimedia data because of their unstructured nature. Instead, a number of content-based and context-based video access techniques have been developed (Deb, 2005). The basic idea of content-based retrieval is to access multimedia data by their content, for example, using one of the visual content features, while context-based techniques try to improve retrieval performance by using associated contextual information other than that derived from the media content (Hori & Aizawa, 2003).

Most of the proposed video indexing and retrieval prototypes have two major phases: database population and retrieval. In the former, the video stream is partitioned into its constituent shots in a process known as shot boundary detection (Farag & Abdel-Wahab, 2001, 2002b). This step is followed by the selection of representative frames to summarize video shots (Farag & Abdel-Wahab, 2002a). Then, a number of low-level features (color, texture, object motion, etc.) are extracted for use as indices to shots. The database population phase is performed off-line and outputs a set of metadata in which each element represents one of the clips in the video archive. In the retrieval phase, a query is presented to the system, which in turn performs similarity matching operations and returns similar data to the user.

The basic objective of an automated video retrieval system, as described above, is to provide the user with easy-to-use and effective mechanisms to access the required information. For that reason, the success of a content-based video access system is measured mainly by the effectiveness of its retrieval phase. The general query model adopted by almost all multimedia retrieval systems is query by example (QBE; Marchionini, 2006). In this model, the user submits a query in the form of an image or a video clip (in the case of a video retrieval system) and asks the system to retrieve similar data. QBE is considered a promising technique because it provides the user with an intuitive way of presenting a query, and the form of the query condition is close to that of the data to be evaluated. Upon receiving a submitted query, the retrieval stage analyzes it to extract a set of features and then performs similarity matching: the query's features are compared with those stored in the metadata, and matches are sorted and returned to the user according to how close each hit is to the input query.

A central issue here is the assessment of video data similarity, and appropriately answering the following questions has a crucial impact on the effectiveness and applicability of the retrieval system. How are the similarity matching operations performed, and based on what criteria? Do the employed similarity matching models reflect the human perception of multimedia similarity? The main focus of this article is to shed light on possible answers to these questions.
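To make the two phases above concrete, the following Python sketch uses a normalized color histogram as the single low-level feature: a histogram-difference test stands in for shot boundary detection in the population phase, and histogram intersection ranks stored shots against a query in the retrieval phase. This is a minimal illustration, not the method of the systems cited; the function names, the metadata layout (a list of dicts with "id" and "histogram" keys), and the threshold value are all assumptions made for the example.

def color_histogram(frame, bins=8):
    """Quantize an RGB frame (a list of (r, g, b) tuples with values
    in 0..255) into a normalized histogram with bins**3 cells."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in frame:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = float(len(frame)) or 1.0
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def is_shot_boundary(prev_hist, cur_hist, threshold=0.6):
    """Population phase: flag a cut when consecutive frames' color
    distributions diverge sharply (classic histogram-difference
    detection; the threshold here is illustrative, not tuned)."""
    return histogram_intersection(prev_hist, cur_hist) < threshold

def rank_shots(query_frame, metadata, top_k=5):
    """Retrieval phase: compare the query's histogram against each
    shot's stored key-frame histogram and return the top_k matches,
    best first."""
    q = color_histogram(query_frame)
    scored = sorted(
        ((histogram_intersection(q, shot["histogram"]), shot["id"])
         for shot in metadata),
        reverse=True)
    return scored[:top_k]

In this sketch, color_histogram would be applied to each shot's representative frame during the off-line population phase and the result stored in the metadata, so that only rank_shots runs at query time, reflecting the off-line/on-line split described above. A single-feature measure like this also illustrates exactly the concern raised by the closing questions: two shots with similar color distributions need not look similar to a human viewer.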

Related Content

Nithin Kalorth, Vidya Deshpande. © 2024. 7 pages.
Nitesh Behare, Vinayak Chandrakant Shitole, Shubhada Nitesh Behare, Shrikant Ganpatrao Waghulkar, Tabrej Mulla, Suraj Ashok Sonawane. © 2024. 24 pages.
T.S. Sujith. © 2024. 13 pages.
C. Suganya, M. Vijayakumar. © 2024. 11 pages.
B. Harry, Vijayakumar Muthusamy. © 2024. 19 pages.
Munise Hayrun Sağlam, Ibrahim Kirçova. © 2024. 19 pages.
Elif Karakoç Keskin. © 2024. 19 pages.