IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

Cross-Modal Learning for Free-Text Video Search

Author(s): Damianos Galanopoulos (CERTH-ITI, Greece) and Vasileios Mezaris (CERTH-ITI, Greece)
Copyright: 2025
Pages: 15
Source title: Encyclopedia of Information Science and Technology, Sixth Edition
Source Author(s)/Editor(s): Mehdi Khosrow-Pour, D.B.A. (Founding Editor-in-Chief, Information Resources Management Journal (IRMJ), USA)
DOI: 10.4018/978-1-6684-7366-5.ch088

Abstract

This article focuses on cross-modal video retrieval, a technology with applications for media networks, security organizations, and individuals managing large personal video collections. The authors introduce the concept of cross-modal video learning and survey deep neural network architectures from the literature, concentrating on methods that combine visual and textual representations for cross-modal video retrieval. They also examine the impact of vision transformers, a learning paradigm that has significantly improved cross-modal learning performance. Finally, they present T×V+Objects, a novel cross-modal network architecture for free-text video retrieval. This method extends an existing state-of-the-art network by incorporating object-based video encoding using transformers. It leverages multiple latent spaces and combines detected objects with textual features, creating a joint embedding space for improved text-video similarity.
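
The idea of scoring a text query against videos across multiple latent spaces can be illustrated with a minimal sketch. It is not the T×V+Objects implementation: the abstract does not say how per-space similarities are combined, so summing cosine similarities is an assumption made here for illustration, and the function names (`cosine`, `multi_space_similarity`, `rank_videos`) and the toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multi_space_similarity(text_embs, video_embs):
    """Sum cosine similarities over paired latent spaces.

    Assumption: text_embs[i] and video_embs[i] are embeddings of the query
    and the video in the same i-th joint latent space.
    """
    if len(text_embs) != len(video_embs):
        raise ValueError("query and video must be embedded in the same number of spaces")
    return sum(cosine(t, v) for t, v in zip(text_embs, video_embs))


def rank_videos(query_embs, collection):
    """Return (video_id, score) pairs sorted from most to least similar."""
    scored = [
        (video_id, multi_space_similarity(query_embs, embs))
        for video_id, embs in collection.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: two latent spaces, two dimensions each (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
videos = {
    "video_a": [[1.0, 0.0], [0.0, 1.0]],  # aligned with the query in both spaces
    "video_b": [[0.0, 1.0], [1.0, 0.0]],  # orthogonal to the query in both spaces
    "video_c": [[1.0, 0.0], [1.0, 0.0]],  # aligned in the first space only
}
ranking = rank_videos(query, videos)
```

In a real system the embeddings would come from trained text and video encoders; the sketch only shows how a text-video similarity could be read off once both modalities are projected into shared spaces.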
