Neural Semantic Video Analysis
Abstract
Video is a rich form of data for capturing, storing, and communicating information. The availability of inexpensive video-capturing sensors in smartphones, handheld cameras, and consumer security cameras has driven exponential growth in the video footage generated worldwide over the past decade. Because video is so widely produced and consumed, automated systems are essential for analyzing this large body of material and identifying the relevant information it contains. This chapter demonstrates how the emergence of neural networks, including convolutional neural networks (CNNs) and transformers, has revolutionized semantic video analysis. Convolutional filters allow CNNs to capture spatial patterns at the pixel level; more recently, self-attention-based transformer models have surpassed the learning capability of CNN-based models. Both CNN-based and transformer-based semantic video analysis models rely on techniques such as transfer learning and self-supervised learning to compensate for the scarcity of large, supervised video datasets.
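The two mechanisms the abstract contrasts can be illustrated with a minimal NumPy sketch (not taken from the chapter; the toy frame, kernel, and token values below are illustrative assumptions). A convolutional filter responds only to local spatial patterns in a frame, whereas scaled dot-product self-attention lets every position attend to every other position:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image
    and sum elementwise products at each position (a local operation)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def self_attention(x):
    """Scaled dot-product self-attention with identity projections:
    every position is mixed with every other position (a global operation)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                   # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x

# Toy 4x4 "frame" with a vertical edge: left half dark, right half bright.
frame = np.array([[0., 0., 1., 1.]] * 4)
# Hand-made vertical-edge filter (a local spatial-pattern detector).
edge_kernel = np.array([[-1., 1.], [-1., 1.]])
response = conv2d(frame, edge_kernel)
print(response)  # peaks in the middle column, where the edge sits

# Three 2-d "token" vectors; attention blends each with all the others.
tokens = np.array([[1., 0.], [0., 1.], [1., 1.]])
print(self_attention(tokens))
```

The convolution response is nonzero only where the kernel's window straddles the edge, while the attention output for each token depends on the whole sequence; this locality-versus-globality distinction underlies the CNN-versus-transformer comparison in the abstract.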