IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

An Overview of Multimodal Interaction Techniques and Applications

Author(s): Marie-Luce Bourguet (Queen Mary University of London, UK)
Copyright: 2008
Pages: 7
Source title: Intelligent Information Technologies: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): Vijayan Sugumaran (Oakland University, Rochester, USA)
DOI: 10.4018/978-1-59904-941-0.ch001

Abstract

Desktop multimedia (multimedia personal computers) dates from the early 1970s. At that time, the enabling force behind multimedia was the emergence of new digital technologies in the form of digital text, sound, animation, photography, and, more recently, video. Nowadays, multimedia systems are mostly concerned with the compression and transmission of data over networks, large-capacity and miniaturized storage devices, and quality of service; however, what fundamentally characterizes a multimedia application is that it does not understand the data (sound, graphics, video, etc.) that it manipulates. In contrast, intelligent multimedia systems, at the intersection of the artificial intelligence and multimedia disciplines, have gradually gained the ability to understand, interpret, and generate data with respect to content. Multimodal interfaces are a class of intelligent multimedia systems that make use of multiple, natural means of communication (modalities), such as speech, handwriting, gestures, and gaze, to support human-machine interaction. More specifically, the term modality describes human perception through one of the following three perception channels: visual, auditory, and tactile. Multimodality qualifies interactions that comprise more than one modality on either the input side (from the human to the machine) or the output side (from the machine to the human), and that use more than one device on either side (e.g., microphone, camera, display, keyboard, mouse, pen, trackball, data glove). Some of the technologies used for implementing multimodal interaction come from speech processing and computer vision; for example, speech recognition, gaze tracking, recognition of facial expressions and gestures, perception of sounds for localization purposes, lip movement analysis (to improve speech recognition), and integration of speech and gesture information.
In 1980, the Put-That-There system (Bolt, 1980) was developed at the Massachusetts Institute of Technology; it was one of the first multimodal systems. In this system, users could simultaneously speak and point at a large-screen graphics display surface in order to manipulate simple shapes. In the 1990s, multimodal interfaces started to depart from the rather simple speech-and-point paradigm and to integrate more powerful modalities such as pen gestures and handwriting input (Vo, 1996) or haptic output. Currently, multimodal interfaces have started to understand 3D hand gestures, body postures, and facial expressions (Ko, 2003), thanks to recent progress in computer vision techniques.
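
The speech-and-point paradigm can be illustrated with a minimal sketch of how spoken words and pointing gestures might be combined by their timestamps. This is not Bolt's implementation: the class names, the `resolve_deictics` function, the list of deictic words, and the one-second time window are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechToken:
    word: str
    time: float  # seconds since the start of the utterance (assumed unit)


@dataclass
class PointingEvent:
    target: str  # hypothetical label for the object or location pointed at
    time: float


# Assumed set of words that refer to something the user is pointing at.
DEICTIC_WORDS = {"this", "that", "here", "there"}


def resolve_deictics(tokens: List[SpeechToken],
                     points: List[PointingEvent],
                     max_gap: float = 1.0) -> List[str]:
    """Replace each deictic word with the target of the nearest-in-time
    pointing event, provided the two occur within max_gap seconds."""
    unused = list(points)  # each pointing event is consumed at most once
    resolved = []
    for token in tokens:
        if token.word.lower() in DEICTIC_WORDS and unused:
            nearest = min(unused, key=lambda p: abs(p.time - token.time))
            if abs(nearest.time - token.time) <= max_gap:
                unused.remove(nearest)
                resolved.append(nearest.target)
                continue
        # Non-deictic words, or deictics with no matching gesture, pass through.
        resolved.append(token.word)
    return resolved


tokens = [SpeechToken("put", 0.0),
          SpeechToken("that", 0.4),
          SpeechToken("there", 1.2)]
points = [PointingEvent("blue_square", 0.5),
          PointingEvent("top_left_corner", 1.3)]

command = resolve_deictics(tokens, points)
print(command)  # ['put', 'blue_square', 'top_left_corner']
```

Matching on time alone is the simplest possible fusion strategy; a real system would also have to cope with recognition errors and ambiguous gestures, which this sketch ignores.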
