Automatic Image Captioning Using Different Variants of the Long Short-Term Memory (LSTM) Deep Learning Model
Author(s): Ritwik Kundu (Vellore Institute of Technology, Vellore, India), Shaurya Singh (Vellore Institute of Technology, Vellore, India), Geraldine Amali (Vellore Institute of Technology, Vellore, India), Mathew Mithra Noel (Vellore Institute of Technology, Vellore, India) and Umadevi K. S. (Vellore Institute of Technology, Vellore, India)
Copyright: 2023
Pages: 24
Source title:
Deep Learning Research Applications for Natural Language Processing
Source Author(s)/Editor(s): L. Ashok Kumar (PSG College of Technology, India), Dhanaraj Karthika Renuka (PSG College of Technology, India) and S. Geetha (Vellore Institute of Technology, India)
DOI: 10.4018/978-1-6684-6001-6.ch008
Abstract
Today's world is full of digital images, yet their context is often unavailable, so image captioning is essential for describing the content of an image. Besides generating accurate captions, an image captioning model must also be scalable. In this chapter, two variants of long short-term memory (LSTM) networks, stacked LSTM and bidirectional LSTM (BiLSTM), are combined with a convolutional neural network (CNN) to implement an encoder-decoder model for caption generation. The bilingual evaluation understudy (BLEU) score is used to evaluate the performance of these two bi-layered models. The study found the two models to be on par in performance: some predictions received low BLEU scores, indicating captions dissimilar to the reference, while others received very high BLEU scores, indicating captions nearly identical to human-written ones. Furthermore, the BiLSTM model was found to be more computationally intensive and slower to train than the stacked LSTM model, owing to its more complex architecture.
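The BLEU metric used in the abstract scores a candidate caption against a reference by combining modified n-gram precisions with a brevity penalty. As an illustration only (the chapter does not publish its evaluation code, and toolkits such as NLTK add smoothing options not shown here), a minimal single-reference sentence-level BLEU can be sketched as:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU with a single reference: geometric mean of
    modified n-gram precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # clip each candidate n-gram count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is zero if any precision is zero
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty: penalise candidates shorter than the reference
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * geo_mean

ref = "a dog runs on the beach".split()
print(sentence_bleu(ref, ref))                          # perfect match -> 1.0
print(sentence_bleu(ref, "a dog runs on the sand".split()))  # partial match
```

A caption identical to the reference scores 1.0, while a caption sharing only some words scores between 0 and 1, which is how the chapter distinguishes predictions "dissimilar to the actual caption" from those close to human-written ones.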