Result: Design of knowledge incorporated VQA based on spatial GCNN with structured sentence embedding and linking algorithm.

Title:
Design of knowledge incorporated VQA based on spatial GCNN with structured sentence embedding and linking algorithm.
Authors:
Koshti, Dipali (AUTHOR) dipali.koshti@spsu.ac.in; Gupta, Ashutosh (AUTHOR); Kalla, Mukesh (AUTHOR)
Source:
Journal of Intelligent & Fuzzy Systems. 2023, Vol. 45 Issue 6, p10835-10852. 18p.
Database:
Business Source Premier

Further information

Visual Question Answering (VQA) is a computer vision task that requires a system to infer an answer to a text-based question about an image. Prior approaches did not take into account an image's positional information or the question's grammatical and semantic relationships during image and question featurization, which leads to incorrect answers. To overcome this issue, a CNN-Graph-based LSTM with optimized BP featurization technique is introduced for extracting features from both the image and the question. During image feature extraction, a CNN with a dropout layer and optimized momentum backpropagation determines the positions of the subjects in the image without losing any image data. Then, a graph-based LSTM with loopy backpropagation retrieves the question's syntactic and semantic dependencies. However, because they lack external knowledge about the input image, existing approaches cannot answer open-domain questions that require common-sense knowledge. To address the open-domain problem, the proposed method uses Spatial GCNN knowledge retrieval with a PDB model, in which a Spatial Graph Convolutional Neural Network retrieves external data from Wikidata. An attention mechanism based on the Probabilistic Discriminative Bayesian (PDB) model then predicts the answer by referring to all concepts in the question. The proposed method answers open-domain questions with an accuracy of 88.30%. [ABSTRACT FROM AUTHOR]
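
The abstract does not specify the layer used in the Spatial Graph Convolutional Neural Network. As a rough illustration only, the sketch below shows one generic spatial graph-convolution step (mean aggregation over each node's neighborhood, followed by a linear map and ReLU) in NumPy. The function name `spatial_graph_conv`, the toy graph, and the weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def spatial_graph_conv(features, adjacency, weight):
    """One generic spatial graph-convolution step (not the paper's exact layer).

    Each node averages the features of its neighbors and itself,
    then a shared linear map and a ReLU are applied.
    """
    n = adjacency.shape[0]
    a_hat = adjacency + np.eye(n)               # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)   # neighborhood size per node
    aggregated = (a_hat @ features) / degree    # mean over each neighborhood
    return np.maximum(aggregated @ weight, 0.0)  # linear map, then ReLU


# Hypothetical toy graph: three entities linked in a chain 0 - 1 - 2.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
weight = np.eye(2)  # identity weights keep the example easy to check by hand

out = spatial_graph_conv(features, adjacency, weight)
print(out)
```

With identity weights, each output row is simply the mean of that node's own features and its neighbors' features, which makes the aggregation easy to verify by hand.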

Copyright of Journal of Intelligent & Fuzzy Systems is the property of Sage Publications Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
