Title:
Comparison of mid-level feature coding approaches and pooling strategies in visual concept detection: Visual Concept Detection in the MIRFLICKR/ImageCLEF Benchmark
Source:
Computer Vision and Image Understanding (Print), 117(5):479-492
Publisher Information:
Amsterdam: Elsevier, 2013.
Publication Year:
2013
Physical Description:
print; 59 references
Original Material:
INIST-CNRS
Document Type:
Journal Article
File Description:
text
Language:
English
Author Affiliations:
Centre for Vision, Speech and Signal Processing, University of Surrey, GU2 7XH Guildford, United Kingdom
ISSN:
1077-3142
Rights:
Copyright 2014 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.27247507
Database:
PASCAL Archive

Further Information

The Bag-of-Words model lies at the heart of modern object category recognition systems. After descriptors are extracted from images, they are expressed as vectors representing visual word content, referred to as mid-level features. In this paper, we review a number of techniques for generating mid-level features, including two variants of Soft Assignment, Locality-constrained Linear Coding, and Sparse Coding, and we isolate the underlying properties that affect their performance. Moreover, we investigate various pooling methods that aggregate mid-level features into vectors representing images: Average pooling, Max-pooling, and a family of likelihood-inspired pooling strategies are scrutinised. We demonstrate how coding schemes and pooling methods interact with each other. We generalise the investigated pooling methods to account for descriptor interdependence and introduce an intuitive concept of improved pooling. We also propose a coding-related improvement that increases coding speed. Lastly, state-of-the-art classification performance is demonstrated on the Caltech101, Flower17, and ImageCLEF11 datasets.
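To make the coding/pooling pipeline described in the abstract concrete, the following is a minimal numpy sketch of one such combination: Gaussian-kernel Soft Assignment coding followed by Average or Max pooling. The kernel form, the `beta` bandwidth parameter, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def soft_assignment(descriptors, codebook, beta=1.0):
    """Soft Assignment coding: map local descriptors to mid-level features.

    descriptors: (n, d) array of local descriptors (e.g. SIFT).
    codebook:    (k, d) array of visual words.
    Returns an (n, k) array where each row holds a descriptor's
    normalised affinities to all visual words (an assumed Gaussian
    kernel on squared Euclidean distance).
    """
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-beta * d2)
    return w / w.sum(axis=1, keepdims=True)

def average_pooling(codes):
    # Aggregate per-descriptor codes into one image-level vector (mean).
    return codes.mean(axis=0)

def max_pooling(codes):
    # Keep, per visual word, the strongest activation over all descriptors.
    return codes.max(axis=0)
```

The interaction the paper studies shows up directly here: Average pooling measures how often a word fires across descriptors, while Max-pooling records only the strongest response, so the same coding step can yield quite different image representations.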