Result: Speeding up document image classification

Title:
Speeding up document image classification
Contributors:
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Torres Viñals, Jordi
Publisher Information:
Universitat Politècnica de Catalunya
Publication Year:
2020
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Dissertation master thesis
File Description:
application/pdf
Language:
English
Rights:
Open Access
Accession Number:
edsbas.DC7E70B5
Database:
BASE

Further information

This work addresses document classification, an essential problem in the digitalization of institutions, using lightweight Convolutional Neural Networks (CNNs). On the RVL-CDIP dataset, we show that lighter models such as the EfficientNets can achieve state-of-the-art results, and we demonstrate their transfer learning capabilities on a smaller in-domain dataset, Tobacco3482. Moreover, we present an ensemble pipeline that improves on image-only classification by combining the image model's predictions with those generated by a BERT model on text extracted by OCR. We also show that the batch size can be increased without hurting accuracy, so that training can be sped up by parallelizing across multiple GPUs, reducing the computation time needed.
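
The abstract does not say how the image and text predictions are combined. As an illustration only, the sketch below assumes a simple late fusion: each model outputs per-class logits, these are turned into probabilities, and the two probability vectors are averaged with a weight. The function names, the `image_weight` parameter, and the example logits are hypothetical and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(image_probs, text_probs, image_weight=0.5):
    """Weighted average of two per-class probability vectors.

    Assumption: both models score the same classes in the same order.
    image_weight is a hypothetical tuning knob, not a value from the thesis.
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    text_weight = 1.0 - image_weight
    return [
        image_weight * p + text_weight * q
        for p, q in zip(image_probs, text_probs)
    ]


def predict_class(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up logits for three document classes, for illustration only.
image_probs = softmax([2.0, 1.0, 0.1])  # stand-in for the CNN output
text_probs = softmax([0.1, 3.0, 0.2])   # stand-in for the BERT output on OCR text
fused = fuse_predictions(image_probs, text_probs)
```

In this made-up case the image model alone favors the first class, while the fused prediction follows the more confident text model. In practice a weight like `image_weight` would be tuned on a validation set.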