(Deep) Learning from Frames
Abstract
Learning content from videos is not an easy task, and traditional machine learning approaches to computer vision have difficulty doing it satisfactorily. However, in the past couple of years the machine learning community has seen the rise of deep learning methods, e.g., Convolutional Neural Networks (ConvNets), that significantly improve the accuracy of several computer vision applications. In this paper, we explore the suitability of ConvNets for the problem of classifying movie trailers by genre. Assigning genres to movies is particularly challenging because genre is an immaterial feature that is not physically present in a movie frame, so off-the-shelf image detection models cannot be directly applied in this context. Hence, we propose a novel classification method, namely CoNNECT, that encapsulates multiple distinct ConvNets to perform genre classification, where each ConvNet learns features that capture distinct aspects of the movie frames. We compare our approach with the current state-of-the-art techniques for movie classification, which make use of well-known image descriptors and low-level handcrafted features. Results show that CoNNECT significantly outperforms the state-of-the-art approaches in this task, moving towards effectively solving the genre classification problem.
Article Details
How to Cite
WEHRMANN, Jônatas.
(Deep) Learning from Frames.
BRACIS, [S.l.], july 2017.
Available at: <http://250154.o0gct.group/index.php/bracis/article/view/121>. Date accessed: 28 nov. 2024.
doi: https://doi.org/10.1235/bracis.vi.121.
Section
Articles