File(s) under permanent embargo
Sparse Dense Transformer Network for Video Action Recognition
conference contribution
posted on 2023-02-23, 00:40, authored by X Qu, Z Zhang, W Xiao, J Ran, G Wang, Zili Zhang

The action recognition backbone has continued to advance. The two-stream method based on Convolutional Neural Networks (CNNs) usually attends more to a video's local features and, because of the limited receptive field of convolution kernels, ignores global information. Attention-based Transformers are adopted to capture global information, but they are inferior to CNNs at extracting local features. Richer features can improve video representations. Therefore, a novel two-stream Transformer model, the Sparse Dense Transformer Network (SDTN), is proposed, which involves (i) a Sparse pathway, operating at a low frame rate, to capture spatial semantics and local features; and (ii) a Dense pathway, running at a high frame rate, to abstract motion information. A new patch-based cropping approach is presented to make the model focus on the patches at the center of the frame. Furthermore, frame alignment, a method that compares the input frames of the two pathways, reduces the computational cost. Experiments show that SDTN extracts deeper spatiotemporal features through an input policy of varied temporal resolutions, reaching 82.4% accuracy on Kinetics-400 and outperforming the previous method by more than 1.9% in accuracy.
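The two-pathway input policy and the frame-alignment idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `sample_pathways`, the frame counts, and the frame-rate ratio `alpha` are all assumptions; the sketch only shows how a low-frame-rate Sparse pathway can reuse a subset of the Dense pathway's frames, so that aligned frames need to be decoded only once.

```python
# Hypothetical sketch of SDTN-style two-pathway frame sampling
# (illustrative only; names and parameters are assumptions).

def sample_pathways(num_frames, t_sparse=8, alpha=4):
    """Return frame indices for the Sparse and Dense pathways.

    t_sparse: number of frames for the low-frame-rate Sparse pathway (assumed).
    alpha:    frame-rate ratio between the two pathways (assumed).
    """
    t_dense = t_sparse * alpha
    stride = max(num_frames // t_dense, 1)
    # Dense pathway: high frame rate, uniformly strided indices.
    dense = [i * stride for i in range(t_dense)]
    # Frame alignment: the Sparse pathway takes every alpha-th Dense
    # frame, so its inputs are a subset of the Dense pathway's inputs
    # and the shared frames need not be decoded or cropped twice.
    sparse = dense[::alpha]
    return sparse, dense

sparse, dense = sample_pathways(256)
assert set(sparse) <= set(dense)  # aligned: Sparse frames are shared
```

Under this sampling scheme, a 256-frame clip yields 32 Dense-pathway frames and 8 Sparse-pathway frames, with every Sparse frame aligned to a Dense one.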
History

Volume: 13369 LNAI
Pagination: 43-56
Publisher DOI:
ISSN: 0302-9743
eISSN: 1611-3349
ISBN-13: 9783031109850
Publication classification: E1.1 Full written paper - refereed
Title of proceedings: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer International Publishing
Series: Lecture Notes in Computer Science