Extreme Low-Resolution Action Recognition with Confident Spatial-Temporal Attention Transfer
Article written by: Bai, Yucai; Zou, Qin; Chen, Xieyuanli; Li, Lingxi; Ding, Zhengming; Chen, Long
Abstract: Action recognition on extreme low-resolution videos, e.g., at a resolution of \(12 \times 16\) pixels, plays a vital role in far-view surveillance and privacy-preserving multimedia analysis. Because such videos contain very limited information, recognizing actions in them is difficult. Since the same action may be captured in both high resolution (HR) and extreme low resolution (eLR), it is worth studying how relevant HR data can be used to improve eLR action recognition. In this work, we propose a novel Confident Spatial-Temporal Attention Transfer (CSTAT) method for eLR action recognition. CSTAT acquires information from HR data by reducing the attention differences between the two resolutions through a transfer-learning strategy. In addition, the confidence of the supervisory signal is taken into account to make the transfer more reliable. Experimental results demonstrate that the proposed method effectively improves the accuracy of eLR action recognition and achieves state-of-the-art performance on \(12 \times 16\) HMDB51, \(12 \times 16\) Kinetics-400, and \(12 \times 16\) Something-Something v2.
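The abstract gives no formulas, so the following is only an illustrative sketch of what a confidence-weighted attention-transfer loss could look like, not the authors' CSTAT implementation. It assumes that an attention map is the channel-wise energy of a feature tensor normalized to unit length, that the HR (teacher) and eLR (student) feature tensors share the same shape `(C, T, H, W)`, and that "confidence" is the teacher's softmax probability for the ground-truth class. All function names are hypothetical.

```python
import numpy as np

def attention_map(features):
    """Collapse (C, T, H, W) features into a unit-norm spatial-temporal attention vector."""
    att = (features ** 2).sum(axis=0).ravel()  # energy at each (t, h, w) location
    return att / (np.linalg.norm(att) + 1e-8)

def softmax(logits):
    """Numerically stable softmax over a 1-D array of class scores."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def confident_attention_loss(student_feat, teacher_feat, teacher_logits, label):
    """Squared attention gap, scaled by the teacher's probability for the true class.

    Assumption: a teacher that is unsure about the label contributes a weaker
    supervisory signal, so its attention map is trusted less.
    """
    confidence = softmax(teacher_logits)[label]
    gap = attention_map(student_feat) - attention_map(teacher_feat)
    return confidence * float((gap ** 2).sum())
```

Under these assumptions, the loss is zero when the two attention maps coincide and shrinks as the teacher's confidence in the correct class drops, which is one simple way to keep unreliable HR supervision from dominating training.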
Language:
English