
Detailed record

Attention-guided spatial–temporal graph relation network for video-based person re-identification

Article written by: Qi, Yu; Ge, Hongwei; Pei, Wenbin; Liu, Yuxuan; Hou, Yaqing; Sun, Liang

Abstract: Video-based person Re-Identification (Re-ID) aims to match video tracklets of a specific person across non-overlapping camera views. The key to this task is learning robust sequential feature representations under illumination changes, cluttered backgrounds, and viewpoint variations. Most existing methods do not model the dynamic spatial–temporal relations within video sequences, and thus cannot fully exploit the abundant spatial–temporal cues that videos provide. In this paper, we propose a spatial–temporal attention-guided graph relation network (STAGNet) for video-based person Re-ID. STAGNet consists of a spatial attention-guided graph relation (SAGR) module followed by a temporal attention-guided graph relation (TAGR) module. Specifically, in the SAGR module, we first design a multi-granularity spatial attention module to localize pedestrian body parts in each frame; based on the resulting spatial attention masks, SAGR captures the structured information of pedestrians. In the TAGR module, a self-attention module first ranks the frames, and a multi-granularity temporal graph network built on the sorted sequence then mines discriminative sequence information. After optimizing these modules, the model produces discriminative sequence-level features. Experiments on three benchmarks (i.e., MARS, iLIDS-VID, and PRID2011) validate the effectiveness of the proposed method, showing that it achieves state-of-the-art results.
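The abstract's pipeline — per-frame spatial attention masks that pool part features, followed by temporal self-attention that weights frames into one sequence-level feature — can be sketched as follows. This is a minimal illustrative sketch in NumPy, not the authors' STAGNet implementation: the graph-relation stages are omitted, and all shapes, weights, and function names are assumptions introduced for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention_pool(frames, w_spatial):
    # frames: (T, P, D) — T frames, P spatial parts, D channels.
    # A spatial attention mask per frame localizes pedestrian parts,
    # then pools part features into one feature vector per frame.
    scores = frames @ w_spatial                      # (T, P) part scores
    masks = softmax(scores, axis=1)                  # per-frame spatial masks
    return (masks[..., None] * frames).sum(axis=1)   # (T, D) frame features

def temporal_attention_pool(frame_feats, w_temporal):
    # Self-attention scores rank frames by importance; the weighted
    # sum yields a single sequence-level feature for the tracklet.
    scores = frame_feats @ w_temporal                # (T,) frame scores
    weights = softmax(scores, axis=0)
    return weights @ frame_feats                     # (D,) sequence feature

# Hypothetical dimensions for a short tracklet.
rng = np.random.default_rng(0)
T, P, D = 8, 4, 16                                   # frames, parts, channels
frames = rng.normal(size=(T, P, D))
frame_feats = spatial_attention_pool(frames, rng.normal(size=D))
seq_feat = temporal_attention_pool(frame_feats, rng.normal(size=D))
print(seq_feat.shape)  # → (16,)
```

In the actual method, the pooled features would additionally pass through the SAGR and TAGR graph networks at multiple granularities before aggregation; the sketch only shows the attention-weighted pooling that guides them.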


Language: English