Memory Based Temporal Fusion Network for Video Deblurring
Written by: Wang, Chaohua; Li, Xin; Dong, Weisheng; Wu, Fangfang; Wu, Jinjian; Shi, Guangming
Abstract: Video deblurring is one of the most challenging vision tasks because of the complex spatial-temporal relationships and the many uncertainty factors involved in video acquisition. Because different moving objects in a video follow different motion trajectories, it is difficult to accurately capture their spatial-temporal relationships. In this paper, we propose a memory-based temporal fusion network (TFN) to capture local spatial-temporal relationships across the input sequence for video deblurring. Our temporal fusion network consists of a memory network and a temporal fusion block. The memory network stores the extracted spatial-temporal relationships and guides the temporal fusion block to extract local spatial-temporal relationships more accurately. In addition, to enable our model to fuse the multiscale features of the previous frame more effectively, we propose a multiscale, multi-hop reconstruction memory network (RMN) based on the attention mechanism and the memory network. We construct a feature extractor that integrates residual dense blocks with three downsampling layers to extract hierarchical spatial features. Finally, we feed these aggregated local features into a reconstruction module to restore sharp video frames. Experimental results on public datasets show that our temporal fusion network achieves a significant performance improvement in PSNR (by more than 1 dB) over existing state-of-the-art video deblurring methods.
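The abstract outlines a concrete pipeline: hierarchical feature extraction with residual dense blocks and three downsampling layers, attention-based memory reads that guide temporal fusion of adjacent frames, and a reconstruction module. The PyTorch sketch below illustrates that overall structure under stated assumptions; all module designs, channel widths, and the single-hop memory-read formulation are illustrative guesses, not the authors' implementation, and the RMN's multiscale, multi-hop reads are omitted for brevity.

```python
# Minimal sketch of a memory-guided temporal fusion pipeline for deblurring.
# Everything here (block structure, slot count, channel widths) is assumed,
# not taken from the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualDenseBlock(nn.Module):
    """Densely connected convolutions with a residual skip connection."""
    def __init__(self, channels, growth=32, layers=4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels + i * growth, growth, 3, padding=1)
            for i in range(layers)
        )
        self.fuse = nn.Conv2d(channels + layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(F.relu(conv(torch.cat(feats, dim=1))))
        return x + self.fuse(torch.cat(feats, dim=1))


class FeatureExtractor(nn.Module):
    """Residual dense blocks interleaved with three stride-2 downsampling convs."""
    def __init__(self, in_ch=3, base=32):
        super().__init__()
        stages, ch = [nn.Conv2d(in_ch, base, 3, padding=1)], base
        for _ in range(3):
            stages += [ResidualDenseBlock(ch),
                       nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1)]
            ch *= 2
        self.body = nn.Sequential(*stages)
        self.out_channels = ch

    def forward(self, x):
        return self.body(x)


class TemporalFusionBlock(nn.Module):
    """Fuses adjacent-frame features, guided by an attention read of a memory."""
    def __init__(self, channels, memory_slots=16):
        super().__init__()
        # Learnable memory of stored spatial-temporal patterns (illustrative;
        # the paper's memory network is updated from past frames).
        self.memory = nn.Parameter(torch.randn(memory_slots, channels))
        self.query = nn.Conv2d(channels * 2, channels, 1)
        self.fuse = nn.Conv2d(channels * 3, channels, 3, padding=1)

    def forward(self, feat_prev, feat_curr):
        q = self.query(torch.cat([feat_prev, feat_curr], dim=1))  # B,C,H,W
        b, c, h, w = q.shape
        q_flat = q.flatten(2).transpose(1, 2)                     # B,HW,C
        attn = torch.softmax(q_flat @ self.memory.t() / c ** 0.5, dim=-1)
        read = (attn @ self.memory).transpose(1, 2).view(b, c, h, w)
        return self.fuse(torch.cat([feat_prev, feat_curr, read], dim=1))


class TFN(nn.Module):
    """End-to-end sketch: extract, fuse with memory guidance, reconstruct."""
    def __init__(self):
        super().__init__()
        self.extract = FeatureExtractor()
        self.fusion = TemporalFusionBlock(self.extract.out_channels)
        ups, ch = [], self.extract.out_channels
        for _ in range(3):  # undo the three downsampling stages
            ups += [nn.Upsample(scale_factor=2, mode="bilinear",
                                align_corners=False),
                    nn.Conv2d(ch, ch // 2, 3, padding=1),
                    nn.ReLU(inplace=True)]
            ch //= 2
        ups.append(nn.Conv2d(ch, 3, 3, padding=1))
        self.reconstruct = nn.Sequential(*ups)

    def forward(self, prev_frame, curr_frame):
        fused = self.fusion(self.extract(prev_frame), self.extract(curr_frame))
        # Residual restoration: predict a correction to the blurry input frame.
        return curr_frame + self.reconstruct(fused)


# Usage: restore one frame from a (previous, current) blurry pair;
# spatial dimensions must be divisible by 8 for the three down/up stages.
prev = torch.randn(1, 3, 64, 64)
curr = torch.randn(1, 3, 64, 64)
print(TFN()(prev, curr).shape)  # torch.Size([1, 3, 64, 64])
```

The residual formulation in `forward` (adding a predicted correction to the blurry frame) is a common design choice in restoration networks; whether the authors use it is an assumption here.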
Language: English