publié le: 2023
Scene graph aims to faithfully reveal humans' perception of image content. When humans look at a scene, they usually focus on their interested parts i...
Binaural audio provides human listeners with an immersive spatial sound experience, but most existing videos lack binaural audio recordings. We propos...
Although self-attention is powerful in modeling long-range dependencies, the performance of local self-attention (LSA) is just similar to depth-wise c...
Existing out-of-distribution (OOD) detection literature clearly defines semantic shift as a sign of OOD but does not have a consensus over covariate s...
Instance-level perception tasks like object detection, instance segmentation, and 3D detection require many training samples to achieve satisfactory p...