Research on laparoscopic surgical instrument detection technology based on multi-attention-enhanced feature pyramid network
Article written by: Wang, Xinying; Li, Yang; Zhang, Yuxuan
Abstract: Laparoscopic surgery is a very active area of research in clinical medicine. Detecting tools in surgical videos can help physicians operate surgical instruments, reduce complications, and ensure patient safety. However, the size of laparoscopic surgical instruments is highly variable, which degrades detection performance. Feature pyramid networks (FPNs) can effectively address multi-scale target detection, but they still have design problems that limit the full utilization of multi-scale features. By analyzing these FPN design problems, we propose the Multi-Attention Augmented Feature Pyramid Network (MAFPN), which makes fuller use of multi-scale features. First, we replace the convolutional block with a feature selection module (FSM) that combines channel attention and global attention; it selectively retains important information and enhances the expressiveness of features at each scale. Second, global contextual information is captured by the self-attentive augmented fusion module (AAFM), which enriches the high-level feature information in the FPN and improves feature fusion. Finally, we use Dynamic Convolution Decomposition (DCD) to alleviate the impact of upsampling while enhancing feature expressiveness. Experimental results on the laparoscopic surgical instrument detection dataset m2cai16-tool-locations show that MAFPN achieves an average precision of 96.5 at an IoU threshold of 0.5, which is 1.8% better than the baseline method RetinaNet and more than 1.6% better than the comparison network. MAFPN also outperforms the state-of-the-art method on laparoscopic surgical instrument detection.
Language:
English
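Note on the evaluation criterion: the abstract reports average precision at an IoU (intersection over union) of 0.5. As a minimal sketch of what that threshold means, and not the authors' evaluation code, the snippet below computes IoU for two axis-aligned boxes and checks whether a prediction would count as a match. The `(x1, y1, x2, y2)` box format and the function names `iou` and `is_match` are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: a prediction matches a ground-truth box
    when their IoU reaches the threshold (0.5 in the abstract)."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    predicted = (0, 0, 10, 10)
    ground_truth = (0, 0, 10, 6)
    print(iou(predicted, ground_truth))
    print(is_match(predicted, ground_truth))
```

Average precision is then obtained by ranking detections by confidence and summarizing the resulting precision-recall curve; that step is omitted here because the record does not specify the exact protocol used.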