LET-Net: locally enhanced transformer network for medical image segmentation
Article written by: Ta, Na ; Chen, Haipeng ; Liu, Xianzhu ; Jin, Nuo
Abstract: Medical image segmentation has attracted increasing attention due to its practical clinical importance. However, the prevalence of small targets still poses great challenges for accurate segmentation. In this paper, we propose a novel locally enhanced transformer network (LET-Net) that combines the strengths of transformers and convolution to address this issue. LET-Net uses a pyramid vision transformer as its encoder and is further equipped with two novel modules to learn more powerful feature representations. Specifically, we design a feature-aligned local enhancement module, which encourages discriminative local feature learning conditioned on the alignment of adjacent-level features. Moreover, to effectively recover high-resolution spatial information, we apply a newly designed progressive local-induced decoder. This decoder contains three cascaded local reconstruction and refinement modules that dynamically guide the upsampling of high-level features with adaptive reconstruction kernels and further enhance the feature representation through a split-attention mechanism. Additionally, to address the severe pixel imbalance affecting small targets, we design a mutual information loss that maximizes task-relevant information while suppressing task-irrelevant noise. Experimental results demonstrate that LET-Net provides more effective support for small-target segmentation and achieves state-of-the-art performance in polyp and breast lesion segmentation tasks.
Language: English
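
The abstract describes a pyramid-encoder, progressive-decoder segmentation network. Below is a minimal, hypothetical PyTorch sketch of that general layout only: the toy convolutional encoder stands in for the pyramid vision transformer, and the plain fusion stages stand in for the feature-aligned local enhancement and local reconstruction and refinement modules. None of the module names or internals here reproduce the authors' actual design or the mutual information loss.

```python
# Minimal structural sketch of a pyramid-encoder / progressive-decoder
# segmentation network in the spirit of the abstract above. All modules are
# simplified placeholders, not the LET-Net implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBlock(nn.Module):
    """Simple conv-BN-ReLU block used as a stand-in for richer modules."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class ToyPyramidEncoder(nn.Module):
    """Placeholder for a pyramid vision transformer: emits 4 feature scales."""
    def __init__(self, chs=(64, 128, 256, 512)):
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 3
        for ch in chs:
            self.stages.append(nn.Sequential(ConvBlock(in_ch, ch), nn.MaxPool2d(2)))
            in_ch = ch

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # resolutions 1/2, 1/4, 1/8, 1/16


class DecoderStage(nn.Module):
    """Upsamples the deeper feature and fuses it with the skip feature.

    A stand-in for one 'local reconstruction and refinement' step; the real
    module is described as using adaptive kernels and split attention.
    """
    def __init__(self, deep_ch, skip_ch, out_ch):
        super().__init__()
        self.fuse = ConvBlock(deep_ch + skip_ch, out_ch)

    def forward(self, deep, skip):
        deep = F.interpolate(deep, size=skip.shape[-2:], mode="bilinear",
                             align_corners=False)
        return self.fuse(torch.cat([deep, skip], dim=1))


class ToySegNet(nn.Module):
    """Encoder plus three cascaded decoder stages and a 1x1 prediction head."""
    def __init__(self, num_classes=1):
        super().__init__()
        chs = (64, 128, 256, 512)
        self.encoder = ToyPyramidEncoder(chs)
        self.dec3 = DecoderStage(chs[3], chs[2], 256)
        self.dec2 = DecoderStage(256, chs[1], 128)
        self.dec1 = DecoderStage(128, chs[0], 64)
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        f1, f2, f3, f4 = self.encoder(x)
        d3 = self.dec3(f4, f3)
        d2 = self.dec2(d3, f2)
        d1 = self.dec1(d2, f1)
        logits = self.head(d1)
        return F.interpolate(logits, size=x.shape[-2:], mode="bilinear",
                             align_corners=False)


if __name__ == "__main__":
    net = ToySegNet()
    out = net(torch.randn(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 1, 224, 224])
```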