Presentation + Paper
3 April 2024
W-MAFormer: W-shaped multi-attention assisted transformer for polyp segmentation
Abstract
Colorectal cancer, the third most common cancer worldwide, can be effectively prevented through the timely detection and removal of colorectal polyps. Precise diagnosis requires accurate segmentation of these polyps, a task where existing deep learning solutions exhibit limitations. Many CNN-based models emphasize local information because their receptive field is constrained by the convolutional kernel size, impairing their effectiveness on larger polyps. Conversely, vision-transformer-based models replace the CNN encoder with transformers to obtain stronger global contextual representations; however, segmentation still hinges on a CNN-centric decoder. Bridging this gap, we introduce the W-shaped Multi-Attention Assisted Transformer (W-MAFormer) for polyp segmentation, which employs transformer modules in place of conventional convolutional blocks within the decoder. Structurally, our encoder builds on the pyramid vision transformer, while our decoder combines three pivotal modules: the Reference Feature Extractor (RFE), Semantic Feature Enhancement (SFE), and Reverse Attention Decoder (RAD). Notably, the SFE module employs mutual and dual attention mechanisms to augment information shared across feature maps of varying scales and channels. This enhancement requires a robust reference map, which the RFE supplies. The refined feature map is then passed to the RAD, which applies reverse attention operations to yield the final prediction. Throughout this architecture, attention mechanisms remain central, preserving global information. Our comprehensive evaluation on five prominent datasets demonstrates the model's effectiveness, with both quantitative and qualitative results that outperform several contemporary state-of-the-art semantic segmentation methods.
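The abstract does not detail how the Reverse Attention Decoder combines a coarse prediction with encoder features. The sketch below illustrates one common formulation of reverse attention (as popularized by PraNet-style decoders), in which the inverted sigmoid of a coarse map gates a higher-resolution feature before a residual refinement; the class name `ReverseAttentionDecoder`, the channel widths, and the refinement head are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a reverse-attention refinement step (assumption: PraNet-style
# formulation; W-MAFormer's RAD may differ in detail).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReverseAttentionDecoder(nn.Module):
    """Refines a coarse segmentation map using a higher-resolution feature map."""
    def __init__(self, in_channels: int):
        super().__init__()
        # Small refinement head; channel widths are illustrative.
        self.refine = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
        )

    def forward(self, feat: torch.Tensor, coarse_map: torch.Tensor) -> torch.Tensor:
        # Upsample the coarse, single-channel prediction to the feature resolution.
        coarse = F.interpolate(coarse_map, size=feat.shape[2:],
                               mode="bilinear", align_corners=False)
        # Reverse attention: weight regions the coarse prediction did NOT cover,
        # steering the decoder toward missed polyp regions and boundaries.
        rev = 1.0 - torch.sigmoid(coarse)
        attended = feat * rev
        # Residual refinement of the coarse prediction.
        return coarse + self.refine(attended)

# Example: refine an 11x11 coarse map with a 44x44 encoder-stage feature.
feat = torch.randn(1, 320, 44, 44)   # e.g. a PVT stage output (channel count assumed)
coarse = torch.randn(1, 1, 11, 11)   # coarse logits from a deeper stage
out = ReverseAttentionDecoder(320)(feat, coarse)
print(out.shape)                     # torch.Size([1, 1, 44, 44])
```

In this reading, the inverted attention map suppresses regions already captured by the coarse prediction so that refinement focuses on residual errors, which is why reverse attention is commonly used to sharpen polyp boundaries.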
Conference Presentation
© (2024) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
M. Yi, Y. Su, Y. Shen, and W. Wang "W-MAFormer: W-shaped multi-attention assisted transformer for polyp segmentation", Proc. SPIE 12927, Medical Imaging 2024: Computer-Aided Diagnosis, 129270V (3 April 2024); https://doi.org/10.1117/12.3008772
KEYWORDS: Polyps, Feature extraction, Transformers, Image segmentation, Machine learning, Convolution