11 September 2024 Pruning network for human pose estimation based on multiscale cross-attention
Tiansheng Hu, Xin Li, Song Wang, Enqing Chen
Proceedings Volume 13253, Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024); 132531B (2024) https://doi.org/10.1117/12.3041563
Event: Fourth International Conference on Signal Image Processing and Communication, 2024, Xi'an, China
Abstract
Human pose estimation is a critical foundation for downstream tasks such as human-computer interaction, motion analysis, and action recognition. Current human pose estimation networks often rely on ever more parameters and computation to improve estimation accuracy, which makes them difficult to deploy on low-power edge computing devices. Through experiments, this study identifies widespread redundancy among the convolution channels of existing human pose estimation networks: the feature maps output by different convolution channels are often strongly similar. To address these issues, this paper introduces a lightweight yet accurate human pose estimation network designed to run on most edge computing devices while maintaining high estimation accuracy. The proposed model improves on the ordinary convolutions in the network by employing a reparameterizable partial convolution, which reduces redundancy while enriching the diversity of the extracted features. In addition, an effective multiscale cross-attention mechanism is designed to fuse features from different stages of the backbone network. This improves accuracy while mitigating the severe drop in inference speed associated with excessive multiscale fusion. With these design choices, the proposed model balances accuracy and speed with fewer parameters and a lower computational cost. Experiments on the COCO and MPII datasets verify the effectiveness of the proposed method.
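The channel redundancy described in the abstract can be illustrated with a small sketch. The abstract does not say how similarity between feature maps was measured, so the cosine similarity, the 0.95 threshold, and the function names `cosine_similarity` and `redundant_channel_pairs` below are assumptions chosen for illustration, not the authors' procedure. The toy feature maps are made up.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature maps (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero map as dissimilar to everything
    return dot / (norm_a * norm_b)


def redundant_channel_pairs(feature_maps, threshold=0.95):
    """Return (i, j, similarity) for every channel pair at or above threshold.

    feature_maps: list of 2D lists, one H x W map per convolution channel.
    threshold: hypothetical cutoff; the paper does not specify one.
    """
    # Flatten each H x W map into a single vector.
    flat = [[v for row in fmap for v in row] for fmap in feature_maps]
    pairs = []
    for i, j in combinations(range(len(flat)), 2):
        sim = cosine_similarity(flat[i], flat[j])
        if sim >= threshold:
            pairs.append((i, j, sim))
    return pairs


# Toy example: channel 1 is a scaled copy of channel 0, channel 2 is unrelated.
toy_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 4.0], [6.0, 8.0]],
    [[4.0, -3.0], [2.0, -1.0]],
]
pairs = redundant_channel_pairs(toy_maps)
for i, j, sim in pairs:
    print(f"channels {i} and {j} look redundant (similarity {sim:.3f})")
```

In this toy input only channels 0 and 1 are flagged, since one is a scaled copy of the other. Channels flagged in this way are the kind of redundancy that a partial convolution, which processes only a subset of channels, is intended to avoid computing.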
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Tiansheng Hu, Xin Li, Song Wang, and Enqing Chen "Pruning network for human pose estimation based on multiscale cross-attention", Proc. SPIE 13253, Fourth International Conference on Signal Image Processing and Communication (ICSIPC 2024), 132531B (11 September 2024); https://doi.org/10.1117/12.3041563
KEYWORDS
Pose estimation

Convolution

Education and training

Feature extraction

Feature fusion

Data modeling

Transformers
