KEYWORDS: Gesture recognition, Data modeling, Image segmentation, Feature extraction, Visual process modeling, Data processing, Image processing, Data fusion, Cameras, Sensors
Gesture recognition can play a crucial role in human-computer interaction. In this paper, we propose a vision-based Multi-Input Fusion Deep network (MIFD-Net), which consists of a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN). MIFD-Net first preprocesses hand keypoint data with Euclidean distance normalization (ED-Normalization) and gesture images with image segmentation; the two kinds of data are then fed into MIFD-Net simultaneously as inputs. Experimental results show that MIFD-Net achieves an average accuracy of 99.65% on the self-built dataset in this paper and 99.10% on the NUS Hand Posture Dataset II (NUS-II). Compared with other gesture recognition models, MIFD-Net significantly reduces FLOPs, parameter count, and overall model complexity while maintaining a high recognition rate, and it retains high accuracy and strong robustness across different environments, lighting conditions, and viewing angles.
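To make the two-branch fusion idea concrete, the following is a minimal PyTorch sketch of such a multi-input model. Everything specific is an assumption, not the paper's implementation: the 21-keypoint hand layout, the 64×64 segmented image size, the layer widths, the class count, and the `ed_normalize` helper (the abstract does not spell out the exact form of ED-Normalization; a common variant normalizes keypoints by a reference Euclidean distance, as sketched here).

```python
# Illustrative sketch of a two-branch fusion network in the spirit of MIFD-Net.
# All dimensions below are assumptions: 21 hand keypoints with (x, y)
# coordinates, 3x64x64 segmented gesture images, and 10 gesture classes.
import torch
import torch.nn as nn


def ed_normalize(keypoints: torch.Tensor) -> torch.Tensor:
    """Hypothetical ED-Normalization: translate keypoints to the wrist
    (index 0) and scale by the largest wrist-to-keypoint Euclidean
    distance, yielding translation- and scale-invariant coordinates.

    keypoints: (batch, 21, 2) raw (x, y) coordinates.
    """
    centered = keypoints - keypoints[:, :1, :]       # translate to wrist
    dists = centered.norm(dim=-1)                    # (batch, 21)
    scale = dists.max(dim=1, keepdim=True).values    # (batch, 1)
    return centered / (scale.unsqueeze(-1) + 1e-8)   # avoid divide-by-zero


class FusionNet(nn.Module):
    """Two-branch model: an MLP over normalized keypoints and a small CNN
    over segmented gesture images, fused by concatenation before the head."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.mlp = nn.Sequential(                    # keypoint branch
            nn.Flatten(),                            # (batch, 21 * 2)
            nn.Linear(21 * 2, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.cnn = nn.Sequential(                    # image branch
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # (batch, 32)
        )
        self.head = nn.Linear(64 + 32, num_classes)  # classifier over fusion

    def forward(self, keypoints: torch.Tensor, images: torch.Tensor):
        k = self.mlp(ed_normalize(keypoints))        # keypoint features
        v = self.cnn(images)                         # image features
        return self.head(torch.cat([k, v], dim=1))   # late fusion


# Usage: one batch of 4 samples with both inputs supplied simultaneously.
model = FusionNet()
logits = model(torch.rand(4, 21, 2), torch.rand(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 10])
```

Fusing compact keypoint features with CNN image features in this late-concatenation style is one plausible way to keep FLOPs and parameter counts low, since the image branch can stay shallow when the keypoint branch carries most of the pose information.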