YOLO-series models are widely used in object detection. To address the challenge of small object detection, we analyze the limitations of existing detection models and propose Vis-YOLO, an object detection algorithm based on YOLOv8s. First, the number of down-sampling operations is reduced to retain more features, and the detection head is replaced to suit small objects. Then, deformable convolutional networks are used to improve the C2f module and strengthen its feature extraction ability. Finally, the separation and enhancement attention module is introduced into the model to give more weight to useful information. Experiments show that the improved Vis-YOLO model outperforms the YOLOv8s model on the VisDrone-2019 dataset: precision improves by 5.4%, recall by 6.3%, and mAP50 by 6.8%. Moreover, the Vis-YOLO model is smaller and suitable for mobile deployment. This research provides a new method and idea for small object detection and has considerable potential application value.
White blood cells are a core component of the immune system, responsible for protecting the human body from foreign invaders and infectious diseases. A decrease in the white blood cell count can weaken immune function, increasing the risk of infection and illness. However, determining the number of white blood cells usually requires the expertise and effort of radiologists. In recent years, with the development of image processing technology, image processing techniques have been widely applied in biomedical systems for disease diagnosis. We aim to classify the subtypes of white blood cells using image processing technology. To improve the extraction of fine details during feature extraction, a spatial prior convolutional attention (SPCA) module is proposed. In addition, to strengthen the connection between distant features, the Shifted Window (Swin) Transformer network is used as the feature extraction backbone. The SGTformer network for white blood cell subtype classification is then built by combining recursive gate convolution with the SPCA module. Our method is validated on a white blood cell dataset, and the experimental results demonstrate an overall accuracy of 99.47% in white blood cell classification, surpassing existing mainstream classification algorithms. The results indicate that this method can effectively accomplish white blood cell classification and provide robust support for assessing the health of the immune system.
To guarantee the safety and efficiency of industrial production and prevent accidents or losses caused by personnel negligence, this work proposes a method for recognizing the on-duty status of personnel. The method combines a human pose estimation algorithm with a target detection algorithm and can automatically discriminate six on-duty states. First, the original image is processed using a high-resolution network (HRNet) to generate human pose keypoint maps. Then SE-VGG16 is constructed by combining the squeeze-excitation network with VGG16 to extract features from the human pose keypoint maps. Finally, a lightweight convolutional neural network is designed for primary classification, and You Only Look Once version 5 (YOLOv5) is used to reclassify behaviors with similar action features. The experimental results show that the method achieves an average recognition accuracy of 98.27%, with good robustness and generalization ability, for six kinds of personnel on-duty status in multiple environments.
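As a rough illustration of the squeeze-excitation idea mentioned in the abstract above, the sketch below reweights the channels of a single feature map with NumPy. It is not the SE-VGG16 of the paper: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, and the omission of bias terms are all assumptions made for the example, and the abstract does not say where the block sits inside VGG16.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Channel reweighting in the style of a squeeze-excitation block.

    x  : feature map of shape (C, H, W)
    w1 : hypothetical bottleneck weights of shape (C // r, C)
    w2 : hypothetical expansion weights of shape (C, C // r)
    """
    z = x.mean(axis=(1, 2))                  # squeeze: global average pool -> (C,)
    s = np.maximum(w1 @ z, 0.0)              # excitation, step 1: ReLU bottleneck
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))      # excitation, step 2: sigmoid gate in (0, 1)
    return x * s[:, None, None]              # scale every channel by its gate

# Usage: a 2-channel map with a 1-unit bottleneck (reduction ratio r = 2).
features = np.arange(24, dtype=float).reshape(2, 3, 4)
gated = squeeze_excite(features, np.zeros((1, 2)), np.zeros((2, 1)))
```

With all-zero weights every gate equals 0.5, so the output is simply half the input; trained weights would instead emphasize some channels and suppress others.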
During image capture, the shooting angle or other shooting conditions can introduce geometric deformation in the position, shape, size, and orientation of the original image, which causes considerable inconvenience and difficulty for subsequent image processing tasks such as image fusion, image denoising, image recognition, and image edge detection. To further improve the image processing ability and recognition accuracy for distorted images, this paper proposes a distortion correction algorithm based on block mapping for quadrilateral target images. The algorithm transforms the image from irregular to regular by dividing the quadrilateral image into two triangular images and applying a homography mapping to the pixels of each triangle separately. Extensive experiments show that the proposed algorithm is superior to traditional methods such as the Hough and Radon transforms; it overcomes the limitations of traditional mapping algorithms in quadrilateral image correction, improves the correction of distorted images, and provides strong support for their subsequent processing.
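The two-triangle idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it uses one affine map per triangle (the map that three point correspondences determine, a special case of a homography), assumes the corners are given in the order top-left, top-right, bottom-right, bottom-left, and maps coordinates only, without resampling pixels. All function names are hypothetical.

```python
import numpy as np

def affine_from_triangles(src, dst):
    """Return the 2x3 affine matrix A with A @ [x, y, 1] = [u, v] for three point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    G = np.hstack([src, np.ones((3, 1))])    # rows are [x, y, 1]
    return np.linalg.solve(G, dst).T         # solves G @ A.T = dst

def map_points(A, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

def quad_to_rect_maps(quad, width, height):
    """Split a quadrilateral along the TL-BR diagonal and map each half to a rectangle half."""
    tl, tr, br, bl = quad
    rect_tl, rect_tr = (0.0, 0.0), (width, 0.0)
    rect_br, rect_bl = (width, height), (0.0, height)
    upper = affine_from_triangles([tl, tr, br], [rect_tl, rect_tr, rect_br])
    lower = affine_from_triangles([tl, br, bl], [rect_tl, rect_br, rect_bl])
    return upper, lower

# Usage: map a skewed quadrilateral onto a 100 x 50 rectangle.
quad = [(10, 20), (110, 30), (120, 90), (5, 80)]
upper, lower = quad_to_rect_maps(quad, 100.0, 50.0)
corners_upper = map_points(upper, [quad[0], quad[1], quad[2]])
corners_lower = map_points(lower, [quad[0], quad[2], quad[3]])
```

Both maps agree along the shared diagonal, so the two corrected halves join without a gap, although the mapping is generally not smooth across that diagonal.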
In clinical medicine, liver segmentation is indispensable for the diagnosis of liver diseases. The shape and size of the liver vary across CT images, and its grayscale values are similar to those of neighboring organs and tissues, which makes segmentation difficult. To address these problems, we propose a network for segmenting liver CT images based on an encoder-decoder structure. The network uses SE-Res blocks in place of the original convolution blocks to optimize boundary information and applies a spatial-channel attention gate to enhance liver features in the decoder. The proposed algorithm was validated on the LITS-28 dataset, where the mean intersection over union (MIOU) and Dice similarity coefficient (Dice) were 93.68% and 96.45%, respectively. Compared with similar algorithms, the proposed algorithm performs better and produces more accurate liver segmentation results.
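For readers unfamiliar with the two metrics reported in the abstract above, here is a minimal sketch of how the Dice coefficient and intersection over union are commonly computed for one pair of binary masks. The function name `dice_and_iou` and the convention of returning 1.0 for two empty masks are assumptions; the paper's MIOU additionally averages the IoU over classes, which this single-class sketch does not do.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice coefficient and intersection over union for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

# Usage: two small masks that overlap in two pixels.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
```

For a single pair of masks the two metrics are linked by Dice = 2 IoU / (1 + IoU), so Dice is never lower than IoU.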
To address the problem that fundus images may show one or more diseases and that the label distribution is unbalanced, this paper proposes a multi-label classification method for fundus diseases based on the fusion of meta-data and the EB-IRV2 network. First, EfficientNet-B2 and InceptionResNetV2 networks extract feature information from the left and right fundus images; these features are then fused with the meta-data containing patient information and finally sent to the classifier for multi-label classification of fundus diseases. Adding the patient's meta-information to the model helps it better capture the lesion information and lesion location in the fundus image, thus improving recognition accuracy. The experimental results show that the proposed model achieves good classification results on the ODIR fundus image database, with an accuracy of 96.00%, a recall of 92.37%, and an F1-score of 94.11%, indicating that the model is robust in multi-label fundus image classification.
License plate (LP) location is a key technology in license plate recognition (LPR). How to achieve LP location and character segmentation in complex vehicle images has long been a hot issue in intelligent transportation systems research. In this paper, a novel LP location method that combines the connected domain (CD) slope detection method with the dynamic template matching method is proposed for complicated situations in which the tilt angle of the LP is too large, the LP is defaced, the LP frame is unclear, or the characters are touching or defective. First, the method uses CD slope detection to find the equal-slope CDs of the LP characters in the optimal segmentation image after pre-processing and to determine the tilt angle. Then the horizontal region of the LP is located and the tilt is corrected. After that, the proposed dynamic template matching method is used to locate the vertical position of the LP and segment the LP characters accurately. Finally, experiments show that the proposed algorithm reduces the difficulty of recognizing LPs with the above problems and is fast, accurate, adaptable, and resistant to interference. It also generalizes and scales well to the newly introduced 8-character new energy LPs.
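The tilt-angle step in the abstract above can be illustrated with a simple stand-in. The sketch assumes the centroids of the character connected domains are already known, fits a least-squares line through them, and rotates the points back to horizontal. This is not the paper's equal-slope CD detection, whose details the abstract does not give; the function names are hypothetical.

```python
import math
import numpy as np

def tilt_angle_degrees(centroids):
    """Estimate plate tilt from the slope of a line fitted through character centroids."""
    pts = np.asarray(centroids, dtype=float)
    slope, _intercept = np.polyfit(pts[:, 0], pts[:, 1], 1)
    return math.degrees(math.atan(slope))

def rotate_points(points, angle_degrees, center):
    """Rotate (N, 2) points about `center` by `angle_degrees` using the standard rotation matrix."""
    theta = math.radians(angle_degrees)
    rot = np.array([[math.cos(theta), -math.sin(theta)],
                    [math.sin(theta),  math.cos(theta)]])
    center = np.asarray(center, dtype=float)
    return (np.asarray(points, dtype=float) - center) @ rot.T + center

# Usage: centroids lying on a line of slope 0.5, then rotated back to horizontal.
centroids = [(x, 0.5 * x + 3.0) for x in (10, 30, 50, 70, 90)]
angle = tilt_angle_degrees(centroids)
levelled = rotate_points(centroids, -angle, center=centroids[0])
```

Rotating by the negative of the estimated angle places all centroids on one horizontal line, which is the situation a tilt-correction step aims for before vertical location and character segmentation.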
License plate segmentation is a key technology in license plate location and recognition. How to achieve automatic segmentation of license plate images under complex illumination conditions has long been a hot issue in intelligent transportation systems (ITS). This paper deals with license plate image segmentation under a variety of lighting conditions. Based on adaptive segmentation of license plate images by the Pulse Coupled Neural Network (PCNN), the relationship between license plate image contrast and PCNN iteration entropy is analyzed. An adaptive segmentation algorithm for license plate images is proposed that uses a Deep Neural Network (DNN) to select the optimal result; the selected segmentation image is then filtered by connected domain, which lays a foundation for subsequent license plate location, character segmentation, and recognition. Simulation experiments show that the proposed algorithm achieves better license plate segmentation and optimal-result selection for license plate images under various lighting conditions.
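To make the notion of iteration entropy in the abstract above concrete, the sketch below computes the Shannon entropy of a binary segmentation mask, a quantity often evaluated on the output of each PCNN iteration. The maximum-entropy selector is only a common heuristic included for illustration; the paper selects the optimal result with a DNN, and the exact entropy definition it uses is not stated in the abstract. Both function names are hypothetical.

```python
import numpy as np

def binary_entropy(mask):
    """Shannon entropy (in bits) of the foreground/background split of a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    p1 = float(mask.mean())                  # fraction of foreground pixels
    entropy = 0.0
    for p in (p1, 1.0 - p1):
        if p > 0.0:                          # treat 0 * log(0) as 0
            entropy -= p * np.log2(p)
    return float(entropy)

def pick_max_entropy(masks):
    """Heuristic stand-in for the selector: index of the mask with the highest entropy."""
    return int(np.argmax([binary_entropy(m) for m in masks]))

# Usage: three candidate masks, as if produced by three PCNN iterations.
candidates = [
    np.zeros((4, 4), dtype=bool),                       # nothing fired
    np.array([[1, 0, 1, 0]] * 4, dtype=bool),           # half the pixels fired
    np.array([[1, 0, 0, 0]] * 4, dtype=bool),           # a quarter of the pixels fired
]
best = pick_max_entropy(candidates)
```

The entropy is 0 when every pixel falls in one class and reaches its maximum of 1 bit when foreground and background are equally sized.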