Surface defects in industrial materials can seriously degrade product quality, so industry needs high-precision defect detection algorithms. To address this need, we investigate methods for improving defect detection accuracy and for collaborative training. First, this paper proposes a multi-attention fusion (MAF) mechanism, which integrates the channel and spatial dimensions and embeds a spatial pyramid structure into the attention module. It alleviates the problem of inconspicuous defect features and enhances feature extraction. Second, this paper proposes the mixForm data augmentation algorithm, which transforms target defects in space and shape to tackle the problem of scarce samples. This simultaneously improves the detection model's ability to recognize multiple defect types and small objects. Third, a split federated learning (SFL) framework enables collaborative training of industrial surface defect detection models at low resource cost. Our scheme improves training efficiency and achieves high-accuracy detection with only a small number of defect samples. Experimental results show that MAF with the aid of mixForm achieves 82.91 mAP on the NEU-DET dataset. With MAF, the defect detection algorithm gains at least 1.89 mAP over the same algorithm using other attention mechanisms. The experiments also demonstrate that SFL converges faster and achieves higher detection performance than traditional federated learning approaches.
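The abstract does not specify how MAF is built, so the following is only a minimal sketch of the general idea of combining channel attention with a pyramid-pooled spatial attention. The function names, the pyramid levels (1, 2, 4), and the parameter-free sigmoid gating are all assumptions made for illustration; a trained module would use learned weights instead.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Reweight each channel of a (C, H, W) map by its squashed global average."""
    weights = sigmoid(feat.mean(axis=(1, 2)))  # one weight per channel
    return feat * weights[:, None, None]


def pyramid_spatial_attention(feat, levels=(1, 2, 4)):
    """Reweight each location using channel-averaged energy pooled at several grid sizes.

    The pyramid levels are hypothetical; H and W must be divisible by each level.
    """
    _, h, w = feat.shape
    energy = feat.mean(axis=0)  # (H, W) map averaged over channels
    pooled_maps = []
    for g in levels:
        if h % g or w % g:
            raise ValueError("H and W must be divisible by every pyramid level")
        # Average-pool to a g x g grid, then upsample back by repetition.
        coarse = energy.reshape(g, h // g, g, w // g).mean(axis=(1, 3))
        pooled_maps.append(np.repeat(np.repeat(coarse, h // g, axis=0), w // g, axis=1))
    weights = sigmoid(np.mean(pooled_maps, axis=0))  # (H, W) weights in (0, 1)
    return feat * weights[None, :, :]


def fused_attention(feat):
    """Apply channel attention followed by pyramid spatial attention (assumed order)."""
    return pyramid_spatial_attention(channel_attention(feat))
```

Because both gates lie strictly between 0 and 1, this sketch can only attenuate activations; it shows the data flow, not the accuracy behavior reported above.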
Perceptual features, such as direction, contrast, and repetitiveness, are important visual factors in how humans perceive a texture. However, quantifying the scale of these perceptual features requires psychophysical experiments, which demand a large amount of human labor and time. This paper focuses on estimating the perceptual feature scales of textures from a small number of textures whose perceptual scales were obtained through a rating psychophysical experiment (which we call labeled textures) together with a large number of unlabeled textures. This is a scenario for which semi-supervised learning is naturally suited. The task is meaningful for texture perception research and helpful for expanding perceptual texture databases. We propose a graph-based semi-supervised learning method called random multi-graphs (RMG) for this task. We evaluate different kinds of features, including LBP, Gabor, and unsupervised deep features extracted by a PCA-based deep network. The experimental results show that our method achieves satisfactory results regardless of which kind of texture features is used.
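To make the graph-based semi-supervised setting concrete, here is a minimal sketch of plain single-graph label propagation for a continuous perceptual scale. It is not the RMG method, whose details the abstract does not give; the Gaussian affinity, the sigma value, and the function names are assumptions chosen for illustration.

```python
import math


def rbf_affinity(features, sigma=1.0):
    """Build a dense Gaussian affinity matrix between feature vectors."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def propagate_scores(features, labeled, sigma=1.0, iterations=200):
    """Estimate a perceptual scale for every texture.

    `labeled` maps a texture index to its rated scale value. Labeled
    textures stay fixed; each unlabeled texture repeatedly takes the
    affinity-weighted average of its neighbors' current scores.
    """
    w = rbf_affinity(features, sigma)
    n = len(features)
    mean = sum(labeled.values()) / len(labeled)
    scores = [labeled.get(i, mean) for i in range(n)]
    for _ in range(iterations):
        updated = list(scores)
        for i in range(n):
            if i in labeled:
                continue
            total = sum(w[i])
            if total > 0.0:
                updated[i] = sum(w[i][j] * scores[j] for j in range(n)) / total
        scores = updated
    return scores
```

For example, with two well-separated groups of one-dimensional features and one rated texture per group, the unlabeled textures in each group converge toward the rating of their labeled neighbor.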
In this paper, we propose two new problems related to the classification of photographed document images and, based on deep learning methods, present baseline solutions for them. The first problem is: given some photographed document images, which book do they belong to? The second is: given some photographed document images, what type of book do they belong to? To address these two problems, we apply “AlexNet” to the collected document images. Using “AlexNet” pre-trained on the ImageNet data set directly, we obtain 92.57% accuracy for book-name classification and 93.33% accuracy for book-type classification. After fine-tuning on the training set of photographed document images, the accuracy of book-name classification increases to 95.54% and that of book-type classification to 95.42%. To the best of our knowledge, although many image classification algorithms exist, no previous work has targeted these two challenging problems. In addition, the experiments demonstrate that deep-learning features outperform features extracted with traditional image descriptors on these two problems.
Since handwritten text lines are generally skewed and not clearly separated, text line segmentation of handwritten document images is still a challenging problem. In this paper, we propose a novel text line segmentation algorithm based on spectral clustering. Given a handwritten document image, we first convert it to a binary image and then compute the adjacency matrix of the pixel points. We apply spectral clustering to this similarity matrix and use the orthogonal k-means clustering algorithm to group the text lines. Experiments on the Chinese handwritten document database HIT-MW demonstrate the effectiveness of the proposed method.
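The pipeline described above (binary image, pixel affinity matrix, spectral embedding, clustering) can be sketched as follows. This is only an illustration under stated assumptions: the anisotropic Gaussian affinity and its sigma values are hypothetical, the number of lines is assumed known, and ordinary k-means with farthest-point initialization stands in for the orthogonal k-means used in the paper.

```python
import numpy as np


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iterations):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def segment_text_lines(binary, n_lines, sigma_y=2.0, sigma_x=20.0):
    """Cluster foreground pixels of a binary image into n_lines groups.

    Returns the (row, col) coordinates of the foreground pixels and one
    cluster label per pixel. Suitable only for small images, because the
    affinity matrix is dense.
    """
    ys, xs = np.nonzero(binary)
    pts = np.stack([ys, xs], axis=1).astype(float)
    # Assumed affinity: pixels are similar when close vertically, with a
    # looser tolerance horizontally, since text lines run left to right.
    dy = pts[:, None, 0] - pts[None, :, 0]
    dx = pts[:, None, 1] - pts[None, :, 1]
    affinity = np.exp(-(dy ** 2) / (2 * sigma_y ** 2) - (dx ** 2) / (2 * sigma_x ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(affinity.sum(axis=1), 1e-12))
    normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the top n_lines vectors.
    _, eigvecs = np.linalg.eigh(normalized)
    embedding = eigvecs[:, -n_lines:]
    embedding = embedding / np.maximum(
        np.linalg.norm(embedding, axis=1, keepdims=True), 1e-12
    )
    return pts.astype(int), kmeans(embedding, n_lines)
```

On a toy image with two horizontal strokes far apart vertically, the pixels of each stroke receive a single label and the two strokes receive different labels. Real handwriting would need a sparser affinity and a way to estimate the number of lines.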
Surface height map estimation is an important task in high-resolution 3D reconstruction. This task differs from general scene depth estimation in that surface height maps contain more high-frequency information and fine detail. Existing methods based on radar or other equipment can be used for large-scale scene depth recovery, but may fail in small-scale surface height map estimation. Although some methods, such as photometric stereo, can reconstruct surface height from multiple images, estimating a height map directly from a single image is still a challenging problem. In this paper, we present a novel method based on convolutional neural networks (CNNs) for estimating the height map from a single image, without any additional equipment or prior knowledge of the image contents. Experimental results on procedural and real texture datasets show that the proposed algorithm is effective and reliable.
Proceedings Volume Editor (2)