Proceedings Article | 13 June 2023
KEYWORDS: Image segmentation, Semantics, Data modeling, Deep learning, Head, Education and training, Visual process modeling, Computer vision technology, Performance modeling, Transformers
Mississippi and Alabama are the top two states producing and processing catfish in the United States, with annual production of $382 million in 2022. The catfish industry supplies protein-rich catfish products to the U.S. market and contributes considerably to the local economy. However, traditional catfish processing relies heavily on human labor, which creates a high demand for workers in processing facilities. The main processing steps are de-heading, gutting, portioning, filleting, skinning, and trimming, which normally require blade-based cutting devices (e.g., metal blades). Blade-based manual processing can lead to product contamination, considerable fish meat waste, and low fillet yield, depending on the workers’ skill levels. Operating the cutting devices also exposes workers to the risk of workplace accidents. An automated catfish cutting process therefore appears to be a promising alternative that requires minimal human involvement. To enable, assist, and automate catfish cutting in near real-time, this study presents a novel computer vision-based sensing system that segments the catfish into different target parts using deep learning and semantic segmentation. In this study, 396 raw and augmented catfish images were used to train, validate, and test five state-of-the-art deep learning semantic segmentation models: BEiTV1, SegFormer-B0, SegFormer-B5, ViT-Adapter, and PSPNet. Five classes were predefined for the segmentation to guide the cutting system in locating its targets: the head, body, fins, and tail of the catfish, and the image background.
On the test data set, BEiTV1 performed worst among the tested models, with a mean intersection-over-union (mIoU) of 77.3% and a mean pixel accuracy (MPA) of 86.7%, while SegFormer-B5 outperformed all others, with an mIoU of 89.2% and an MPA of 94.6%. The inference speed of SegFormer-B5 was 0.278 s per image at a resolution of 640×640. The proposed deep learning-based sensing system is expected to be a reliable tool for automating the catfish cutting process.
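For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU and MPA are commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code: the class order in `CLASSES`, the function names, and the choice to skip classes absent from the ground truth are all assumptions made for illustration, and the abstract does not state how the metrics were accumulated over the test images.

```python
# Hypothetical class order: the abstract names the five classes
# but does not give their label indices.
CLASSES = ["background", "head", "body", "fins", "tail"]


def confusion_matrix(truth, pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def miou_and_mpa(matrix):
    """Return (mIoU, MPA), averaged over classes present in the ground truth."""
    n = len(matrix)
    ious, accs = [], []
    for c in range(n):
        tp = matrix[c][c]
        truth_total = sum(matrix[c])                      # TP + FN
        pred_total = sum(matrix[r][c] for r in range(n))  # TP + FP
        if truth_total == 0:
            continue  # assumption: skip classes with no ground-truth pixels
        union = truth_total + pred_total - tp
        ious.append(tp / union)        # per-class IoU
        accs.append(tp / truth_total)  # per-class pixel accuracy
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy example: flattened label maps for eight pixels (not data from the study).
truth = [0, 0, 1, 1, 2, 2, 3, 4]
pred = [0, 0, 1, 2, 2, 2, 3, 4]
miou, mpa = miou_and_mpa(confusion_matrix(truth, pred, len(CLASSES)))
print(f"mIoU = {miou:.3f}, MPA = {mpa:.3f}")
```

In the toy example one "head" pixel is predicted as "body", which lowers the IoU of both classes but the pixel accuracy of only "head"; this is why mIoU is typically the stricter of the two numbers, as in the results reported above. For context on the reported speed, 0.278 s per image corresponds to roughly 3.6 images per second.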