10 October 2023
Residual-based feature enhancement for forgery audio detection
Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 127993B (2023) https://doi.org/10.1117/12.3005833
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia
Abstract
In recent years, speech synthesis technology has become increasingly advanced, leading to a proliferation of forged audio content on the internet, which poses a significant threat to individuals and society. Many studies have used deep learning-based techniques to detect fake audio content, but the features used in these studies are often neither rich nor generalizable enough. In this paper, we propose a novel fake voice detection method that uses the wav2vec2 model for feature extraction together with a custom-designed residual-based detection module to detect fake audio content with greater accuracy and precision. Additionally, we incorporate a data augmentation method to improve the model's performance and its ability to generalize. We trained our model on the ASVspoof2019 dataset and evaluated it on the LA and DF evaluation sets of ASVspoof2021. Supplementary experiments demonstrated that our approach achieved state-of-the-art detection performance and illustrated its effectiveness and applicability.
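The abstract does not describe the internals of the residual-based detection module, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes frame-level feature vectors (here random stand-ins for wav2vec2 outputs), passes each through one residual block of the form x + F(x), mean-pools over time, and maps the result to a score in (0, 1). All names, layer sizes, and weights are hypothetical and untrained.

```python
import math
import random


def linear(x, weights, bias):
    """Apply y = W x + b to one feature vector (plain lists of floats)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w1, b1, w2, b2):
    """Return x + F(x), where F is linear -> ReLU -> linear.

    The skip connection lets the block learn a correction to the
    incoming features instead of a full re-encoding.
    """
    fx = linear(relu(linear(x, w1, b1)), w2, b2)
    return [xi + fi for xi, fi in zip(x, fx)]


def score_utterance(frames, params, w_out, b_out):
    """Enhance each frame, mean-pool over time, return a sigmoid score."""
    enhanced = [residual_block(f, *params) for f in frames]
    dim = len(enhanced[0])
    pooled = [sum(f[d] for f in enhanced) / len(enhanced) for d in range(dim)]
    logit = sum(w * v for w, v in zip(w_out, pooled)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
DIM, HIDDEN, N_FRAMES = 8, 16, 20  # toy sizes chosen for illustration

# Stand-in for frame-level wav2vec2 features (random values, not real audio).
frames = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_FRAMES)]

# Hypothetical, untrained parameters: (W1, b1, W2, b2) of the residual block.
params = (
    random_matrix(rng, HIDDEN, DIM), [0.0] * HIDDEN,
    random_matrix(rng, DIM, HIDDEN), [0.0] * DIM,
)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(DIM)]

score = score_utterance(frames, params, w_out, 0.0)
print(f"spoof score (untrained, illustrative): {score:.3f}")
```

In a real system the random frames would be replaced by features from a pretrained wav2vec2 encoder and the parameters would be learned on labeled bona fide and spoofed utterances; the sketch only shows how a skip connection preserves the input features while adding a learned refinement.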
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Wei Zheng, Xia Ling, and Han Hai "Residual-based feature enhancement for forgery audio detection", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 127993B (10 October 2023); https://doi.org/10.1117/12.3005833
KEYWORDS
Counterfeit detection
Data modeling
Feature extraction
Education and training
Machine learning
Performance modeling
Deep learning
