Deep neural networks are vulnerable to backdoor attacks, in which an adversary injects a trigger-embedded set into the training process. Inputs stamped with the trigger yield incorrect predictions, whereas clean inputs remain unaffected. To erase latent triggers in models, increasing depth, distribution distillation, and model soup (ID3MS) is introduced: a defensive solution that operates without prior knowledge of the triggers and relies only on a small clean set. The depth of the backdoored model is increased by adding fully connected layer(s) at the penultimate layer. With the classification layer removed, the original backdoored model and the increased-depth model serve as teacher and student, respectively. The student model applies distribution distillation to refit the distribution of the clean set and erase the backdoor triggers. The distilled student model is then recovered with the classification layer, and model soup is used to ensemble a collection of models generated with various fine-tuning hyperparameters. Experimental results validate the superior performance of ID3MS compared with existing defensive techniques against several attacks across datasets.
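The final "model soup" step can be illustrated with a minimal sketch: the fine-tuned models' weights are uniformly averaged into a single set of parameters. This is not the paper's implementation; the function and variable names (`model_soup`, `weight_dicts`) are illustrative, and NumPy arrays stand in for actual network parameters.

```python
import numpy as np

def model_soup(weight_dicts):
    """Uniformly average a list of weight dictionaries,
    one per fine-tuned model, key by key."""
    keys = weight_dicts[0].keys()
    return {k: np.mean([w[k] for w in weight_dicts], axis=0) for k in keys}

# Toy example: three "models", each with one 2x2 weight matrix.
models = [{"fc": np.full((2, 2), v)} for v in (1.0, 2.0, 3.0)]
soup = model_soup(models)
# Every entry of soup["fc"] is the element-wise mean, 2.0.
```

In ID3MS, the models being averaged are the recovered student models fine-tuned under different hyperparameter settings, so the averaging happens over networks that share one architecture and initialization.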