A Knowledge Distillation-Based Method for Transferring Neural Network Robustness

Zhang Wei; Yi Ping

(School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)

Abstract:

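In recent years, deep neural networks have shown very strong capabilities across many application domains. However, researchers have found that adding imperceptible perturbations to an input can change a network's output decision; such inputs are called adversarial examples. The most common defense against adversarial examples today is adversarial training, but it carries a very high training cost. We propose a knowledge-distillation-based robustness transfer scheme (Robust-KD) that combines feature-map and Jacobian matrix constraints. By transferring robust features from a robust network, it achieves strong white-box adversarial defense at a relatively low training cost. We ran extensive experiments on the Cifar10, Cifar100, and ImageNet datasets, and the results demonstrate the effectiveness of the scheme: even under very strong white-box adversarial attacks, our model retains good classification accuracy.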

Keywords: adversarial examples; model robustness; transfer learning; knowledge distillation

DOI: 10.19363/J.cnki.cn10-1380/tn.2021.07.04

Submitted: 2020-10-14; Revised: 2020-12-08

Funding: This work was supported by the National Key Research and Development Program of China (No. 2019YFB1405000).

A Robust Transfer Method of Neural Network based on Knowledge Distillation

ZHANG Wei,YI Ping

School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Abstract:

In recent years, neural networks have shown very strong performance in many fields, but researchers have found that adding imperceptible perturbations to the input can change a network's decisions. Such inputs are called adversarial examples. At present, the most common defense against adversarial examples is adversarial training, but its training cost is very high. We propose a knowledge distillation scheme (Robust-KD) that combines feature-map and Jacobian matrix constraints. By transferring robust features from a robust network, we obtain strong white-box defense capability at a relatively low training cost. We conducted extensive experiments on the Cifar10, Cifar100, and ImageNet datasets, and the results demonstrate the effectiveness of the scheme. Even under very strong white-box attacks, our model still achieves good classification accuracy.

Key words: adversarial examples; model robustness; transfer learning; knowledge distillation
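
The abstract names the ingredients of Robust-KD (a robust teacher network, a feature-map constraint, and a Jacobian matrix constraint) but does not give the loss itself. The following is a minimal sketch of how such a combined objective might look, under stated assumptions: the two-layer toy network, the mean-squared-error form of both constraints, the weights `alpha` and `beta`, and all function names are hypothetical and are not taken from the paper. The Jacobian is approximated here by central finite differences purely to keep the sketch dependency-free; a practical implementation would more likely use an automatic differentiation framework.

```python
import math

def linear_layer(weights, x):
    """Dense layer without bias: returns weights @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, a) for a in v]

def forward(params, x):
    """Toy two-layer network (assumed); returns (feature_map, logits)."""
    w1, w2 = params
    features = relu(linear_layer(w1, x))
    logits = linear_layer(w2, features)
    return features, logits

def jacobian(params, x, eps=1e-5):
    """Central-difference Jacobian of the logits with respect to the input."""
    n_out = len(forward(params, x)[1])
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_hi = list(x)
        x_lo = list(x)
        x_hi[j] += eps
        x_lo[j] -= eps
        out_hi = forward(params, x_hi)[1]
        out_lo = forward(params, x_lo)[1]
        for i in range(n_out):
            jac[i][j] = (out_hi[i] - out_lo[i]) / (2 * eps)
    return jac

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def cross_entropy(logits, label):
    """Softmax cross-entropy, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def robust_kd_loss(student, teacher, x, label, alpha=1.0, beta=1.0):
    """Assumed objective: task loss + feature-map term + Jacobian term."""
    s_feat, s_logits = forward(student, x)
    t_feat, _ = forward(teacher, x)
    feat_term = mse(s_feat, t_feat)  # pull student features toward the teacher's
    s_jac = [v for row in jacobian(student, x) for v in row]
    t_jac = [v for row in jacobian(teacher, x) for v in row]
    jac_term = mse(s_jac, t_jac)     # pull student input sensitivity toward the teacher's
    return cross_entropy(s_logits, label) + alpha * feat_term + beta * jac_term
```

In this sketch the teacher is only evaluated, never updated, so the student is trained without generating adversarial examples at each step. That is one plausible reading of why transferring robustness could cost less than adversarial training.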