A Neural Network Repair Method Based on Divide-and-Conquer

SUN Shuo (孙朔); YAN Jun (严俊); YAN Rongjie (晏荣杰)

Cite this article:

SUN Shuo, YAN Jun, YAN Rongjie. A Neural Network Repair Method Based on Divide-and-Conquer[J]. Journal of Cyber Security (信息安全学报), 2023, 8(3): 27-37

SUN Shuo (孙朔)^1,2, YAN Jun (严俊)^1,3,2, YAN Rongjie (晏荣杰)^3,2
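(1. Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. State Key Laboratory of Computer Science, Beijing 100190, China)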

Abstract:

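Neural networks are an effective way to solve complex problems and are widely used in fields such as medical imaging and autonomous driving. However, neural networks are fragile: adding a small perturbation to a sample, imperceptible to the human eye, can cause the network to make a wrong judgment. When a neural network misbehaves, the usual remedy is to retrain or fine-tune it, but these approaches are costly and cannot guarantee that the erroneous behavior is completely repaired. This paper addresses the complete repair problem for neural networks: given a network to be repaired and a target sample set, the repaired network must achieve 100% accuracy on the target sample set. We propose a neural network repair method based on divide and conquer. The method repeatedly partitions the target sample set into smaller sets until each set reaches an acceptable size, repairs each resulting set in turn to obtain a local patch, and finally integrates all local patches into a patch for the entire feature space. Experiments on two public datasets show that our method outperforms the current state-of-the-art neural network repair algorithms. For target sample sets generated by adversarial attacks and backdoor attacks, our method not only fully repairs the network's behavior on the target sample set but also raises the network's accuracy on test sets generated by the same attack methods by 55.79% and 60.59%, respectively. Our method also avoids a large drop in the repaired network's accuracy on the standard test set.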

Keywords: bug fixing|neural network|divide and conquer|constraint solving

DOI：10.19363/J.cnki.cn10-1380/tn.2023.05.03

Received: 2022-09-10; Revised: 2022-12-28

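Funding: This work was supported by the National Natural Science Foundation of China (No. 62132020) and the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (No. QYZDJSSW-JSC036).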

A Neural Network Repair Method Based on Divide-and-Conquer

SUN Shuo^1,2, YAN Jun^1,3,2, YAN Rongjie^3,2

(1.Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China;2.University of Chinese Academy of Sciences, Beijing 100049, China;3.State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China)

Abstract:

As an effective method for solving complex problems, neural networks have been widely used in medical imaging, autonomous driving, and other fields. However, neural networks are very fragile: adding a tiny perturbation to a sample, barely perceptible to the human eye, can cause the network to make a wrong judgment. When a neural network exhibits erroneous behavior, the common repair method is to retrain or fine-tune it, but these methods are costly and cannot guarantee complete repair of the erroneous behavior. In this paper, we focus on the problem of complete repair of neural networks: given a neural network to be repaired and a target sample set, the repaired network must exhibit 100% accuracy on the target sample set. We propose a neural network repair method based on the idea of divide and conquer. In this method, we repeatedly divide the target sample set into smaller sets until each set reaches an acceptable size, then repair each resulting set one by one to obtain a local patch, and finally integrate all the local patches into a patch for the entire feature space. Experiments on two public datasets demonstrate that our method outperforms current state-of-the-art neural network repair algorithms. For target sample sets generated by adversarial attacks and backdoor attacks, our method not only completely repairs the behavior of the neural network on the target sample set but also improves the accuracy of the network on test sets generated by the same attack methods by 55.79% and 60.59%, respectively. At the same time, our method avoids a large reduction in the accuracy of the repaired network on the standard test set.

Key words: bug fixing|neural network|divide and conquer|constraint solving
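
The divide, repair, and integrate steps described in the abstract can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the authors' implementation: the abstract does not say how the sample set is partitioned, how a local patch is computed (the keywords suggest constraint solving), or how patches are merged. Here the "network" is any function on a 1-D feature, the split is a median cut on that feature, a local patch simply memorises the corrected labels inside its interval, and integration routes an input to the patch whose interval covers it. All names (`LocalPatch`, `repair_local`, `divide_and_repair`, `integrate`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]      # (1-D toy feature, correct label); features assumed distinct
Model = Callable[[float], int]


@dataclass
class LocalPatch:
    lo: float                   # interval of feature space covered by this patch
    hi: float
    overrides: Dict[float, int] # toy stand-in for a real repair: memorised corrections

    def covers(self, x: float) -> bool:
        return self.lo <= x <= self.hi

    def predict(self, x: float, fallback: Model) -> int:
        # Use the correction if one exists, otherwise defer to the original model.
        return self.overrides.get(x, fallback(x))


def repair_local(samples: List[Sample]) -> LocalPatch:
    """Conquer step (assumed): build one patch for a small, non-empty sample set."""
    xs = [x for x, _ in samples]
    return LocalPatch(min(xs), max(xs), dict(samples))


def divide_and_repair(samples: List[Sample], max_size: int) -> List[LocalPatch]:
    """Divide step: split until each subset has at most max_size samples."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if not samples:
        return []
    if len(samples) <= max_size:
        return [repair_local(samples)]
    ordered = sorted(samples)   # assumed split rule: median cut on the feature
    mid = len(ordered) // 2
    return (divide_and_repair(ordered[:mid], max_size)
            + divide_and_repair(ordered[mid:], max_size))


def integrate(base: Model, patches: List[LocalPatch]) -> Model:
    """Combine step: route each input to the patch covering it, else to the base model."""
    def repaired(x: float) -> int:
        for patch in patches:
            if patch.covers(x):
                return patch.predict(x, base)
        return base(x)
    return repaired


def base(x: float) -> int:
    """Toy 'network to be repaired': a fixed threshold classifier."""
    return int(x >= 0.5)


# Target samples; the base model gets four of the five wrong.
targets = [(0.1, 1), (0.2, 1), (0.7, 0), (0.9, 1), (0.3, 1)]
patches = divide_and_repair(targets, max_size=2)
repaired = integrate(base, patches)
```

In this toy, `repaired` agrees with every label in `targets`, while inputs outside every patch interval are still answered by `base`. That mirrors, in a very simplified way, the abstract's two goals of complete repair on the target set and limited impact on standard inputs.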