Cite this article:
  • WANG Xinchen, SU Qiuyang, YANG Dengqi, CHEN Benhui, LI Xiaowei. A Method of Image Adversarial Sample Based on Local Disturbance[J]. Journal of Cyber Security, 2022, 7(6): 94-104



A Method of Image Adversarial Sample Based on Local Disturbance
WANG Xinchen, SU Qiuyang, YANG Dengqi, CHEN Benhui, LI Xiaowei
(School of Mathematics and Computer Science, Dali University, Dali 671000, China)
Abstract:
In recent years, with the research and development of artificial intelligence, deep learning has been widely adopted and has shown strong results in fields such as natural language processing and computer vision. In computer vision in particular, deep learning achieves very high accuracy in image recognition and image classification. However, a growing body of research shows that deep neural networks carry security risks, among them adversarial sample attacks. An adversarial sample is a data sample to which a specific perturbation has been deliberately added; when such a sample is fed to a trained model, the neural network outputs a result that differs from the expected one. In scenarios with high security requirements, adversarial samples therefore pose a clear threat to applications built on deep neural networks. Current research on adversarial samples, both in China and abroad, focuses mainly on the image domain: an image adversarial sample is an image with specially crafted information added so that a neural-network-based image classifier misclassifies it. Existing methods for generating image adversarial samples mostly apply global perturbations, that is, the perturbation is spread over the entire image. Compared with global perturbation, local perturbation adds the generated perturbation only to the non-salient regions of an image, which makes the adversarial sample less conspicuous and harder for the human eye to detect. This paper proposes a method for generating image adversarial samples with local perturbations. The method first uses the Yolo object detection model to locate the salient regions of an image, and then, building on MI-FGSM and the idea from the Curls method of performing gradient descent before gradient ascent, adds perturbation only to the non-salient regions to produce a locally perturbed adversarial sample. Experimental results show that, even with a reduced perturbation region, the method achieves the same attack success rate as global perturbation.
Keywords: adversarial sample; local perturbation; object detection; neural network
DOI: 10.19363/J.cnki.cn10-1380/tn.2022.11.06
Received: 2022-07-07    Revised: 2022-10-13
Funding: This work was supported by the National Natural Science Foundation of China (No. 61902049, No. 62262001).
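
The generation procedure described in the abstract can be summarized in a short sketch. The Python/PyTorch code below is an illustrative reconstruction, not the authors' implementation: detection boxes (in the format an object detector such as Yolo would typically return) are turned into a binary mask that excludes object regions, and a masked MI-FGSM loop with a brief gradient-descent warm-up, in the spirit of the Curls method, perturbs only the unmasked pixels. The names non_object_mask and local_mifgsm, the box format, and the values of eps, alpha, steps, mu and warmup_descent_steps are assumptions made for the example.

# Illustrative sketch (PyTorch), not the paper's code: constrain an
# MI-FGSM-style attack to the regions outside detected objects.
import torch
import torch.nn.functional as F


def non_object_mask(image, boxes):
    """Return a (1, H, W) mask that is 0 inside detection boxes, 1 elsewhere.

    image: tensor of shape (C, H, W) with values in [0, 1].
    boxes: iterable of (x1, y1, x2, y2) pixel coordinates, e.g. the boxes
           a Yolo detector reports for the clean image (assumed format).
    """
    _, h, w = image.shape
    mask = torch.ones(1, h, w)
    for x1, y1, x2, y2 in boxes:
        mask[:, int(y1):int(y2), int(x1):int(x2)] = 0.0
    return mask


def local_mifgsm(model, image, label, mask,
                 eps=8 / 255, alpha=2 / 255, steps=10, mu=1.0,
                 warmup_descent_steps=2):
    """Masked MI-FGSM with a short gradient-descent warm-up (Curls-style).

    The first `warmup_descent_steps` iterations move against the loss
    gradient, the rest ascend it; every update is multiplied by `mask`,
    and the total perturbation is projected into the L_inf ball of
    radius `eps` around the clean image. Hyperparameters are illustrative.
    """
    x_adv = image.clone().detach()
    momentum = torch.zeros_like(image)
    target = torch.tensor([label])
    for i in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv.unsqueeze(0)), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Momentum accumulation as in MI-FGSM (L1-normalised gradient).
        momentum = mu * momentum + grad / grad.norm(p=1).clamp_min(1e-12)
        direction = -1.0 if i < warmup_descent_steps else 1.0
        x_adv = x_adv.detach() + direction * alpha * momentum.sign() * mask
        # Keep the perturbation local (masked) and inside the eps-ball.
        delta = (x_adv - image).clamp(-eps, eps) * mask
        x_adv = (image + delta).clamp(0.0, 1.0)
    return x_adv

In practice, boxes would be obtained by running the detector on the clean image, and model is the image classifier under attack; the final projection step keeps the perturbation both inside the mask and within the L-infinity ball of radius eps.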