Cite this article:
DENG Kang, LUO Shenghai, PENG Anjie, ZENG Hui, HUANG Xiaofang. Blind forensics of adversarial images generated by C&W algorithm[J]. Journal of Cyber Security, 2020, 5(6): 1-10
Abstract (Chinese version, translated):
Adversarial images can fool deep learning networks, so defense mechanisms against adversarial examples are urgently needed to strengthen the security of deep learning models. The C&W attack is currently a popular white-box attack algorithm; the adversarial images it generates feature high image quality, transferability, strong attack capability, and difficulty of defense. Taking adversarial images generated by the C&W attack as the research object, this paper adopts a digital image forensics approach to detect C&W adversarial images and reject them as inputs to deep learning networks. Based on the assumption that the adversarial perturbation in an adversarial image is easily destroyed, we design a detection algorithm built on the FFDNet filter. Specifically, FFDNet is a smoothing filter based on a deep convolutional neural network (CNN); it destroys the adversarial perturbation, so the deep learning model produces inconsistent outputs for an adversarial image before and after filtering. A test image whose outputs are inconsistent is judged to be a C&W adversarial image. We generated six kinds of C&W adversarial images against the classic ResNet deep network on the ImageNet-1000 database. Experimental results show that the proposed method detects C&W adversarial images well. Compared with existing work, it not only greatly reduces the false positive rate but also improves the detection accuracy for C&W adversarial images.
Keywords: deep learning; adversarial examples; digital image forensics; image filtering
DOI: 10.19363/J.cnki.cn10-1380/tn.2020.11.01
Received: 2019-12-31; Revised: 2020-04-03
Foundation: This work was supported by the National Natural Science Foundation of China (No. 61702429), the Science and Technology Department of Sichuan Province (No. 19yyjc1656), and the Education Department of Sichuan Province (No. 17ZB0450).
|
Blind forensics of adversarial images generated by C&W algorithm |
DENG Kang1, LUO Shenghai1, PENG Anjie1,2, ZENG Hui1,2, HUANG Xiaofang1
(1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China; 2. Guangdong Key Laboratory of Information Security Technology, Sun Yat-Sen University, Guangzhou 510275, China)
Abstract:
Adversarial images, which can fool deep neural networks (DNNs), have drawn researchers' attention to hardening DNNs against adversarial attacks. Among typical attack algorithms, the C&W attack is one of the strongest: it achieves high attack success rates while introducing small adversarial perturbations to the original image, and it is taken as a benchmark in defense attempts. In this paper, we employ a blind-forensics methodology to detect C&W adversarial images, aiming to keep adversarial inputs away from deep neural networks. Supposing that adversarial perturbations are easily damaged by certain image processing operations, we propose a detection method using the fast and flexible denoising convolutional neural network FFDNet. Specifically, we compare the model's predictions on a test image and on its filtered version. If the original and filtered inputs produce substantially different outputs from the model, the test image is likely to be adversarial. We employ ResNet as the target network and generate six kinds of C&W adversarial images on the ImageNet-1000 database. Experimental results show that the proposed method is effective in detecting C&W adversarial images and outperforms the state of the art in terms of both false positive and true positive rates.
Key words: deep learning; adversarial images; digital image forensics; image filtering