Abstract:
Deepfake face forgery, a malicious technique for fabricating false facial features or entirely reshaping faces, poses a serious threat to personal reputation and social security. To counter this threat, some researchers have proposed applying adversarial perturbations to resist deepfake models: the perturbations are designed to severely corrupt the models' outputs and thereby neutralize their forgery capability. Although adding adversarial perturbations does produce clear visual differences between the generated forged face images and the authentic ones, such perturbations offer relatively effective protection only against the specific deepfake models known during training and can hardly cover the broader range of threats. To address this problem, this paper proposes training adversarial perturbations against multiple models whose internal architectures differ widely, so that the resulting perturbation can universally protect face images from a variety of both known and unknown forgery models. First, we build a multi-model parallel attack pipeline that launches adversarial attacks on multiple deepfake models simultaneously, continually strengthening the perturbation's resistance to each model. Second, we design a novel saliency-biased fusion strategy to mitigate the conflicts among the adversarial perturbations produced by different pipeline branches. In addition, to improve generalization and better simulate real-world image capture conditions, we apply data augmentation to the original images. Finally, after the perturbations from the parallel branches are fused, a heuristic tree-structured Parzen method is used to automatically search for the optimal attack step size, further alleviating cross-model perturbation compatibility issues. Extensive experiments show that the proposed multi-model adversarial perturbation effectively corrupts the fake face images generated by a variety of deepfake models and strikes a good balance among robustness, efficiency, and generalization.
Key words: multi-model universal image perturbation; adversarial attack; deepfake; machine learning
DOI: 10.19363/J.cnki.cn10-1380/tn.2025.07.04
Received: 2023-12-14; Revised: 2024-03-07
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62072250, 62172435, U1804263, U20B2065, 61872203, 71802110, 61802212); the Zhongyuan Science and Technology Innovation Leading Talent Project (No. 214200510019); the National Key Research and Development Program of China (No. 2021QY0700); the Natural Science Foundation of Jiangsu Province (No. BK20200750); the Open Fund of the Key Laboratory of Cyberspace Situation Awareness of Henan Province (No. HNTS2022002); the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX200974); the Open Project Fund of the Shandong Provincial Key Laboratory of Computer Networks (No. SDKLCN-2022-05); the Humanities and Social Sciences Project of the Ministry of Education (No. 19YJA630061); and the National College Students' Innovation and Entrepreneurship Training Program (No. 202310300021Z).
|
Robust Image Perturbation Based on Multi-Forgery Model Adversarial Optimization |
MU Wenpeng, WANG Jinwei, CHEN Beijin, NIE Lina, MO Haolan, XU Fei
School of Computer Science and Cyberspace Security, Nanjing University of Information Science & Technology, Nanjing 210044, China; Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science & Technology, Nanjing 210044, China
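The abstract describes a multi-model parallel attack pipeline whose branch gradients are combined with a saliency-biased fusion strategy. The paper's implementation is not reproduced here; the following is a minimal PyTorch-style sketch of the idea, in which `forgery_models`, the MSE distortion loss, and the magnitude-based fusion weights are all illustrative assumptions rather than the authors' design.

```python
# Illustrative sketch only (assumed PyTorch setup, not the authors' code).
# `forgery_models` is a list of frozen image-to-image generators; `x` is a
# batch of face images in [0, 1].
import torch
import torch.nn.functional as F

def multi_model_perturbation(x, forgery_models, steps=50,
                             step_size=2 / 255, eps=8 / 255):
    """Craft a single perturbation that disrupts every listed model."""
    # Cache each model's clean output once; the attack pushes the
    # perturbed outputs away from these references.
    with torch.no_grad():
        clean_outs = [g(x) for g in forgery_models]

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        branch_grads = []
        for g, ref in zip(forgery_models, clean_outs):
            # Maximize the distortion of each model's forged output.
            loss = F.mse_loss(g(x + delta), ref)
            branch_grads.append(torch.autograd.grad(loss, delta)[0])

        # Saliency-biased fusion (illustrative stand-in): weight each
        # branch by its mean gradient magnitude instead of averaging
        # uniformly, so conflicting branches are softly balanced.
        mags = torch.stack([gr.abs().mean() for gr in branch_grads])
        weights = mags / mags.sum()
        fused = sum(w * gr.sign() for w, gr in zip(weights, branch_grads))

        with torch.no_grad():
            delta += step_size * fused   # gradient ascent step
            delta.clamp_(-eps, eps)      # keep the perturbation bounded
    return delta.detach()
```

Weighting branches by gradient magnitude is only one plausible reading of "saliency-biased"; the key point the sketch illustrates is that the branches are fused by a biased weighting rather than a plain average.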
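The abstract also mentions a heuristic tree-structured Parzen search for the attack step size after branch fusion. One common way to realize such a search is with the hyperopt library's TPE sampler; the sketch below reuses `x`, `forgery_models`, and `multi_model_perturbation` from the sketch above, and assumes mean output distortion as the tuning objective and a log-uniform search range, neither of which is specified by the paper.

```python
# Illustrative sketch only: tuning the attack step size with hyperopt's
# Tree-structured Parzen Estimator (TPE). The objective and search range
# are assumptions, not the authors' configuration.
import torch
import torch.nn.functional as F
from hyperopt import fmin, tpe, hp

def objective(step_size):
    delta = multi_model_perturbation(x, forgery_models, step_size=step_size)
    with torch.no_grad():
        # Average distortion across models; hyperopt minimizes, so negate.
        distortion = sum(F.mse_loss(g(x + delta), g(x)).item()
                         for g in forgery_models) / len(forgery_models)
    return -distortion

best = fmin(fn=objective,
            space=hp.loguniform('step_size', -7, -3),  # ~0.0009 to ~0.05
            algo=tpe.suggest, max_evals=30)
print(best)  # e.g. {'step_size': ...}
```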