Cite this article: |
- 刘佳彤,张铭诚,王丽娜,汪润,叶茜.基于扩散模型的深度伪造人脸图像对抗扰动净化算法[J].信息安全学报,已采用
- LIU Jiatong, ZHANG Mingcheng, WANG Lina, WANG Run, YE Xi. Diffusion Model-based Adversarial Purification Algorithm for DeepFake Facial Images[J]. Journal of Cyber Security, Accepted
|
|
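Abstract: |
With the development of DeepFake technology, large numbers of realistic forged images have flooded daily life, and DeepFake detectors have emerged to reduce the potential threats posed by misuse of the technology. These detectors typically rely on deep neural networks to detect subtle discrepancies in forged images, which makes them vulnerable to adversarial attacks: an attacker can deceive a detector simply by adding noise. Effective adversarial purification methods are therefore urgently needed to improve detector robustness against adversarial attacks. However, when removing adversarial perturbations, existing purification algorithms tend to destroy fragile DeepFake traces and to introduce new forgery artifacts, so detectors still misjudge the purified images. To address these challenges, this paper proposes a dual-channel adversarial purification algorithm based on a diffusion model. First, a GridNet-based adversarial sample generation network is designed to guide the model toward learning richer adversarial features and to increase the diversity of the adversarial sample dataset used for training. Second, a dual-channel adversarial purification model is designed from the perspectives of multi-task learning and multi-level supervision. It uses a purification channel and an auxiliary generation channel to learn the differences between purified and adversarial samples, so that forgery traces in fake samples are preserved while adversarial perturbations are removed, and it exploits the strong generative capability of the diffusion model to avoid introducing additional forgery artifacts when purifying real samples. Finally, an additional supervision module is designed to ensure that the different generation tasks of the dual-channel purification model are completed effectively. Extensive experiments on the classic DeepFake datasets FaceForensics++ and Celeb-DF-v2 show that, for the black-box adversarial attacks ProS-GAN, advDeepFake, AdvShadow, and ARNet, purification with the proposed algorithm raises detection accuracy above 90.55% on spatial-domain, frequency-domain, and physiological-signal-based detectors, effectively removing the adversarial perturbations, with a clear advantage on frequency-domain detectors. |
Keywords: DeepFake; DeepFake detection; adversarial attacks; adversarial purification |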
DOI: |
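Received: 2024-11-26; Revised: 2025-02-18 |
Funding: National Natural Science Foundation of China (General Program), National Key Research and Development Program of China, National Natural Science Foundation of China (General Program, Key Program, Major Program), National Basic Research Program of China (973 Program) |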
|
Diffusion Model-based Adversarial Purification Algorithm for DeepFake Facial Images |
LIU Jiatong, ZHANG Mingcheng, WANG Lina, WANG Run, YE Xi
|
(Wuhan University) |
Abstract: |
With the advancement of DeepFake technology, large numbers of realistic fake images are flooding into daily life, and misuse of the technology poses significant potential threats. DeepFake detectors have emerged to mitigate these threats. These detectors are typically based on deep neural networks and aim to detect subtle discrepancies in fake images, but they are vulnerable to adversarial attacks: attackers can deceive them simply by adding noise. Effective adversarial purification methods are therefore urgently needed to enhance the robustness of detectors against adversarial attacks. However, existing adversarial purification algorithms often disrupt fragile DeepFake traces or introduce new forgery artifacts while eliminating adversarial perturbations, which still leads detectors to misjudge. To address these challenges, a dual-channel adversarial purification algorithm based on a diffusion model is proposed. First, a GridNet-based adversarial sample generation network is designed to guide the model in learning richer adversarial features, enhancing the diversity of the adversarial sample dataset used for training. Second, from the perspectives of multi-task learning and multi-level supervision, a dual-channel adversarial purification model is designed. This model leverages a purification channel and an auxiliary generation channel to thoroughly learn the differences between purified samples and adversarial samples, with the aim of preserving forgery traces in DeepFake samples while eliminating adversarial perturbations. Additionally, the powerful generative capability of the diffusion model is used to avoid introducing extra forgery artifacts when purifying real samples. Finally, an additional supervision module is introduced to ensure that the dual-channel purification model completes its different generation tasks effectively.
Extensive experimental results on the classic DeepFake datasets FaceForensics++ and Celeb-DF-v2 demonstrate that, for the black-box adversarial attack schemes ProS-GAN, advDeepFake, AdvShadow, and ARNet, purification with the proposed algorithm raises detection accuracy above 90.55% across spatial-based, frequency-based, and physiological-signal-based detectors. The method effectively eliminates adversarial perturbations and shows a notable advantage on frequency-domain detectors. |
Key words: DeepFake; DeepFake detection; adversarial attacks; adversarial purification |