Cite this article
  • Ren Ji Xing, Xu Wei, Wang Run, Li Boheng, Zhang Yuyang, Wang Lina. A Robust Watermarking Scheme for Deep Neural Networks based on Machine Unlearning [J]. Journal of Cyber Security, accepted.


DOI:
Submitted: 2024-08-10; Revised: 2024-10-25
Funding: National Key R&D Program of China, Young Scientists Project (2021YFB3100700); National Natural Science Foundation of China (62202340, 62372334); Key Project of the Open Research Fund of the Henan Key Laboratory of Cyberspace Situation Awareness (HNTS2022004); Knowledge Innovation Program of Wuhan (2022010801020127); Fundamental Research Funds for the Central Universities (2042023kf0121); CCF-NSFOCUS "Kunpeng" Research Fund (CCF-NSFOCUS 2023005)
A Robust Watermarking Scheme for Deep Neural Networks based on Machine Unlearning
Ren Ji Xing, Xu Wei, Wang Run, Li Boheng, Zhang Yuyang, Wang Lina
(Wuhan University)
Abstract:
In recent years, Deep Neural Networks (DNNs) have achieved remarkable success in many cutting-edge fields such as image processing, speech recognition, and natural language processing. These models bring substantial economic benefits to the companies and teams that develop them. At the same time, training a DNN model requires large amounts of data and computation, and the cost grows rapidly as the number of model parameters increases, so a well-trained DNN model is a highly valuable asset to its owner. Unfortunately, such high-value, well-trained models face security threats including model stealing, misuse, and illegal distribution. DNN watermarking is an important means of protecting model copyright. Depending on whether the watermark is embedded in the model parameters, DNN watermarking can be divided into static and dynamic watermarking. Static watermarking is difficult to apply in practice because verification requires white-box access to the model, while the dynamic watermarking paradigm, which embeds mappings between verification samples and labels into the model, struggles to withstand watermark removal attacks. Existing watermarking methods therefore lack robustness, which introduces considerable risk when they are deployed in real applications. This paper proposes a robust DNN watermarking method based on machine unlearning. Unlike existing methods, it embeds the watermark by using machine unlearning to erase the model's original mappings for selected samples, rather than adding new sample-label mappings as in the traditional paradigm, so the watermark evades erasure by removal attacks and its robustness is greatly improved. Specifically, the method uses a sample-selection strategy based on sample similarity to choose the training samples to be forgotten, and then selectively unlearns the mapping relationships of these samples through gradient ascent, thereby improving watermark robustness. The method effectively withstands multiple watermark removal attacks and shows excellent robustness in experiments on the CIFAR-10, CIFAR-100, and TinyImageNet datasets: under various watermark removal attacks, its average watermark extraction effectiveness exceeds 98%.
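The abstract describes the embedding pipeline only at a high level (similarity-based selection of samples to forget, gradient-ascent unlearning of their label mappings, and verification by watermark extraction), so the following PyTorch-style sketch illustrates that general idea under stated assumptions rather than the authors' implementation. The function names `select_forget_set`, `unlearn_by_gradient_ascent`, and `extract_watermark`, the use of cosine similarity over model outputs as the similarity measure, and all hyperparameters are assumptions introduced here for illustration only.

```python
# Illustrative PyTorch sketch of an unlearning-based watermark (not the paper's code).
import torch
import torch.nn.functional as F


def select_forget_set(model, loader, key_x, k=100, device="cpu"):
    """Rank training samples by similarity to a key sample and keep the top k.
    Cosine similarity over the model's output logits is an assumed stand-in
    for the paper's sample-similarity criterion."""
    model.eval()
    scored = []
    with torch.no_grad():
        key_out = model(key_x.unsqueeze(0).to(device))       # [1, num_classes]
        for x, y in loader:
            out = model(x.to(device))                         # [B, num_classes]
            sims = F.cosine_similarity(out, key_out, dim=1)   # broadcasts to [B]
            for xi, yi, si in zip(x, y, sims.cpu()):
                scored.append((si.item(), xi, yi))
    scored.sort(key=lambda t: t[0], reverse=True)
    xs = torch.stack([xi for _, xi, _ in scored[:k]])
    ys = torch.stack([yi for _, _, yi in scored[:k]])
    return xs, ys


def unlearn_by_gradient_ascent(model, xs, ys, steps=50, lr=1e-4, device="cpu"):
    """Embed the watermark by erasing the selected sample-to-label mappings:
    maximize (rather than minimize) the loss on the forget set, instead of
    adding new trigger samples as in backdoor-style watermarks."""
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    xs, ys = xs.to(device), ys.to(device)
    for _ in range(steps):
        opt.zero_grad()
        loss = -F.cross_entropy(model(xs), ys)   # negated loss => gradient ascent
        loss.backward()
        opt.step()
    return model


def extract_watermark(model, xs, ys, device="cpu"):
    """Treat the watermark as present if the (suspect) model also fails to
    predict the original labels on the forgotten samples; returns the
    fraction of forgotten samples whose mapping remains erased."""
    model.eval()
    with torch.no_grad():
        preds = model(xs.to(device)).argmax(dim=1).cpu()
    return (preds != ys).float().mean().item()
```

In this sketch, the owner would run the three functions in sequence on their own model and record the forget rate; at verification time, a forget rate close to 1.0 on a suspect model plays the role of the watermark extraction effectiveness reported above (exceeding 98% on average). The exact selection criterion, verification statistic, and thresholds used in the paper may differ.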
Key words:  Deep Neural Networks, Copyright Protection, DNN Watermarking, Machine Unlearning