Cite this article
  • ZHANG Yixuan, LI Gen, CAO Yun, ZHAO Xianfeng. A Method for Detecting Human-face-tampered Videos based on Interframe Difference [J]. Journal of Cyber Security, 2020, 5(2): 49-72

A Method for Detecting Human-face-tampered Videos based on Interframe Difference
ZHANG Yixuan, LI Gen, CAO Yun, ZHAO Xianfeng
(1. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; 2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100093, China)
Abstract:
In recent years, with the continuous upgrading of computer hardware and the rapid development of deep learning, newly emerged multimedia tampering tools have made it much easier to tamper with human faces in videos. Face-tampered videos produced with these tools are almost imperceptible to the naked eye, so effective methods for detecting them are urgently needed. Currently popular face tampering techniques mainly include the autoencoder-based Deepfake and the computer-graphics-based Face2face. We observe that the interframe differences of the face region in tampered videos are significantly larger than those in untampered videos, so the differences between face images in adjacent frames can serve as an important clue for tampering detection. In this paper, we propose a new detection framework for face-tampered videos based on interframe differences. We first use a detection method based on traditional hand-crafted features, namely Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG) features, to verify the effectiveness of the framework. We then incorporate a deep-learning-based method, namely a Siamese-network-based detector, to further strengthen the feature representation of the face images and improve detection performance. On the FaceForensics++ dataset, the LBP/HOG-feature-based method achieves high detection accuracy, while the Siamese-network-based method achieves even higher accuracy and shows strong robustness. Here, robustness means that the method maintains high detection accuracy in three different situations: when the differences between face images in adjacent frames are represented in two different ways, when frame pairs are extracted at three different intervals to compute the interframe differences, and when the training and test sets have different compression rates.
Key words:  video tampering  tampering detection  interframe difference  Siamese network  Deepfake  Face2face
DOI:10.19363/J.cnki.cn10-1380/tn.2020.02.05
Received: 2019-12-20; Revised: 2020-03-09
Funding: This work was supported by the National Key Research and Development Program of China (No. 19QY2202, No. 19QY(Y)0207) and by the Climbing Program of the Institute of Information Engineering, Chinese Academy of Sciences.
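
The abstract describes a hand-crafted baseline that represents the interframe difference of the face region with LBP/HOG features. The paper itself provides no code; the sketch below is a minimal illustration under our own assumptions (OpenCV Haar-cascade face detection, 128x128 grayscale crops, a uniform-LBP histogram plus HOG computed on the absolute difference image), and names such as interframe_features are hypothetical rather than taken from the paper.

# Hypothetical sketch: crop the face from two frames that are `interval`
# frames apart, take the absolute interframe difference of the face region,
# and describe it with LBP/HOG features. Parameters are illustrative only.
import cv2
import numpy as np
from skimage.feature import local_binary_pattern, hog

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_crop(frame, size=(128, 128)):
    """Return a grayscale face crop, or None if no face is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest face
    return cv2.resize(gray[y:y + h, x:x + w], size)

def lbp_histogram(img, p=8, r=1):
    """Histogram of uniform LBP codes (one possible difference representation)."""
    codes = local_binary_pattern(img, p, r, method="uniform")
    hist, _ = np.histogram(codes, bins=p + 2, range=(0, p + 2), density=True)
    return hist

def interframe_features(video_path, interval=1):
    """Yield LBP/HOG features of the face-region difference for frame pairs
    taken `interval` frames apart (the abstract mentions three intervals)."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    for i in range(0, len(frames) - interval, interval):
        a, b = face_crop(frames[i]), face_crop(frames[i + interval])
        if a is None or b is None:
            continue
        diff = cv2.absdiff(a, b)  # interframe difference of the face region
        yield np.concatenate([
            lbp_histogram(diff),
            hog(diff, orientations=9, pixels_per_cell=(16, 16),
                cells_per_block=(2, 2), feature_vector=True),
        ])

A conventional classifier (for example an SVM) trained on such per-pair feature vectors from real and tampered videos would complete this baseline; the paper's exact feature layout and classifier may differ.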
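
For the deep-learning branch, the abstract only states that a Siamese network strengthens the feature representation of the paired face images. The PyTorch sketch below illustrates that idea; the layer sizes, the absolute-difference fusion, and the two-way classification head are our assumptions, not the authors' published architecture.

# Hypothetical Siamese detector over face crops from adjacent frames: a
# shared CNN encodes both crops, and a small head classifies real vs.
# tampered from the difference of the two embeddings.
import torch
import torch.nn as nn

class SiameseDetector(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # Shared encoder applied to both face crops; weight sharing is the
        # defining property of a Siamese network.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim),
        )
        # Classifier over the element-wise difference of the embeddings,
        # i.e. a learned representation of the interframe difference.
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),  # logits: [untampered, tampered]
        )

    def forward(self, face_a, face_b):
        za, zb = self.encoder(face_a), self.encoder(face_b)
        return self.head(torch.abs(za - zb))

if __name__ == "__main__":
    model = SiameseDetector()
    a = torch.randn(4, 3, 128, 128)   # face crops from frame t
    b = torch.randn(4, 3, 128, 128)   # face crops from frame t + interval
    print(model(a, b).shape)          # torch.Size([4, 2])

Fusing the two branches by an absolute difference of embeddings mirrors the paper's central clue, that tampered videos show larger face-region interframe differences than untampered ones, while the shared encoder learns a stronger representation than the hand-crafted LBP/HOG features.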