Chinese Text Recognition in Electromagnetic Emission Reconstructed Images

LV Zhiqiang; ZHANG Lei; XIA Yuqi; ZHANG Ning

Cite this article:

LV Zhiqiang, ZHANG Lei, XIA Yuqi, ZHANG Ning. Chinese Text Recognition in Electromagnetic Emission Reconstructed Images[J]. Journal of Cyber Security, 2021, 6(3): 212-226

LV Zhiqiang^1,2, ZHANG Lei^1,2, XIA Yuqi^1,2, ZHANG Ning^1
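(1. The 4th Laboratory, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; 2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100093, China)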

Abstract:

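Electromagnetic emission occurs during display signal transmission in modern computers. Images reconstructed from this emission are severely contaminated by noise, which makes their text content difficult to recognize. This paper proposes a new model that uses a Feature Enhancement based Neural Network (FENN) to recognize Chinese text in electromagnetic emission reconstructed images. The model combines a Denoising Autoencoder (DAE) with a Convolutional Neural Network (CNN) to enhance the text features of the reconstructed images and suppress noise interference, and it feeds the resulting robust features into a subsequent Recurrent Neural Network (RNN) without losing the original image information. Finally, the Connectionist Temporal Classification loss (CTC Loss) is combined with the Mean Squared Error loss (MSE Loss) into a joint loss under which the model is trained jointly, so that Chinese text can be recognized without conventional preprocessing such as denoising. The model was tested on real-scene data reconstructed from electromagnetic emission and on the public datasets RCTW17 and CASIA-10k. Compared with common mainstream recognition models, FENN improves the Chinese recognition rate on electromagnetic emission reconstructed images by up to 5.4%, a clear advantage.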

Keywords: electromagnetic emission; denoising autoencoder; feature enhancement; Chinese text recognition; neural network

DOI：10.19363/J.cnki.cn10-1380/tn.2021.05.14

Received: 2019-07-10; Revised: 2019-10-18

Funding: This work was supported by the National Key Research and Development Program of China (No. 2018YFF01014303).

Chinese Text Recognition in Electromagnetic Emission Reconstructed Images

LV Zhiqiang^1,2, ZHANG Lei^1,2, XIA Yuqi^1,2, ZHANG Ning^1

(1.The 4th Laboratory, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;2.School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100093, China)

Abstract:

Electromagnetic emission occurs during display signal transmission in modern computers. By receiving and restoring these signals with eavesdropping equipment, one can reconstruct the information displayed on a target computer. However, the reconstructed images are heavily corrupted by noise, which makes their content difficult to recognize. In this paper, we propose a new model, a Feature Enhancement based Neural Network (FENN), that recognizes Chinese text lines in reconstructed images. The model combines a Convolutional Neural Network (CNN) with a Denoising Autoencoder (DAE) to enhance text features and suppress noise interference. The resulting robust features, extracted with the original image information preserved, are then fed into the subsequent Recurrent Neural Network (RNN). Finally, the Connectionist Temporal Classification (CTC) loss and the Mean Squared Error (MSE) loss are combined into a joint loss function under which the model is trained jointly, so that it can recognize Chinese text lines in reconstructed images without denoising or any other preprocessing. Experiments were performed on a dataset of reconstructed images and on the public datasets RCTW17 and CASIA-10k. The results show that our method outperforms common recognition methods by up to 5.4%.
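
The joint training objective described in the abstract, a CTC loss combined with an MSE loss, can be illustrated with a small pure-Python sketch. The abstract does not say how the two terms are weighted or exactly what the MSE term compares, so the following are assumptions made only for illustration: the MSE is taken between the autoencoder's reconstruction and a clean reference image, the two terms are summed with a hypothetical `weight` factor, and all function names are invented here. The CTC term uses the standard forward algorithm on per-frame class probabilities; it is not the authors' implementation.

```python
import math


def mse_loss(reconstructed, clean):
    """Mean squared error between two equally sized flat lists of pixel values."""
    assert len(reconstructed) == len(clean) and len(clean) > 0
    return sum((r - c) ** 2 for r, c in zip(reconstructed, clean)) / len(clean)


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC.

    probs[t][k] is the probability of class k at time step t
    (each row is assumed to be a normalized distribution).
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    num_steps, num_states = len(probs), len(ext)

    # alpha[s]: total probability of all partial alignments ending in state s
    alpha = [0.0] * num_states
    alpha[0] = probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, num_steps):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                 # stay in the same state
            if s >= 1:
                total += alpha[s - 1]        # advance by one state
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end in the last label or in the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


def joint_loss(probs, target, reconstructed, clean, weight=1.0):
    """Assumed form of the joint loss: CTC + weight * MSE (weight is hypothetical)."""
    return ctc_loss(probs, target) + weight * mse_loss(reconstructed, clean)


if __name__ == "__main__":
    # Two time steps, two classes (0 = blank, 1 = one character), uniform outputs.
    frame_probs = [[0.5, 0.5], [0.5, 0.5]]
    print(joint_loss(frame_probs, [1], [1.0, 0.0], [0.0, 0.0], weight=2.0))
```

In practice a deep learning framework's built-in CTC and MSE losses would be applied to batched tensors in log space; the plain-probability version here only makes the arithmetic visible and would underflow on long sequences.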

Key words: electromagnetic emission; denoising autoencoder; feature enhancement; Chinese text recognition; neural network