Chinese Text Recognition in Electromagnetic Emission Reconstructed Images Based on Domain Adaptive

吕志强; 于超; 李海洋; 张宁

Cite this article:

吕志强,于超,李海洋,张宁.基于域适应的电磁泄漏还原图像中文文本识别[J].信息安全学报,已采用
Lv Zhiqiang, Yu Chao, Li Haiyang, Zhang Ning. Chinese Text Recognition in Electromagnetic Emission Reconstructed Images Based on Domain Adaptive[J]. Journal of Cyber Security, Accepted

(Institute of Information Engineering, Chinese Academy of Sciences)

Abstract:

Computer display systems produce electromagnetic emissions while transmitting and displaying information. However, the emitted video signal picked up by a receiver has a very low signal-to-noise ratio, which makes effective text recognition on the reconstructed images difficult. Very little existing work addresses text recognition for Chinese text images with a low signal-to-noise ratio. In this paper, we propose a CRNN (Convolutional Recurrent Neural Network) text recognition model based on domain adaptation. The model uses unlabeled text images collected in an electromagnetic emission environment as target-domain data and normal labeled text images as source-domain data. It combines a Convolutional Neural Network (CNN) with a Domain Discrimination Module (DDM) and is trained in a semi-supervised manner, so that the CNN extracts, as far as possible, the character features in the text images that are independent of random noise, which improves recognition accuracy under real noise conditions. The model was tested on the public datasets RCTW17 and CASIA-10k under real electromagnetic emission reconstruction conditions. Compared with mainstream recognition models, the domain-adaptation-based CRNN achieves a clear improvement in Chinese recognition rate on text images reconstructed from electromagnetic emissions.

Keywords: electromagnetic emission, text recognition, domain adaptation, semi-supervised learning, neural network

DOI：10.19363/J.cnki.cn10-1380/tn.2023.08.03

Submitted: 2020-12-24. Revised: 2021-03-04.


Chinese Text Recognition in Electromagnetic Emission Reconstructed Images Based on Domain Adaptive

Lv Zhiqiang, Yu Chao, Li Haiyang, Zhang Ning

(Institute of Information Engineering, Chinese Academy of Sciences)

Abstract:

Electromagnetic emission occurs during information transmission and display in computer display systems. However, the signal-to-noise ratio of the emitted video signal picked up by a receiver is very low, which makes effective text recognition on the restored image difficult. There are few text recognition methods for Chinese text images with a low signal-to-noise ratio. In this paper, we propose a CRNN (Convolutional Recurrent Neural Network) text recognition model based on domain adaptation, which uses unlabeled text images collected in the electromagnetic emission environment as target-domain data and normal labeled text images as source-domain data. The model combines a Convolutional Neural Network (CNN) with a Domain Discrimination Module (DDM) and is trained with a semi-supervised learning method, so that the CNN extracts, as far as possible, the character features in the text image that are unrelated to random noise. This improves the accuracy of text recognition in images reconstructed from the emissions of a target computer. Experiments were performed on the public datasets RCTW17 and CASIA-10k as reconstructed from the electromagnetic emissions of a target computer. The results show that our method outperforms mainstream recognition models.

Key words: electromagnetic emission, text recognition, domain adaptation, semi-supervised learning, neural network
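The abstract describes coupling the CNN feature extractor with a Domain Discrimination Module so that the learned features do not depend on noise. The abstract does not say how the DDM is trained. One common realization of this idea is adversarial training with a reversed gradient, and the toy sketch below illustrates that assumption only, not the authors' actual implementation. The scalar weights `w_feat` and `w_disc`, the function names, and the factor `lam` are all hypothetical. The recognition loss on labeled source images is omitted for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_bce(prob, is_target):
    """Binary cross-entropy for the domain label (source = 0, target = 1)."""
    eps = 1e-12
    return -math.log(prob + eps) if is_target else -math.log(1.0 - prob + eps)


def domain_step(w_feat, w_disc, x, is_target, lam=1.0, lr=0.1):
    """One SGD step on a scalar toy model (hypothetical, for illustration).

    feature:      f = w_feat * x           (stands in for the CNN)
    domain prob:  p = sigmoid(w_disc * f)  (stands in for the discriminator)
    """
    f = w_feat * x
    p = sigmoid(w_disc * f)
    y = 1.0 if is_target else 0.0

    dz = p - y                    # dL/dz for sigmoid + cross-entropy
    grad_disc = dz * f            # dL/dw_disc
    grad_feat = dz * w_disc * x   # dL/dw_feat

    # The discriminator descends the domain loss, while the feature
    # extractor receives the reversed gradient (scaled by lam), so it
    # is pushed toward features the discriminator cannot separate.
    new_w_disc = w_disc - lr * grad_disc
    new_w_feat = w_feat - lr * (-lam * grad_feat)
    return new_w_feat, new_w_disc


# Example: one update on a single target-domain (unlabeled) sample.
w_feat, w_disc = domain_step(1.0, 1.0, x=1.0, is_target=True)
```

In this sketch only the domain label is needed for the target sample, which is how unlabeled target images can contribute to training while the text labels come from the source domain alone.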