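Abstract:
In recent years, fierce competition among Internet companies has led platforms to maliciously intercept one another's text information, which often inconveniences users and can even cause them losses. At present, in the field of Chinese text information filtering, "Martian" (火星文) is highly effective at evading keyword blocking. However, with the rapid development of artificial intelligence, detection techniques keep improving, and evading keyword blocking alone is no longer enough to keep text transmission secure: key information in the text can still be intercepted, because the way such information is presented usually follows regular patterns. To address this problem, this paper adopts text information hiding. In view of the limitations of traditional text steganography algorithms, we propose a text steganographic system based on "Martian" generation. The system exploits the fact that "Martian" carries more redundant information than the language forms of traditional print media to hide important content in the text. It consists of three basic modules: preprocessing, control, and steganography. Based on a study of the structural characteristics of Chinese characters and an analysis of how "Martian" characters are constructed, we design six steganographic submodules for information embedding and extraction. Experimental results show that the proposed scheme achieves a higher embedding capacity than comparable steganographic schemes and has strong robustness. In addition, we present a concrete application of the system on the Internet to demonstrate its practicality.
Keywords: text information filtering; keyword blocking; text information hiding; Martian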
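DOI:10.19363/J.cnki.cn10-1380/tn.2022.09.01
Received: 2021-09-17; Revised: 2021-11-11
Funding: This work was supported by the National Natural Science Foundation of China (No. 62072237) and by the Natural Science Foundation of Jiangsu Province General Program project "Research on Low-Energy Acquisition Technology for IoT Data Security Based on Chaotic Compressed Sensing" (No. BK20201290).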
|
A Steganographic System based on “Martian” Generation for Avoidance of Text Information Interception |
ZHU Jiahao, ZHANG Yushu, LIU Zhe, ZHANG Xinpeng
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; School of Computer Science, Fudan University, Shanghai 200433, China
Abstract: |
In recent years, fierce competition among Internet companies has led platforms to maliciously intercept one another's text information, which can inconvenience users and even cause them losses. In Chinese text information filtering, "Martian" is currently effective at evading keyword blocking because "Martian" characters are formed in complex ways and come in many varieties. However, with the rapid development of artificial intelligence, information filtering technology has improved dramatically, so evading keyword blocking alone is far from enough to keep text transmission secure. Vital text content can still be intercepted by information filtering systems because its presentation pattern is regular. To address this problem, we adopt text information hiding. Given that "Martian" carries more redundant information than the language forms of traditional print media, and that traditional text information hiding schemes are not suited to evading text information blocking, we select "Martian" as the steganographic carrier and propose a steganographic system based on "Martian" generation. Combining "Martian" with text information hiding keeps vital and sensitive information from being exposed in the text, which improves the security of information transmission. The proposed system consists of three basic modules: a preprocessing module, which normalizes the data to be embedded; a control module, which assigns tasks during data embedding and extraction; and a steganographic module, which carries out the specific embedding and extraction tasks.
Based on a study of the structural characteristics of Chinese characters and an analysis of how "Martian" characters are constructed, we design six steganographic submodules for data embedding and extraction. Extensive experiments and theoretical analysis demonstrate that the proposed text steganographic system achieves higher embedding capacity than similar text steganographic schemes and offers good robustness. In addition, we present a specific Internet application of the system, which demonstrates its practicality.
Key words: text information filtering; keyword blocking; text information hiding; Martian
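
The abstract does not specify how the six steganographic submodules work, so the following is only a minimal sketch of the three-module pipeline under stated assumptions. It assumes a single hypothetical submodule that substitutes an ordinary character with one "Martian" look-alike to carry one bit; the `VARIANTS` table, the 16-bit length header, and all function names are illustrative and are not taken from the paper. It also assumes the cover text contains no variant characters before embedding.

```python
# Hypothetical table (an assumption, not from the paper):
# ordinary character -> one "Martian" look-alike variant.
VARIANTS = {"的": "啲", "我": "莪", "你": "伱", "是": "湜", "们": "們", "个": "個"}
REVERSE = {v: k for k, v in VARIANTS.items()}


def preprocess(secret: str) -> list[int]:
    """Preprocessing: normalize the secret into bits, prefixed by a 16-bit byte length."""
    data = secret.encode("utf-8")
    payload = len(data).to_bytes(2, "big") + data
    return [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]


def embed(cover: str, secret: str) -> str:
    """Control loop: hand each embeddable character to the substitution submodule."""
    bits = preprocess(secret)
    out, i = [], 0
    for ch in cover:
        if ch in VARIANTS and i < len(bits):
            # Substitution submodule: bit 1 -> Martian variant, bit 0 -> unchanged.
            out.append(VARIANTS[ch] if bits[i] else ch)
            i += 1
        else:
            out.append(ch)
    if i < len(bits):
        raise ValueError("cover text has too few embeddable characters")
    return "".join(out)


def extract(stego: str) -> str:
    """Read one bit per carrier character, then decode the length-prefixed payload."""
    bits = [1 if ch in REVERSE else 0 for ch in stego if ch in VARIANTS or ch in REVERSE]
    data = bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits) - 7, 8)
    )
    length = int.from_bytes(data[:2], "big")
    return data[2:2 + length].decode("utf-8")
```

In this sketch the capacity is one bit per carrier character, so a cover text needs at least 16 + 8n carrier characters to hide an n-byte secret. A system with several submodules, as the abstract describes, would presumably embed more bits per character, but that design is not reproduced here.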