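| Abstract: |
| Governing cyber pollution has long been an important problem for practitioners in information content security. The spread of massive amounts of cyber pollution content not only pollutes the Internet environment but also hinders the healthy development of society, so combating cyber pollution behavior has become a key line of defense in national cybersecurity. However, traffic encryption and changing client addresses pose major challenges to analyzing cyber pollution behavior: when only traffic is available, both locating the polluting user and tracing the behavior back to its source are difficult. To address these problems, this paper proposes a technique for correlating cyber pollution behaviors with their subjects under encrypted traffic. The method extracts the traffic features of each client address over a period of time as that address's network behavior knowledge graph, and builds an address correlation model, PolluTracker, based on graph neural networks and Siamese networks, enabling address correlation and long-term tracing of cyber pollution users. We conducted extensive experiments on a five-month dataset of real user traffic. The results show that the method correlates the addresses of polluting subjects with 99% accuracy, an improvement of up to 0.90 times over four existing correlation methods. Ablation experiments, adversarial experiments, and real-world case studies show that the method effectively supports long-term behavioral correlation analysis of target polluting users, and that its correlation performance is both robust and resistant to evasion. |
| Keywords: cyber pollution; network behavior analysis; encrypted traffic; graph representation learning; metric learning |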
| DOI:10.19363/J.cnki.cn10-1380/tn.2025.09.04 |
| Received: 2024-02-05; Revised: 2024-04-02 |
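| Funding: This work was supported by the National Key Research and Development Program of China project "Detection, Behavior Identification, and Handling of Cyber Pollution in Encrypted Traffic" (No. 2021YFB3101400). |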
|
| Correlation Technology between Cyber Pollution Behaviors and Subjects under Encrypted Traffic |
| CUI Tianyu, HOU Chengshang, LIU Chang, SHI Junzheng, GOU Gaopeng, XIONG Gang |
| Zhongguancun Laboratory, Beijing 100094, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China |
| Abstract: |
| Governing cyber pollution remains a central challenge for practitioners in information content security. The spread of massive amounts of cyber pollution content not only contaminates the Internet environment but also hinders the healthy development of society, so combating cyber pollution behavior has become a key line of defense in national cybersecurity. However, traffic encryption and constantly changing client addresses make cyber pollution behavior hard to analyze: when only traffic is available, both locating the responsible user and tracing the behavior back to its source are difficult. To address these problems, we propose a technique for correlating cyber pollution behaviors with their subjects under encrypted traffic. The method extracts the traffic features of each client address over a period of time as a network behavior knowledge graph for that address. On top of these graphs, we build an address correlation model, PolluTracker, based on Graph Neural Networks and Siamese Networks, which correlates the addresses used by cyber pollution users and supports long-term tracing of their activity. Extensive experiments on a five-month dataset of real user traffic show that the method correlates the addresses of polluting subjects with 99% accuracy, an improvement of up to 0.90 times over four existing correlation methods. Ablation studies, adversarial tests, and real-world case analysis further show that the method effectively supports long-term behavioral correlation analysis of targeted polluting users, and that its correlation performance is both robust and resistant to evasion. |
| Key words: cyber pollution; network behavior analysis; encrypted traffic; graph representation learning; metric learning |