Cyber Security Knowledge Graph Entity Alignment via Bootstrapping with Reinforcement Learning

Gao Zhanchao; Wang Ding; Yang Jinzhu; Zhou Wei

Cite this article:

Gao Zhanchao, Wang Ding, Yang Jinzhu, Zhou Wei. Cyber Security Knowledge Graph Entity Alignment via Bootstrapping with Reinforcement Learning[J]. Journal of Cyber Security, accepted.

Gao Zhanchao¹, Wang Ding¹, Yang Jinzhu², Zhou Wei³
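(1. Institute of Information Engineering, Chinese Academy of Sciences / Academy of Cyberspace Security, University of Chinese Academy of Sciences; 2. National Computer Network Emergency Response Technical Team/Coordination Center of China; 3. Institute of Information Engineering, Chinese Academy of Sciences)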

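Abstract:

Knowledge graphs offer an intuitive and efficient way to understand and use complex knowledge, and have been widely applied in cyber security areas such as cyberspace threat analysis and public opinion event prediction. However, the different cyber security subfields lack a unified standard for building knowledge graphs, so existing work tends to neglect generality and extensibility when constructing them. How to integrate existing cyber security knowledge graphs at scale is therefore a key problem in the field. Entity alignment is the key task in knowledge graph fusion, and existing work has thoroughly explored alignment by encoding the semantic and structural information of entities. These methods, however, rely on a large number of pre-aligned seed nodes to support learning, which makes them hard to apply effectively to cyber security knowledge graphs, where pre-aligned seeds are scarce. To alleviate the sparsity of pre-aligned seeds, some work expands the training data by iteratively selecting pseudo-aligned samples from unlabeled data. But these algorithms select samples with heuristic rules, cannot guarantee the structural consistency of the pseudo-aligned samples, and therefore struggle to mine high-quality pseudo-aligned samples across different cyber security knowledge graphs. To address these problems, this paper studies general-purpose cyber security knowledge graph fusion and, building on the basic graph structure and semantic information of knowledge graphs, proposes a new reinforcement-learning-based bootstrapping entity alignment model, Bootstrapping Entity Alignment with Reinforcement Learning (BEAR). The model uses the structural consistency of the graphs to automatically select high-quality pseudo-aligned samples that assist alignment. We formulate bootstrapping sample selection as a sequential decision problem and design a reinforcement learning framework to solve it, so that the model can automatically select the most effective pseudo-aligned samples. In addition, to make full use of the structural information in knowledge graphs while keeping entity and relation representations structurally independent, we design a new directional relation-aware graph convolutional network to learn entity and relation representations. Experiments on four real-world datasets show that the proposed BEAR model outperforms several state-of-the-art baselines on entity alignment.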

Keywords: entity alignment, graph neural network, reinforcement learning, cyber security knowledge graph

DOI：10.19363/J.cnki.cn10-1380/tn.2024.08.14

Received: 2023-03-03; Revised: 2023-03-24

Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)

Cyber Security Knowledge Graph Entity Alignment via Bootstrapping with Reinforcement Learning

Gao Zhanchao¹, Wang Ding¹, Yang Jinzhu², Zhou Wei³

(1. Institute of Information Engineering, Chinese Academy of Sciences / Academy of Cyberspace Security, University of Chinese Academy of Sciences; 2. National Computer Network Emergency Response Technical Team/Coordination Center of China; 3. Institute of Information Engineering, Chinese Academy of Sciences)

Abstract:

Knowledge graphs provide an intuitive and efficient way to understand and use complex knowledge, and have been widely applied in cyber security fields such as network security threat analysis and public opinion event prediction. However, different cyber security fields lack unified standards for knowledge graph construction, so existing work neglects the generality and extensibility of the knowledge graphs it builds. How to integrate existing cyber security knowledge graphs at scale is therefore a key problem in this field. Entity alignment is a key task in integrating knowledge graphs, and existing work has thoroughly explored how to align entities by encoding their semantic and structural information. However, these methods rely on a large number of pre-aligned seed nodes to assist learning, which makes them difficult to apply effectively to cyber security knowledge graphs, where pre-aligned seeds are scarce. To alleviate the sparsity of pre-aligned seed nodes, some work expands the training data by iteratively selecting pseudo-aligned samples from unlabeled data. However, these algorithms rely on heuristic rules when selecting samples and cannot guarantee the structural consistency of the pseudo-aligned samples, which makes it difficult to mine high-quality pseudo-aligned samples across different cyber security knowledge graphs. To solve these problems, this paper studies the general-purpose cyber security knowledge graph fusion task. Based on the basic graph structure and semantic information of the knowledge graph, we propose a new Bootstrapping Entity Alignment with Reinforcement Learning (BEAR) model. The model uses the structural consistency of the graphs to automatically select high-quality pseudo-aligned samples that assist alignment. We formulate the bootstrapping sample selection process as a sequential decision problem and design a reinforcement learning framework to solve it, so that the model can automatically select the most effective pseudo-aligned samples. In addition, to make full use of the structural information in the knowledge graph and to maintain structural independence when representing entities and relations, we design a new directional relation-aware graph convolutional network for learning entity and relation representations. Experimental results on four real-world datasets show that the proposed BEAR model outperforms several state-of-the-art baseline methods on entity alignment tasks.

Key words: entity alignment, graph neural network, reinforcement learning, cyber security knowledge graph