| Abstract: |
| System logs are the core data recording runtime status, and their anomaly detection is of crucial significance for fault diagnosis and system optimization. However, when confronted with the high semantic dimensionality of log data, the dynamic evolution of patterns, and the scarcity of anomaly labels, existing unsupervised methods generally suffer from insufficiently discriminative feature representations and limited generalization to unknown patterns. To address this, this paper proposes LogCPT, an unsupervised log anomaly detection model based on contrastive pre-training and a hybrid Transformer-LSTM architecture. The model constructs a systematic learning paradigm of "self-supervised semantic enhancement → hybrid temporal modeling → geometric direction awareness", which reshapes the feature boundaries of normal patterns and improves detection efficiency. First, to overcome the feature-representation bottleneck in unlabeled settings, we introduce a self-supervised contrastive pre-training mechanism based on the InfoNCE loss: self-supervised tasks built from random mask reconstruction and local-shuffle augmentation force the model to learn the intrinsic semantic manifold of normal log patterns and highly discriminative, noise-robust representations without manual annotation. Second, we design a serial Transformer-LSTM architecture that organically fuses the Transformer's global context aggregation with the LSTM's capture of local temporal causality, strengthening the model's ability to represent complex logical dependencies in log streams. In addition, to handle the semantic drift and noise sensitivity common in log streams, we propose a direction-aware hybrid loss function that elevates discrimination from conventional numerical approximation to directional alignment in a high-dimensional geometric space, combined with a Top-K scoring mechanism to effectively suppress local noise. Experimental results show that LogCPT achieves the best performance among compared models, including MTSAD, on the HDFS, BGL, and CERT datasets, and it remains robust under local-shuffle perturbation, order-violation detection, and data-sparse scenarios with only a few training samples, verifying the efficiency and robustness of the method. |
| Key words: log anomaly detection; self-supervised contrastive pre-training; Transformer; LSTM; hybrid loss function |
| DOI: |
| Received: 2025-12-16; Revised: 2026-03-21 |
| Funding: National Natural Science Foundation of China [52233003]; Fundamental Research Funds for the Central Universities [2242022k60005] |
|
| LogCPT: A Log Anomaly Detection Model Based on Contrastive Pre-training and Hybrid Transformer-LSTM |
|
| litao, bakaiyang, huaiqun |
|
| (School of Cyberspace Security, Southeast University) |
| Abstract: |
| The system log serves as the core data for recording runtime status, and its anomaly detection is of crucial significance for fault diagnosis and system optimization. However, when confronted with the high semantic dimensionality of log data, the dynamic evolution of patterns, and the scarcity of anomaly labels, existing unsupervised methods generally suffer from insufficiently discriminative feature representations and limited generalization to unknown patterns. To address these issues, this paper proposes LogCPT, an unsupervised log anomaly detection model based on contrastive pre-training and a hybrid Transformer-LSTM architecture. The model constructs a systematic learning paradigm of "self-supervised semantic enhancement → hybrid temporal modeling → geometric direction perception", which reshapes the feature boundaries of normal patterns to improve detection efficiency. First, to address the feature-representation bottleneck in unlabeled scenarios, this paper introduces a self-supervised contrastive pre-training mechanism based on the InfoNCE loss. Self-supervised tasks built from random mask reconstruction and local-shuffle augmentation force the model to learn the intrinsic semantic manifold of normal log patterns and highly discriminative, noise-robust representations without manual annotation. Second, a serial Transformer-LSTM architecture is designed that organically combines the Transformer's global context aggregation with the LSTM's capture of local temporal causality, strengthening the model's ability to represent complex logical dependencies in log streams. Moreover, to address the semantic drift and noise sensitivity common in log streams, a direction-aware hybrid loss function is proposed, which elevates discrimination from traditional numerical approximation to directional alignment in high-dimensional geometric space, combined with a Top-K scoring mechanism to effectively suppress local noise. Experimental results on the HDFS dataset show that LogCPT outperforms the advanced MTSAD model by 6.57%, 6.82%, and 5.06% in precision, recall, and F1 score, and it maintains stable performance in data-sparse scenarios with only a small number of training samples, verifying the efficiency and robustness of this method. |
| Key words: log anomaly detection; self-supervised contrastive pre-training; Transformer; LSTM; hybrid loss function |
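The direction-aware hybrid loss and Top-K scoring summarized in the abstract can be sketched as follows. This is a minimal illustration, assuming the loss blends a cosine-direction term (geometric alignment) with a mean-squared-error term (numerical approximation) and that a window's anomaly score averages its K largest per-position errors; the weight `alpha`, function names, and aggregation details are illustrative assumptions, not the paper's exact formulation.

```python
import math

def direction_aware_loss(pred, target, alpha=0.5):
    """Toy hybrid loss: (1 - cosine similarity) aligns the *direction* of the
    predicted embedding with the target, while MSE constrains its magnitude.
    `alpha` is an illustrative mixing weight, not the paper's value."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred)) or 1e-12
    norm_t = math.sqrt(sum(t * t for t in target)) or 1e-12
    cos_term = 1.0 - dot / (norm_p * norm_t)
    mse_term = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return alpha * cos_term + (1.0 - alpha) * mse_term

def topk_anomaly_score(per_step_errors, k=3):
    """Score a log window by the mean of its K largest per-position errors,
    so a single noisy position cannot dominate the window-level decision."""
    top = sorted(per_step_errors, reverse=True)[:max(1, min(k, len(per_step_errors)))]
    return sum(top) / len(top)
```

For a perfectly predicted embedding the loss is zero; an embedding pointing in an orthogonal direction is penalized even when its magnitude is plausible, which is the intuition behind replacing pure numerical approximation with directional alignment.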