Cite this article
  • 赵昌志,李运鹏,杨光,黄克振,曹雅琴,刘玉岭.基于机器学习的日志异常检测技术研究综述[J].信息安全学报,已采用
  • ZHAO Changzhi,LI Yunpeng,YANG Guang,HUANG Kezhen,CAO Yaqin,LIU Yuling.A Survey: Researches on Log Anomaly Detection Techniques Based on Machine Learning[J].Journal of Cyber Security,Accepted

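A Survey: Researches on Log Anomaly Detection Techniques Based on Machine Learning (基于机器学习的日志异常检测技术研究综述)
ZHAO Changzhi (赵昌志)1, LI Yunpeng (李运鹏)1, YANG Guang (杨光)2, HUANG Kezhen (黄克振)3, CAO Yaqin (曹雅琴)1, LIU Yuling (刘玉岭)1
(1. Institute of Information Engineering, Chinese Academy of Sciences; 2. Southwest University of Science and Technology; 3. Institute of Software, Chinese Academy of Sciences)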
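Abstract:
As networked information systems grow in number and scale, cyberspace faces a growing range of security threats. These threats include not only common data breaches but also advanced, complex cyberattacks whose diversity, complexity, and stealth keep increasing, leaving attack and defence in cyberspace severely imbalanced. Meeting this challenge requires stronger defensive capabilities and effective response strategies. Cyberattacks usually leave traces in logs, so analysing log data can effectively identify anomalous behaviour and potential security threats. Anomaly detection on log data is therefore not only an effective means of network security protection but also one of the key research topics in the field. In recent years, machine learning has further improved the ability of log anomaly detection to uncover deep anomalous clues, yet substantial obstacles remain in moving such methods from research to practical application. This survey therefore comprehensively reviews log anomaly detection techniques based on machine learning and deep learning. It divides the detection workflow into four key stages (log collection, log parsing, feature representation, and anomaly detection) and analyses the key techniques used and the problems faced at each stage. It focuses on the detection algorithms: from the perspective of applicable scenarios and underlying data, it classifies existing work into anomaly detection algorithms based on discrete analysis, sequence correlation analysis, and graph correlation analysis, and examines how effective each class is against different cybersecurity challenges. It also summarises the challenges facing log anomaly detection and discusses possible future research directions in data processing, algorithm optimisation, and the application of emerging technologies, providing important theoretical support and practical guidance for network security professionals.
Keywords:  cyberspace security  log analysis  anomaly detection  machine learning  deep learning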
DOI:
Submitted: 2024-05-17  Revised: 2024-07-22
Funding:
A Survey: Researches on Log Anomaly Detection Techniques Based on Machine Learning
ZHAO Changzhi1, LI Yunpeng1, YANG Guang2, HUANG Kezhen3, CAO Yaqin1, LIU Yuling1
(1.Institute of Information Engineering, CAS;2.Southwest University of Science and Technology;3.Institute of Software, CAS)
Abstract:
With the increasing number and scale of network information systems, cyberspace faces a growing number of security threats. These threats include not just typical data leaks but also advanced and complex cyberattacks, whose diversity, complexity, and covertness are continually rising, resulting in a severe imbalance between attack and defence in cyberspace. To overcome this challenge, it has become critical to improve defensive capabilities and establish effective response strategies. When a network attack occurs, it typically leaves traces in the logs, and analysing the log data can effectively identify these anomalous behaviours and potential security threats. As a result, anomaly detection on log data is not only an effective method of network security protection but has also become an important research topic in the field of network security. Machine learning techniques have in recent years increased the capacity of log anomaly detection approaches to uncover deep anomalous clues, but there are still significant barriers to taking such methods from research to practical deployment. This review therefore comprehensively surveys log anomaly detection technology based on machine learning and deep learning, divides the detection process into four key phases (log collection, log parsing, feature representation, and anomaly detection), and analyses the key technologies used and the challenges encountered in each phase. The paper focuses on the detection algorithms and, from the perspective of applicable scenarios and underlying data, classifies existing work into three categories: anomaly detection algorithms based on discrete analysis, sequence correlation analysis, and graph correlation analysis. It also explores the effectiveness of each category in dealing with different cybersecurity challenges. Furthermore, the study summarises the challenges faced by log anomaly detection and discusses prospective future research directions in data processing, algorithm optimisation, and the use of emerging technologies. It offers important theoretical support and practical guidance to network security professionals.
Key words:  cyber security  log analysis  anomaly detection  machine learning  deep learning