Cite this article: |
- Ji Fei, Lei Zhengchao, Wu Yuemei, Zhang Hang, Wei Dong, Huang Weiqing. Open-Set Recognition Technology for Air Interface Traffic in Mobile Communication Based on Physical Layer Traffic Metadata[J]. Journal of Cyber Security, accepted.
|
|
|
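Open-Set Recognition Technology Based on Physical-Layer Traffic Metadata of the Mobile Communication Air Interface |
Ji Fei1,2, Lei Zhengchao3, Wu Yuemei4, Zhang Hang1,2, Wei Dong1,2, Huang Weiqing1,2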
|
|
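(1. Institute of Information Engineering, Chinese Academy of Sciences; 2. School of Cyber Security, University of Chinese Academy of Sciences; 3. National Computer Network Emergency Response Technical Team/Coordination Center of China; 4. Aerospace Long March Rocket Technology Co., Ltd. (航天长征火箭技术有限公司)) |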
|
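Abstract: |
Mobile communication traffic analysis collects the traffic data exchanged between users and the network and applies feature mining and pattern analysis to sense network state and user behavior. In non-cooperative scenarios, however, network-layer traffic is hard to capture and link-layer traffic is costly to parse, which makes obtaining effective information for mobile traffic analysis very challenging. Moreover, data collection in practice can rarely cover every traffic category, so traditional closed-set classifiers are significantly more likely to misclassify unknown categories as known ones; traffic analysis therefore faces an open-set recognition challenge. To address these problems, this paper proposes an open-set recognition technique based on mobile communication air interface traffic metadata, which moves the traffic data source from the traditional network and link layers to the physical layer and is able to identify unknown traffic. First, for acquiring physical-layer air interface traffic data, the paper proposes a physical-layer traffic metadata representation method that extracts traffic metadata through signal time-frequency analysis and filtering, without parsing the protocol. Second, for constructing the traffic feature space under the open-set assumption, the paper proposes a traffic encoding scheme based on a triplet network; it introduces metric learning to adaptively learn an encoding feature space in which samples of the same class cluster tightly and the boundaries between classes are clear. Combined with a data-augmentation proxy task, the scheme effectively improves the model's robustness to noise and to locally missing data. In addition, a random negative-class sampling strategy greatly reduces training cost. Finally, to break the dependence of traditional distance-based schemes on the shape of cluster boundaries, the paper proposes a density-based encoding classification scheme for open-set traffic recognition, which classifies boundary samples better. The proposed technique is evaluated extensively on a traffic dataset covering 22 classes of apps, and information entropy is used to verify quantitatively that a mapping exists between link-layer and physical-layer traffic metadata. The experimental results show that the proposed scheme effectively lowers the difficulty of acquiring metadata in air interface traffic analysis while providing both target-class recognition and unknown-class detection. Compared with traditional classification network architectures, the triplet-network encoding and recognition architecture effectively solves the open-set recognition problem and better matches the needs of real-world applications. |
Keywords: air interface traffic analysis; triplet network; deep learning; open-set recognition |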
DOI: |
Submitted: 2025-03-10  Revised: 2025-06-11 |
Funding: National Key Research and Development Program of China |
|
Open-Set Recognition Technology for Air Interface Traffic in Mobile Communication Based on Physical Layer Traffic Metadata |
Ji Fei1,2, Lei Zhengchao3, Wu Yuemei4, Zhang Hang1,2, Wei Dong1,2, Huang Weiqing1,2
|
(1. Institute of Information Engineering, CAS; 2. School of Cyber Security, University of Chinese Academy of Sciences; 3. National Computer Network Emergency Response Technical Team/Coordination Center of China; 4. Beijing Institute of Telemetry Technology) |
Abstract: |
Mobile communication traffic analysis techniques collect traffic data from user-network interactions and employ feature mining and pattern analysis to perceive network conditions and user behaviors. However, in non-cooperative scenarios, network-layer traffic is difficult to capture and data-link-layer traffic is highly complex to parse, which makes obtaining effective information for mobile traffic analysis challenging. Meanwhile, in practical applications, data collection cannot cover all traffic categories. As a result, traditional closed-set classification models are much more likely to misclassify unknown categories as known ones, so traffic analysis methods face an open-set recognition challenge. To address these issues, this paper proposes an open-set recognition technology for mobile traffic analysis based on the metadata of mobile communication air interface traffic. It shifts the traffic data source from the traditional network and link layers to the physical layer and is able to identify unknown traffic. First, to obtain physical-layer air interface traffic data, this paper proposes a representation method for physical-layer traffic metadata: without parsing the protocol, traffic metadata can be extracted through signal time-frequency analysis and filtering. Second, to build a traffic feature space under the open-set assumption, a traffic encoding scheme based on triplet networks is proposed. Drawing on metric learning, it adaptively learns an encoded feature space in which intra-class features are highly compact and inter-class boundaries are clear. Combined with a data-augmentation proxy task, the scheme improves the model's robustness to noise and to locally missing data. Moreover, a random negative sampling strategy significantly reduces model training cost. Finally, to overcome the dependence of traditional distance-based methods on cluster boundary shapes, a density-based encoding classification scheme is proposed for open-set recognition, offering better classification of boundary samples. To evaluate the proposed technology, we conduct extensive experiments on a traffic dataset comprising 22 categories of apps, and we use information entropy to verify quantitatively that a mapping exists between data-link-layer and physical-layer traffic metadata. The results show that the scheme reduces the difficulty of acquiring metadata in air interface traffic analysis and can both identify known classes and detect unknown ones. Compared with traditional classification network architectures, the triplet-network-based encoding and recognition architecture effectively addresses the open-set recognition task and better meets real-world application needs. |
Key words: Air interface traffic analysis; Triplet network; Deep learning; Open-set recognition |
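
The abstract names three building blocks (triplet-based metric learning, random negative sampling, and density-based open-set classification) without giving implementation details. The following is a minimal illustrative sketch of how such pieces might fit together, not the authors' method. Everything specific here is an assumption: the Euclidean distance, the `margin` value, the function names, and the DBSCAN-style rule that rejects a query as unknown (label `-1`) when fewer than `k` training embeddings lie within `radius` of it.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on Euclidean distances, averaged over the batch.

    Assumed form: max(d(a, p) - d(a, n) + margin, 0).
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def sample_triplets(labels, rng):
    """Build (anchor, positive, negative) index triplets.

    The negative is drawn uniformly at random from the other classes
    (random negative sampling), so no hard-negative search is needed.
    """
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    triplets = []
    for i in idx:
        pos_pool = idx[(labels == labels[i]) & (idx != i)]
        neg_pool = idx[labels != labels[i]]
        if len(pos_pool) == 0 or len(neg_pool) == 0:
            continue  # no valid triplet for this anchor
        triplets.append((i, rng.choice(pos_pool), rng.choice(neg_pool)))
    return np.array(triplets)


def density_open_set_predict(train_emb, train_labels, query_emb, k=5, radius=1.0):
    """Density-based open-set decision (hypothetical rule).

    A query counts as lying in a dense known region if at least k training
    embeddings fall within `radius`; it then takes the majority label of
    those neighbours. Otherwise it is rejected as unknown (-1).
    """
    train_labels = np.asarray(train_labels)
    preds = []
    for q in query_emb:
        dist = np.linalg.norm(train_emb - q, axis=1)
        near = train_labels[dist <= radius]
        if len(near) < k:
            preds.append(-1)
        else:
            values, counts = np.unique(near, return_counts=True)
            preds.append(int(values[np.argmax(counts)]))
    return np.array(preds)
```

In a real pipeline the embeddings would come from the trained encoder; here any array of shape `(n_samples, n_features)` can stand in for them. The neighbour-count test depends only on local density around the query, which is why it does not presuppose a particular cluster boundary shape.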