HiveAttacker: A Two-stage Security Detecting Approach for Apache Hive

Li Wenchao; Li Feng; Bo Defang; Zhou Jianhua; Huo Wei

Cite this article:

Li Wenchao, Li Feng, Bo Defang, Zhou Jianhua, Huo Wei. HiveAttacker: A Two-stage Security Detecting Approach for Apache Hive[J]. Journal of Cyber Security, Accepted

Li Wenchao¹, Li Feng², Bo Defang¹, Zhou Jianhua², Huo Wei²
(1. University of Chinese Academy of Sciences; 2. Institute of Information Engineering, Chinese Academy of Sciences)

Abstract:

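The enormous value contained in big data makes it one of the primary targets of current cyber attacks. However, data warehouses and big data processing engines such as Hive, together with the distributed processing platforms they rely on, have long prioritized high availability and high scalability over security, which leaves the storage and processing of big data exposed to security risks. Taking the Hive data warehouse and query engine on the Hadoop platform as a starting point, this paper identifies two main attack surfaces that Hive faces: one during query parsing, and one during its interaction with the Hadoop platform or other third-party components. It then presents a two-stage security detection scheme designed for these surfaces. The first stage targets the attack surface introduced when Hive receives and parses user queries; it extends traditional fuzzing with Hive-specific customizations and focuses on finding vulnerabilities in Hive's own code that could lead to privilege escalation, authorization bypass, and similar exploitation effects. The second stage targets the attack surface introduced by Hive's interaction with other components; it focuses on detecting vulnerabilities that may be triggered through inter-component interaction and raises alerts for them. HiveAttacker, a prototype tool implementing this scheme, found 8 vulnerabilities across two historical versions and the latest version of Hive, including 2 that remain unfixed in the latest version, and detected 7 security threats introduced by component interaction in a real Hive runtime environment set up for the evaluation, which validates the effectiveness of the scheme.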

Keywords: Apache Hive; fuzzing; vulnerability detection

DOI：10.19363/J.cnki.cn10-1380/tn.2023.06.13

Received: 2020-12-07; Revised: 2021-02-09

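Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program); Strategic Priority Research Program of the Chinese Academy of Sciences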

HiveAttacker: A Two-stage Security Detecting Approach for Apache Hive

Li Wenchao¹, Li Feng², Bo Defang¹, Zhou Jianhua², Huo Wei²

(1.University of Chinese Academy of Sciences;2.Institute of Information Engineering, Chinese Academy of Sciences)

Abstract:

Big data has immense value, which makes it one of the major targets of cyber attacks. However, data warehouses and big data processing engines such as Hive, along with the distributed processing platforms they rely on, have long focused on service availability and scalability while neglecting security, which exposes the storage and processing of big data to security risks. Taking the Hive data warehouse and query engine on the Hadoop platform as our starting point, we identify two main attack surfaces that Hive faces: (1) the query compilation process and (2) the interaction with the Hadoop platform or other third-party components. We then design a two-stage security detecting approach. In the first stage, we customize and extend traditional fuzzing to detect vulnerabilities in the Hive source code that may lead to privilege escalation, authorization bypass, and similar effects. In the second stage, we focus on detecting, and alerting on, vulnerabilities that may be triggered by Hive's interactions with other components. We implement a prototype tool, HiveAttacker, based on this approach. It found a total of 8 authorization vulnerabilities across two historical versions and the latest version of Hive, including 2 unfixed bugs in the latest version, and detected 7 security threats resulting from component interactions in a real Hive deployment, which verifies the effectiveness of the approach.

Key words: Apache Hive; fuzzing; vulnerability detection