Cite this article:
- CHEN Yaqing, YE Yutong, ZHANG Min, SHU Bowen. SDLDP: A Local Differential Private Framework for Multi-level Sensitive Data[J]. Journal of Cyber Security, accepted.
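Abstract:
In today's era of big data, people generate data in their daily lives on an unprecedented scale. Analysis and applications built on user data strongly support the development of many industries, but they also raise public concern about privacy leakage. The local differential privacy (LDP) model is commonly used to protect users' private data in statistical tasks: it reduces the risk of privacy leakage by adding random noise to the true data. However, the high utility of the LDP model depends on large-scale data and a relatively high privacy budget, and a better balance between privacy and utility remains to be found. Based on the observation that data values naturally carry different sensitivity levels, this paper proposes SDLDP, a local differential privacy framework that supports data sensitivity grading. By providing different degrees of privacy protection for different data values, the framework selectively reduces the amount of LDP noise added to low-sensitivity data and thereby achieves higher data utility. The paper further proposes two mechanisms based on this framework: SDGRR and SDPM. SDGRR optimizes GRR, the classical discrete perturbation mechanism in LDP, and is suited to frequency estimation tasks. SDPM optimizes PM, a continuous perturbation mechanism in LDP, and, after post-processing with the EM algorithm, estimates the data mean efficiently. Experimental results show that, compared with the original LDP mechanisms, the two proposed mechanisms significantly improve the accuracy of frequency estimation and mean estimation.
Keywords: local differential privacy; privacy protection; mean estimation; frequency estimation; EM algorithm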
DOI:
Submitted: 2023-10-24; Revised: 2024-01-17
Funding: National Key Research and Development Program of China
SDLDP: A Local Differential Private Framework for Multi-level Sensitive Data
CHEN Yaqing, YE Yutong, ZHANG Min, SHU Bowen
(Institute of Software, Chinese Academy of Sciences)
Abstract:
In the era of big data, the public generates data in daily life on an unprecedented scale. Analysis and applications based on users' data have supported the development of various industries, but they have also raised public concerns about privacy violations. The local differential privacy model is commonly used in statistical tasks to protect users' private data; it reduces the risk of privacy leakage by adding random noise to the real data. However, the high utility of the local differential privacy model relies on large-scale data and a high privacy budget, so a better balance between privacy and utility remains to be explored. In this paper, we exploit the property that data values have different sensitivity levels and propose SDLDP, a novel local differential privacy framework that supports data with different sensitivity levels. The basic idea of the framework is to provide different levels of privacy protection for data with different sensitivity levels and to reduce the amount of noise added to less sensitive data, thereby improving the utility of the noisy data. In addition, we propose two mechanisms based on this framework: SDGRR and SDPM. SDGRR optimizes GRR, the classical discrete perturbation mechanism in local differential privacy, and is applied to frequency estimation tasks. SDPM optimizes PM, the continuous perturbation mechanism in local differential privacy, and its output is post-processed by the EM algorithm to estimate mean values efficiently. The experimental results demonstrate that, compared with state-of-the-art mechanisms based on the local differential privacy model, SDGRR and SDPM significantly improve the accuracy of frequency estimation and mean estimation, respectively.
Key words: local differential privacy; privacy protection; mean estimation; frequency estimation; EM algorithm
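
For readers unfamiliar with the baseline that SDGRR builds on, the following is a minimal sketch of standard generalized randomized response (GRR) with its unbiased frequency estimator. It is not the SDGRR mechanism, whose details the abstract does not give. The function names, the example domain, and the choice of epsilon are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng=random):
    """Standard GRR (baseline, not SDGRR): keep the true value with
    probability p = e^eps / (e^eps + d - 1), otherwise report one of the
    other d - 1 values uniformly at random."""
    d = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + d - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate_frequencies(reports, domain, epsilon):
    """Unbiased frequency estimate from GRR reports:
    f_hat(v) = (observed_fraction(v) - q) / (p - q), with q = 1 / (e^eps + d - 1)."""
    d = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + d - 1)
    q = 1.0 / (e + d - 1)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    # Hypothetical example data: four categories with skewed true frequencies.
    rng = random.Random(0)
    domain = ["A", "B", "C", "D"]
    true_values = ["A"] * 6000 + ["B"] * 3000 + ["C"] * 1000
    reports = [grr_perturb(v, domain, 1.0, rng) for v in true_values]
    print(grr_estimate_frequencies(reports, domain, 1.0))
```

In this baseline every value receives the same privacy budget, so low-sensitivity values are perturbed as heavily as high-sensitivity ones. According to the abstract, this uniform treatment is what SDLDP relaxes.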