Privacy-Enhanced Machine Learning Scheme for PATE

Guo Xian; Zheng Kai; Xue Jingjin; Wang Diandong

Cite this article:

Guo Xian, Zheng Kai, Xue Jingjin, Wang Diandong. Privacy enhanced machine learning scheme for PATE[J]. Journal of Cyber Security, accepted.

(Lanzhou University of Technology)

Abstract:

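Private Aggregation of Teacher Ensembles (PATE) is a general-purpose privacy protection framework for machine learning. Although it gives training data a degree of privacy guarantee, it also brings security and privacy risks of its own. To give PATE stronger security and privacy properties, this paper proposes a privacy-enhanced machine learning scheme for PATE. First, the scheme designs a batch zero-knowledge proof algorithm to ensure the legitimate identity of all teacher nodes, uses interactive zero-knowledge proofs for mutual authentication between the aggregator and the student node to improve verification efficiency, and incorporates homomorphic encryption so that the aggregator can compute directly on ciphertexts while the teacher nodes' data stay private. Next, the scheme applies principal component analysis to extract key vectors from the predictions submitted by the teacher nodes; using the resulting vector distribution plot, the aggregator can filter out the teacher nodes that may be poisoning the results, which prevents poisoning attacks. Then, a multi-key homomorphic encryption algorithm suited to the PATE structure is designed. In this algorithm, teacher nodes no longer encrypt with a public key generated by the student node; instead, they encrypt with an aggregated public key and each decrypts its own share of the aggregated ciphertext, which counters possible collusion between teacher nodes and the student node. Finally, a verifiable oblivious transfer algorithm, tailored to the interaction between the aggregator and the student node, is designed to protect the student node's information privacy. Security analysis shows that the scheme ensures the legitimate identity of all participants, protects the privacy of teacher and student nodes, and resists collusion among up to N-1 teacher nodes. Experimental analysis shows that the scheme resists poisoning attacks and achieves higher computational efficiency and lower communication cost than existing schemes.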
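To make the "aggregator computes directly on ciphertexts" step concrete, here is a minimal sketch of additively homomorphic vote aggregation. The abstract does not name the underlying cryptosystem, so the sketch assumes a toy Paillier scheme with tiny hard-coded primes (insecure, for illustration only). All function names (`keygen`, `encrypt`, `add`, `decrypt`, `aggregate_votes`) are hypothetical. It shows only the single-key setting in which one party holds the decryption key, not the multi-key algorithm the paper proposes.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (assumed values, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def aggregate_votes(n, encrypted_votes):
    """Sum encrypted one-hot votes class by class without decrypting them."""
    totals = list(encrypted_votes[0])
    for vote in encrypted_votes[1:]:
        totals = [add(n, t, c) for t, c in zip(totals, vote)]
    return totals

# Four teachers vote over three classes; each vote is a one-hot vector.
n, sk = keygen()
plain_votes = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]
encrypted = [[encrypt(n, v) for v in vote] for vote in plain_votes]
counts = [decrypt(n, sk, c) for c in aggregate_votes(n, encrypted)]
print(counts)  # [3, 1, 0]
```

The aggregator only ever multiplies ciphertexts; whoever holds the secret key sees the per-class totals but not the individual votes. A real deployment would use large primes and a vetted cryptographic library rather than this hand-rolled toy.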

Keywords: machine learning; PATE; privacy protection; secure multi-party computation; multi-key

DOI：

Received: 2024-03-28; Revised: 2024-06-06

Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)

Privacy enhanced machine learning scheme for PATE

Guo Xian¹, Zheng Kai², Xue Jingjin³, Wang Diandong¹

(1. Lanzhou University of Technology; 2. Lanzhou University of Technology; 3. Lanzhou University of Technology)

Abstract:

The Private Aggregation of Teacher Ensembles (PATE) is a generic framework for privacy protection in machine learning. While it provides a degree of privacy assurance for training data, it also introduces certain security and privacy risks. To strengthen the security and privacy properties of PATE, a privacy-enhanced machine learning scheme is proposed. First, a batch zero-knowledge proof algorithm is designed to ensure the legitimate identity of all teacher nodes, and interactive zero-knowledge proofs are used for mutual authentication between the aggregator and the student node to improve verification efficiency. Homomorphic encryption is also integrated so that the aggregator can operate directly on ciphertexts while the teacher nodes' data remain private. Next, the scheme uses principal component analysis to extract key vectors from the prediction results submitted by teacher nodes; the aggregator can then use the vector distribution plot to filter out teacher nodes that may be poisoning the results, preventing poisoning attacks. Subsequently, a multi-key homomorphic encryption algorithm suitable for the PATE structure is designed. In this algorithm, a teacher node no longer encrypts with the public key generated by the student node; instead, it encrypts with the aggregated public key and decrypts its own share of the aggregated ciphertext, which addresses possible collusion between teacher nodes and the student node. Finally, a verifiable oblivious transfer algorithm is designed, based on the characteristics of the interaction between the aggregator and the student node, to protect the information privacy of the student node. Security analysis shows that the scheme ensures the legitimate identities of all participants, protects the privacy of teacher and student nodes, and resists collusion attacks among up to N-1 teacher nodes. Experimental analysis shows that the scheme resists poisoning attacks and has higher computational efficiency and lower communication cost than existing schemes.
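The PCA-based poisoning filter can be sketched as follows. This is only an illustration under stated assumptions. The data are synthetic, each teacher is assumed to submit a vector of confidence scores over 20 queries, and the paper's visual inspection of the distribution plot is replaced by a crude distance rule: flag any teacher whose projected point lies more than ten times the median distance from the centre. The helper names `pca_project` and `flag_outliers` and the factor 10 are hypothetical, not taken from the paper.

```python
import numpy as np

def pca_project(X, k=2):
    """Project the rows of X onto their first k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def flag_outliers(points, factor=10.0):
    """Return indices of points unusually far from the coordinate-wise median."""
    center = np.median(points, axis=0)
    dist = np.linalg.norm(points - center, axis=1)
    return np.where(dist > factor * np.median(dist))[0]

# Synthetic submissions: 8 honest teachers agree up to small noise,
# 2 poisoned teachers (rows 8 and 9) submit inverted scores.
rng = np.random.default_rng(0)
truth = rng.random(20)
honest = truth + rng.normal(0.0, 0.01, size=(8, 20))
poisoned = (1.0 - truth) + rng.normal(0.0, 0.01, size=(2, 20))
X = np.vstack([honest, poisoned])

flagged = flag_outliers(pca_project(X))
print(flagged)
```

In this toy setting the honest teachers form a tight cluster in the two-dimensional projection while the poisoned ones land far away, so the distance rule separates them. Real prediction data would need a threshold chosen from the observed distribution, and this rule assumes that honest teachers are the majority.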

Key words: machine learning; PATE; privacy protection; multi-party computation; multi-key