| DOI: |
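| Submitted: 2025-09-25; Revised: 2025-12-31 |
| Funding: National Natural Science Foundation of China (62461032); Gansu Provincial Science and Technology Program (22JR5RA158, 22JR5RA350); Innovation Fund for University Teachers of Gansu Province (2023A-041, 2023-ZD-234); Lanzhou Jiaotong University-Tianjin University Joint Innovation Fund (LH2024003) |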
|
| A Fair Privacy Verifiable Grouping Aggregation Framework for Medical Federated Learning |
|
Li Yahong1, Jia YueQi1, Yang Xiaodong2, Niu Shufen2
|
| (1. Lanzhou Jiaotong University; 2. Northwest Normal University) |
| Abstract: |
| To address privacy protection, fair aggregation, verifiability, and dynamic user dropout in medical federated learning, this paper proposes a privacy-preserving, verifiable aggregation scheme that combines intra-group secure aggregation with inter-group fair aggregation, and introduces a dynamic threshold adjustment strategy and a mask recovery mechanism. First, in the intra-group phase, dynamic-threshold Shamir secret sharing is combined with Diffie-Hellman key agreement so that gradient masks can be exchanged and cancelled while users are free to drop out, which significantly reduces the impact of dropout on model performance. By adjusting the secret-sharing threshold dynamically, intra-group aggregation copes with uncertain participation and completes securely even when some users are offline. In the inter-group phase, an improved ElGamal additively homomorphic encryption algorithm encrypts the data-quality weights, and a multi-dimensional data-quality evaluation system allocates those weights, ensuring both data privacy and fairness of contribution. This weighting rewards high-quality data contributions while suppressing the negative impact of low-quality or malicious data on the global model. Second, a lightweight homomorphic hash verification mechanism guarantees that aggregation results are verifiable, and sampling-based verification reduces users' verification cost. In addition, the paper constructs a system model comprising an aggregation server, group servers, users, and a trusted authority, and specifies the role and security assumptions of each entity, providing a clear framework for practical deployment.
Finally, experiments on the HAM10000 skin lesion dataset and the CIFAR-10 general image dataset show that, compared with traditional Federated Averaging and several recent privacy-preserving or fairness-oriented schemes, the proposed scheme converges more reliably, achieves substantially higher accuracy, remains stable as the user dropout rate increases, and outperforms comparable schemes in communication rounds and convergence speed, which makes it particularly suitable for highly heterogeneous medical data. The scheme also balances security against computational and communication efficiency. Its grouping strategy and sampling-based verification significantly reduce system overhead in large-scale medical federated learning, offering a secure, fair, efficient, and verifiable solution for cross-institutional collaborative learning. |
| Key words: Federated Learning; Privacy Protection; Fair Aggregation; Dynamic User Dropout; Secret Sharing; Homomorphic Encryption; Weight Allocation; Sampling Verification |
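
The mask exchange and cancellation that the abstract describes for the intra-group phase can be illustrated with a minimal Python sketch. It is not the paper's implementation: the modulus `P`, base `G`, ring size `MOD`, and all function names are assumptions chosen for illustration, the parameters are far too small for real use, and the dynamic-threshold Shamir recovery of masks for dropped users is not shown. Each pair of users derives the same mask from a Diffie-Hellman shared secret; one adds it and the other subtracts it, so the masks vanish when the group server sums the masked gradients.

```python
import hashlib
import random

# Toy parameters, assumed for illustration only (not secure, not from the paper).
P = 2**127 - 1   # prime modulus for the Diffie-Hellman exchange
G = 3            # assumed base
MOD = 2**32      # masked gradient entries live in the integers mod MOD


def keygen(rng):
    """Return a (secret key, public key) pair for one user."""
    sk = rng.randrange(2, P - 1)
    return sk, pow(G, sk, P)


def pairwise_mask(shared_secret, dim):
    """Expand a DH shared secret into a deterministic mask vector."""
    digest = hashlib.sha256(str(shared_secret).encode()).digest()
    prg = random.Random(int.from_bytes(digest, "big"))
    return [prg.randrange(MOD) for _ in range(dim)]


def mask_gradient(uid, grad, sk, public_keys):
    """Add +mask toward higher user ids and -mask toward lower ones."""
    masked = list(grad)
    for vid, pk in public_keys.items():
        if vid == uid:
            continue
        # Both users of a pair compute the same secret: pk_v^sk_u == pk_u^sk_v mod P.
        mask = pairwise_mask(pow(pk, sk, P), len(grad))
        sign = 1 if uid < vid else -1
        masked = [(m + sign * r) % MOD for m, r in zip(masked, mask)]
    return masked


def aggregate(masked_grads):
    """Sum masked gradients; pairwise masks cancel when all users report."""
    return [sum(col) % MOD for col in zip(*masked_grads)]
```

With three users holding integer-encoded gradients, the server sees only masked vectors yet recovers the exact sum as long as every user reports. If a user drops out, its pairwise masks no longer cancel, which is the situation the scheme's mask recovery mechanism is meant to handle.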