Cite this article
  • Zhou Yinchang, Xu Haixia, Wang Mingsheng, Liao Huimei, Tang Jinling. CryptFormer: Empowering Efficient and Secure Inference for Large-scale Transformers[J]. Journal of Cyber Security, accepted.


CryptFormer: Empowering Efficient and Secure Inference for Large-scale Transformers
Zhou Yinchang, Xu Haixia, Wang Mingsheng, Liao Huimei, Tang Jinling
(Institute of Information Engineering, Chinese Academy of Sciences)
Abstract:
Large Language Models (LLMs) based on Transformers have achieved significant success in various tasks within the fields of Natural Language Processing (NLP) and Computer Vision (CV). Some existing solutions utilize cryptographic primitives to protect the data privacy of users and servers in LLM inference services. However, these works exhibit high latency and communication overhead when applied to LLMs, which rely on extensive high-dimensional matrix multiplication and complex non-linear activation functions. To address this challenge, we propose the CryptFormer framework based on several novel protocols, aiming to enable fast and accurate secure inference for large Transformers. Specifically, we develop two computation-efficient and communication-friendly two-party (2PC) protocols for large-scale matrix multiplication, based on homomorphic encryption (HE) and vector oblivious linear evaluation (VOLE), respectively. We also design accurate and efficient protocols for complex activation functions by approximating them with optimal piecewise polynomials. End-to-end secure inference experiments on various datasets with different LLMs demonstrate that CryptFormer achieves a 4.7x to 8.3x inference speedup over Iron (NeurIPS'22) and reduces communication overhead by 75% to 89%.
Key words:  large-scale Transformers  secure inference  matrix multiplication  secure computation of non-linear functions
DOI:
Received: 2024-11-15; Revised: 2025-03-05
Funding: National Key Research and Development Program of China (No. 2020YFA0712303) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0690200)
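
For readers unfamiliar with secure two-party matrix multiplication, the following is a minimal plaintext simulation of a generic secret-shared matrix product using a Beaver-triple-style dealer over a small prime field. It only illustrates the building block such protocols realize; the modulus, the dealer, and all function names are assumptions made for this sketch, and it does not reproduce CryptFormer's HE- or VOLE-based protocols or their communication pattern.

```python
# Toy simulation of secret-shared matrix multiplication with a Beaver-triple dealer.
# NOT CryptFormer's protocol: the modulus, the trusted dealer, and the API are
# illustrative assumptions; no actual network communication is modeled.
import numpy as np

P = 32749  # assumed prime modulus (largest prime below 2**15), keeps int64 arithmetic safe
rng = np.random.default_rng(0)

def share(x):
    """Split a matrix into two additive shares modulo P."""
    r = rng.integers(0, P, size=x.shape, dtype=np.int64)
    return (x - r) % P, r

def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P

def beaver_matmul(x0, x1, y0, y1, m, k, n):
    """Multiply secret-shared X (m x k) and Y (k x n) using one dealer triple."""
    # Dealer phase: random A, B and their product C = A @ B are shared out.
    a = rng.integers(0, P, size=(m, k), dtype=np.int64)
    b = rng.integers(0, P, size=(k, n), dtype=np.int64)
    c = (a @ b) % P
    a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)
    # Online phase: both parties open the masked values E = X - A and F = Y - B.
    e = reconstruct((x0 - a0) % P, (x1 - a1) % P)
    f = reconstruct((y0 - b0) % P, (y1 - b1) % P)
    # Local computation of output shares (only party 0 adds the public E @ F term).
    z0 = (c0 + e @ b0 + a0 @ f + e @ f) % P
    z1 = (c1 + e @ b1 + a1 @ f) % P
    return z0, z1

if __name__ == "__main__":
    m, k, n = 4, 6, 5
    x = rng.integers(0, P, size=(m, k), dtype=np.int64)
    y = rng.integers(0, P, size=(k, n), dtype=np.int64)
    x0, x1 = share(x); y0, y1 = share(y)
    z0, z1 = beaver_matmul(x0, x1, y0, y1, m, k, n)
    assert np.array_equal(reconstruct(z0, z1), (x @ y) % P)
    print("secret-shared matmul matches the plaintext result")
```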
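
Similarly, the piecewise-polynomial idea behind the non-linear protocols can be illustrated in plaintext: fit a low-degree polynomial to an activation function on each of a few segments and evaluate the matching piece at inference time. The segment boundaries, polynomial degree, and least-squares fitting below are illustrative assumptions, not the optimal approximation constructed in the paper.

```python
# Minimal sketch of piecewise polynomial approximation of GELU (plaintext only).
# Segment boundaries, degree, and the least-squares criterion are assumptions
# for illustration; CryptFormer's optimal approximation is not reproduced here.
import math
import numpy as np

def gelu(x: np.ndarray) -> np.ndarray:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    erf = np.vectorize(math.erf)
    return 0.5 * x * (1.0 + erf(x / math.sqrt(2.0)))

# Assumed segments: GELU is nearly 0 for x << 0 and nearly x for x >> 0,
# so only a bounded middle range needs polynomial fits.
SEGMENTS = [(-5.0, -1.0), (-1.0, 1.0), (1.0, 5.0)]
DEGREE = 4

def fit_piecewise(degree: int = DEGREE):
    """Fit one least-squares polynomial per segment on a dense grid."""
    polys = []
    for lo, hi in SEGMENTS:
        xs = np.linspace(lo, hi, 2000)
        coeffs = np.polyfit(xs, gelu(xs), degree)
        polys.append((lo, hi, np.poly1d(coeffs)))
    return polys

def approx_gelu(x: np.ndarray, polys) -> np.ndarray:
    """Evaluate the piecewise approximation; use 0 / identity outside the fitted range."""
    y = np.where(x <= SEGMENTS[0][0], 0.0, np.where(x >= SEGMENTS[-1][1], x, 0.0))
    for lo, hi, p in polys:
        mask = (x > lo) & (x <= hi)
        y = np.where(mask, p(x), y)
    return y

if __name__ == "__main__":
    polys = fit_piecewise()
    xs = np.linspace(-6.0, 6.0, 10001)
    err = np.max(np.abs(approx_gelu(xs, polys) - gelu(xs)))
    print(f"max abs error on [-6, 6]: {err:.2e}")
```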