Cite this article
  • MU Jianan, ZHAO Yixuan, YAN Han, SONG Jinfeng, YE Jing, LI Huawei, LI Xiaowei. Optimization Space Exploration of Hardware Design for CRYSTAL-KYBER[J]. Journal of Cyber Security, 2021, 6(6): 51-63
DOI:10.19363/J.cnki.cn10-1380/tn.2021.11.05
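Received: 2021-09-03; Revised: 2021-10-09
Funding: This work was supported by the Joint Fund for Regional Innovation and Development of the National Natural Science Foundation of China (NSFC) (No. U20A20202).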
Optimization Space Exploration of Hardware Design for CRYSTAL-KYBER
MU Jianan1,2, ZHAO Yixuan1,2, YAN Han1,2, SONG Jinfeng1,2, YE Jing1,2, LI Huawei1,2, LI Xiaowei1,2
(1.State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100094, China;2.School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China)
Abstract:
Public key cryptography plays a vital role in the security of today's global digital information systems. However, with the development of quantum computing and the emergence of Shor's algorithm, the security of public key cryptography faces a potentially severe threat. Cryptographic algorithms that can resist an adversary with access to a quantum computer have therefore begun to attract the attention of the cryptography community, and the National Institute of Standards and Technology (NIST) has launched a global call for post-quantum cryptography (PQC) standard algorithms. Among the candidates, lattice-based schemes achieve a good trade-off among security, key size, and operation speed, which makes them the most promising family of post-quantum encryption schemes. CRYSTALS-KYBER, a lattice-based key encapsulation mechanism (KEM), has passed three rounds of selection in this process. For post-quantum cryptographic algorithms, hardware implementation efficiency is an important evaluation criterion. This paper therefore uses high-level synthesis (HLS) tools to explore the implementation and optimization space of hardware designs for the three main modules of CRYSTALS-KYBER (key generation, key encapsulation, and key decapsulation) under different parameter sets. As a high-level hardware design method, HLS makes it possible to explore hardware implementations of different algorithms quickly and conveniently. This paper uses HLS tools to analyze the software implementation of CRYSTALS-KYBER, tries different combinations of optimization strategies to improve the HLS results, and finally obtains the optimal hardware structure. In addition, this paper provides cooperating Tcl and Perl scripts that automatically search for the best optimization strategy and the corresponding hardware structure.
The experimental results show that moderately optimizing loops and timing constraints can greatly improve the performance of the KYBER hardware generated by HLS. Compared with the state-of-the-art software implementation, the proposed design shows a clear performance advantage. Compared with the state-of-the-art HLS implementation, our optimizations for Kyber-512 improve performance by up to 75% for key encapsulation and 55.1% for key decapsulation. Compared with the baseline, the performance of key generation improves by 44.2%. Similar improvements are obtained for the other two parameter sets (Kyber-768 and Kyber-1024).
Keywords: public key cryptography; post-quantum cryptography; CRYSTALS-KYBER; high-level synthesis; optimization
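
Illustrative note: the automated strategy search mentioned in the abstract can be pictured as an exhaustive sweep over optimization settings. The following is a minimal Python sketch under stated assumptions, not the authors' Tcl/Perl scripts. The knob names (`UNROLL_FACTORS`, `PIPELINE`, `CLOCK_PERIODS_NS`), the function `toy_cost`, and all numbers in it are hypothetical placeholders; a real flow would replace `toy_cost` with a call that runs the HLS tool and parses its report.

```python
from itertools import product

# Hypothetical knobs; the abstract does not list the actual directive set.
UNROLL_FACTORS = (1, 2, 4)
PIPELINE = (False, True)
CLOCK_PERIODS_NS = (5.0, 10.0)


def toy_cost(unroll, pipeline, clock_ns):
    """Stand-in for one HLS run: returns (latency_ns, area_units).

    Purely illustrative arithmetic, not measured data. A real flow would
    launch the HLS tool here and parse the generated report instead.
    """
    cycles = 1024 // unroll            # unrolling shortens the loop
    if pipeline:
        cycles = cycles // 2 + 8       # overlap iterations, pay a fill cost
    if clock_ns < 10.0:
        cycles += 16                   # tighter clock adds register stages
    area = 100 * unroll + (50 if pipeline else 0)
    return cycles * clock_ns, area


def explore(area_budget):
    """Evaluate every configuration and keep the fastest one within budget.

    Returns (latency_ns, area, unroll, pipeline, clock_ns), or None if no
    configuration fits the area budget.
    """
    best = None
    for unroll, pipeline, clock_ns in product(
        UNROLL_FACTORS, PIPELINE, CLOCK_PERIODS_NS
    ):
        latency, area = toy_cost(unroll, pipeline, clock_ns)
        if area > area_budget:
            continue
        if best is None or latency < best[0]:
            best = (latency, area, unroll, pipeline, clock_ns)
    return best
```

Calling `explore(250)` sweeps all twelve configurations and returns the lowest-latency one whose toy area is at most 250. An exhaustive sweep is used here only because the hypothetical space is tiny; larger directive spaces would likely need pruning or a heuristic search.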