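Abstract:
With the rapid development of quantum computing, traditional public-key cryptosystems face the threat of being broken, and migrating existing encryption technology to quantum-secure post-quantum cryptographic schemes is a current research focus in cryptography. Among existing post-quantum cryptography (PQC) schemes, lattice-based schemes have become the most promising, owing to their security, ease of implementation, and flexibility of use. SHA-3 is one of the core operators in lattice-based schemes, used to generate pseudorandom sequences and to hash critical information, so its implementation performance strongly affects the performance of the overall PQC scheme. Given the large future demand for deploying PQC across many kinds of devices, hardware implementations of SHA-3 face the bottleneck of reconciling high performance with limited resource overhead. To address this, this paper proposes a high-efficiency, high-speed SHA-3 hardware architecture that can be applied to all functions of the SHA-3 family. First, the design simplifies the 64-bit round constants to 7 bits, which reduces both the storage needed for the round constants and the computational complexity. Second, a novel pipeline structure is proposed that divides the critical path more evenly than a conventional pipeline structure. Finally, the novel pipeline structure is combined with unrolling, substantially increasing system throughput. A prototype was implemented on a Xilinx Virtex-6 field-programmable gate array (FPGA). The results show that the designed SHA-3 hardware unit reaches a maximum operating frequency of 459 MHz and an efficiency of 14.71 Mbps/Slice. Compared with existing related designs, the maximum operating frequency is 10.9% higher and the efficiency is 28.2% higher.
Key words: post-quantum cryptography; hash algorithm; hardware implementation; SHA-3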
DOI:10.19363/J.cnki.cn10-1380/tn.2021.11.03 |
Received: August 30, 2021; Revised: October 13, 2021
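Funding: This work was supported by the General Program of the National Natural Science Foundation of China (No. 61874163) and the Key Program of the National Natural Science Foundation of China (No. 62134002).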
|
Design of High-Speed and High-Efficiency SHA-3 Hardware Unit for Post-Quantum Cryptography |
LIU Dongsheng,CHEN Yong,XIONG Siqi,YANG Shuo,HU Ang |
School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074, China |
Abstract:
With the rapid development of quantum computing technology, traditional public-key cryptosystems face the threat of being broken. The transition from existing encryption technology to quantum-secure post-quantum cryptographic schemes is a current research hotspot in cryptography. Among the existing post-quantum cryptography (PQC) schemes, lattice-based schemes have become some of the most promising due to their small public keys, high speed, and good diversity. As a crucial component of lattice-based PQC schemes, Secure Hash Algorithm-3 (SHA-3) serves as both a hash function and an extendable-output function, generating streams of uniformly random numbers that are then sampled into pseudorandom matrices or noise polynomials. Considering the great demand for PQC schemes in diverse future applications, the implementation of SHA-3 faces the challenge of delivering high performance with limited hardware resources. In this paper, an efficient hardware architecture for SHA-3 is presented, which is unrolled by a factor of two, with two inside pipeline registers (IPR) within the transformation round and two output pipeline registers (OPR) between adjacent rounds. The proposed design can be employed in all SHA-3 modes and supports both one-block and multi-block messages. Firstly, based on a detailed analysis of their characteristics, the original 64-bit round constants in the ι operation of Keccak-p[1600,24] are simplified to 7 bits in order to reduce resource consumption. Secondly, a novel pipeline technique inside the transformation round is developed. Compared with the conventional pipeline technique, it divides the critical path more evenly and improves the speed while keeping resource consumption low. Thirdly, a SHA-3 architecture based on unrolling and pipelining is proposed. Unrolling reduces the number of cycles needed per operation. Pipelining raises the clock speed and increases the number of messages that can be hashed in parallel. With this composite architecture, the throughput can be further enhanced. To the best of our knowledge, this paper presents the fastest and most efficient hardware implementation of SHA-3 on a Xilinx Virtex-6 FPGA, with a maximum frequency of 459 MHz and a hardware efficiency (throughput/area) of 14.71 Mbps/Slice. Compared with state-of-the-art related designs, our implementation achieves a 10.9% improvement in maximum frequency and a 28.2% improvement in hardware efficiency.
Key words: post-quantum cryptography; hash algorithm; hardware implementation; SHA-3
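
The reduction of the round constants from 64 bits to 7 bits described in the abstract can be illustrated with a short Python sketch. It relies on a property of the ι step as defined in FIPS 202: only the bit positions 2^j - 1 for j = 0..6 (that is, 0, 1, 3, 7, 15, 31, 63) of a round constant can ever be set, so seven stored bits per round are enough. The function names and the packing order (bit j of the 7-bit word maps to position 2^j - 1) are assumptions made for illustration; the abstract does not specify the encoding used in the actual hardware.

```python
# Bit positions of a 64-bit Keccak round constant that can be non-zero:
# 2**j - 1 for j = 0..6, i.e. 0, 1, 3, 7, 15, 31, 63.
POSITIONS = [2**j - 1 for j in range(7)]


def rc_bit(t):
    """Output bit rc(t) of the 8-bit LFSR that defines the round constants."""
    r = [1, 0, 0, 0, 0, 0, 0, 0]
    for _ in range(t % 255):
        r = [0] + r          # shift in a zero, r now has 9 entries
        r[0] ^= r[8]         # feedback taps at positions 0, 4, 5, 6
        r[4] ^= r[8]
        r[5] ^= r[8]
        r[6] ^= r[8]
        r = r[:8]            # truncate back to 8 entries
    return r[0]


def round_constant(i):
    """Full 64-bit round constant for round index i (0..23)."""
    rc = 0
    for j, pos in enumerate(POSITIONS):
        rc |= rc_bit(j + 7 * i) << pos
    return rc


def compress(rc64):
    """Pack the seven meaningful bits of a round constant into 7 bits."""
    return sum(((rc64 >> pos) & 1) << j for j, pos in enumerate(POSITIONS))


def expand(rc7):
    """Rebuild the 64-bit round constant from its 7-bit form."""
    return sum(((rc7 >> j) & 1) << pos for j, pos in enumerate(POSITIONS))


ROUND_CONSTANTS = [round_constant(i) for i in range(24)]
COMPRESSED = [compress(rc) for rc in ROUND_CONSTANTS]

if __name__ == "__main__":
    for i, (full, small) in enumerate(zip(ROUND_CONSTANTS, COMPRESSED)):
        print(f"round {i:2d}: 0x{full:016X} -> {small:07b}")
```

In hardware, the `expand` step costs no logic in this encoding: each of the seven stored bits is wired directly to its fixed position in the lane, and the remaining 57 positions need no XOR at all, which is consistent with the savings in storage and computation that the abstract reports.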