Cite this article:
  • HUA Yingying, ZHANG Daichi, GE Shiming. Research Progress in the Interpretability of Deep Learning Models[J]. Journal of Cyber Security, 2020, 5(3): 1-12



DOI:10.19363/J.cnki.cn10-1380/tn.2020.05.01
Received: 2020-02-07; Revised: 2020-04-22
Funding: This work was supported by the National Natural Science Foundation of China (No. 61772513).
Research Progress in the Interpretability of Deep Learning Models
HUA Yingying1,2, ZHANG Daichi1,2, GE Shiming1
(1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; 2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China)
Abstract:
Deep learning has succeeded in many areas of artificial intelligence, and the key reason is that complex deep network models can learn a wealth of knowledge from massive data. However, the high internal complexity of deep learning models often makes it difficult for people to understand their decision-making results, which renders the models unexplainable and limits their practical deployment. Therefore, there is an urgent need to improve the interpretability of deep learning models and make them transparent so as to promote the development of artificial intelligence. This paper systematically surveys the research progress in the interpretability of deep learning models. We propose a new categorization of existing interpretability methods from the perspective of their underlying principles. Combining this with the practical applications of interpretability in artificial intelligence, we analyze the problems in current interpretability research and the development trend of explainable artificial intelligence, providing new ideas for comprehensively understanding both the current progress and the future directions of interpretability research.
Key words: deep learning models; interpretability; artificial intelligence