Published: 2020/04/01 00:00 | Author: NICA

The first author of this paper is PhD student Zhiqiang Chen. The paper, titled "Dynamical Channel Pruning by Conditional Accuracy Change for Deep Neural Networks", has been published in IEEE Transactions on Neural Networks and Learning Systems (TNNLS).

The paper's abstract follows:

Channel pruning is an effective technique that has been widely applied to deep neural network compression. However, many existing methods prune from a pretrained model, thus resulting in repetitious pruning and fine-tuning processes. In this article, we propose a dynamical channel pruning method, which prunes unimportant channels at the early stage of training. Rather than utilizing some indirect criteria (e.g., weight norm, absolute weight sum, and reconstruction error) to guide connection or channel pruning, we design criteria directly related to the final accuracy of a network to evaluate the importance of each channel. Specifically, a channelwise gate is designed to randomly enable or disable each channel so that the conditional accuracy changes (CACs) can be estimated under the condition of each channel disabled. Practically, we construct two effective and efficient criteria to dynamically estimate CAC at each iteration of training; thus, unimportant channels can be gradually pruned during the training process. Finally, extensive experiments on multiple data sets (i.e., ImageNet, CIFAR, and MNIST) with various networks (i.e., ResNet, VGG, and MLP) demonstrate that the proposed method effectively reduces the parameters and computations of the baseline network while yielding higher or competitive accuracy. Interestingly, if we Double the initial Channels and then Prune Half (DCPH) of them down to the baseline's counterpart, the network can enjoy a remarkable performance improvement by shaping a more desirable structure.

This paper proposes a dynamical channel pruning method. Unlike most previous pruning work, which starts from a pretrained network and fine-tunes after pruning to preserve performance, this work prunes end to end, removing channels dynamically during training. Whereas earlier pruning methods mostly rely on empirical or indirect criteria to judge the importance of channels or connections, this work establishes a criterion directly tied to final accuracy: by randomly masking each channel during training, it dynamically estimates each channel's effect on final accuracy, determines its importance accordingly, and prunes unimportant channels on the fly. Experimental results show that the method outperforms several classic channel pruning methods. Moreover, it can automatically shape a better network structure and thereby improve performance: with ResNet on the CIFAR dataset, under the same network parameters and computation, our experiments achieved up to a 3.47% performance improvement.
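To make the idea concrete, below is a minimal PyTorch sketch of a channel-wise gate in the spirit of the method: channels are randomly disabled during training, the loss change observed while a channel is off is accumulated as a rough conditional-accuracy-change estimate, and channels whose estimated impact stays small are pruned. The class name `ChannelGate`, the moving-average update, and the threshold rule are illustrative assumptions for this sketch, not the paper's exact CAC criteria.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Channel-wise gate that randomly disables channels during training
    and keeps a running estimate of the loss change observed while each
    channel is off. The update and pruning rules are illustrative, not
    the paper's exact formulas."""

    def __init__(self, num_channels, drop_prob=0.1, momentum=0.99):
        super().__init__()
        self.drop_prob = drop_prob
        self.momentum = momentum
        # 1 keeps a channel for this forward pass, 0 disables it.
        self.register_buffer("mask", torch.ones(num_channels))
        # Running estimate of the loss increase when each channel is disabled.
        self.register_buffer("cac", torch.zeros(num_channels))
        # Channels pruned so far stay off permanently.
        self.register_buffer("alive", torch.ones(num_channels, dtype=torch.bool))

    def forward(self, x):
        # x is an NCHW feature map; gate it channel by channel.
        if self.training:
            keep = (torch.rand_like(self.mask) > self.drop_prob).float()
            self.mask = keep * self.alive.float()
        else:
            self.mask = self.alive.float()
        return x * self.mask.view(1, -1, 1, 1)

    @torch.no_grad()
    def update_cac(self, loss, baseline_loss):
        # Attribute this step's loss increase to the currently disabled
        # channels via an exponential moving average.
        disabled = (self.mask == 0) & self.alive
        delta = float(loss) - float(baseline_loss)
        self.cac[disabled] = (self.momentum * self.cac[disabled]
                              + (1.0 - self.momentum) * delta)

    @torch.no_grad()
    def prune(self, threshold):
        # Call periodically after a warm-up phase: channels whose estimated
        # impact on the loss stays below the threshold are removed for good.
        self.alive &= self.cac.abs() >= threshold
```

In a training loop, one would run the gated forward and backward pass as usual, call `update_cac` with the step's loss against a running baseline loss, and invoke `prune` every few epochs after warm-up, so that unimportant channels are gradually removed while training proceeds.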

Paper link: https://ieeexplore.ieee.org/document/9055425