[1] Yihui He, Xiangyu Zhang, and Jian Sun, "Channel Pruning for Accelerating Very Deep Neural Networks," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.
[2] Song Han, Huizi Mao, and William J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," in Proceedings of the International Conference on Learning Representations (ICLR), 2016.
[3] Hao Li, Asim Kadav, Igor Durdanovic, et al., "Pruning Filters for Efficient ConvNets," in Proceedings of the International Conference on Learning Representations (ICLR), 2017.
[4] Liangzhen Lai, Naveen Suda, and Vikas Chandra, "Not All Ops Are Created Equal!," in Proceedings of the Conference on Systems and Machine Learning (SysML), 2018.
[5] Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, et al., "ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design," in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[6] Tien-Ju Yang, Andrew Howard, Bo Chen, et al., "NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications," in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[7] Andrew G. Howard, Menglong Zhu, Bo Chen, et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv preprint arXiv:1704.04861, 2017.
[8] Mark Sandler, Andrew Howard, Menglong Zhu, et al., "MobileNetV2: Inverted Residuals and Linear Bottlenecks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[9] Zhou Fang, Tong Yu, Ole J. Mengshoel, et al., "QoS-Aware Scheduling of Heterogeneous Servers for Inference in Deep Neural Networks," in Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2017.
[10] NVIDIA Triton Inference Server. URL: https://developer.nvidia.com/nvidia-triton-inference-server
[11] H. Yin, P. Molchanov, J. Alvarez, et al., "Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[12] Hanting Chen, Yunhe Wang, Chang Xu, et al., "Data-Free Learning of Student Networks," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.
[13] Jianping Gou, Baosheng Yu, Stephen John Maybank, et al., "Knowledge Distillation: A Survey," arXiv preprint arXiv:2006.05525, 2020.
[14] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean, "Distilling the Knowledge in a Neural Network," in Proceedings of the Workshop on Neural Information Processing Systems (NIPS), 2014.
[15] Yoshua Bengio, Aaron Courville, and Pascal Vincent, "Representation Learning: A Review and New Perspectives," IEEE TPAMI, 35(8):1798–1828, 2013.
[16] Junho Yim, Donggyu Joo, Jihoon Bae, et al., "A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[17] Xiaoliang Dai, Peizhao Zhang, Bichen Wu, et al., "ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[18] Yihui He, Ji Lin, Zhijian Liu, et al., "AMC: AutoML for Model Compression and Acceleration on Mobile Devices," in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[19] Bichen Wu, Xiaoliang Dai, Peizhao Zhang, et al., "FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[20] Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, et al., "DPP-Net: Device-Aware Progressive Search for Pareto-Optimal Neural Architectures," in Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[21] Yanqi Zhou, Siavash Ebrahimi, Sercan Ö. Arık, et al., "Resource-Efficient Neural Architect," arXiv preprint arXiv:1806.07912, 2018.
[22] T. Elsken, J. H. Metzen, and F. Hutter, "Multi-Objective Architecture Search for CNNs," arXiv preprint arXiv:1804.09081, 2018.
[23] Mingxing Tan, Bo Chen, Ruoming Pang, et al., "MnasNet: Platform-Aware Neural Architecture Search for Mobile," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[24] N. Gunantara, "A Review of Multi-Objective Optimization: Methods and Its Applications," Cogent Engineering, vol. 5, no. 1, 2018, Art. no. 1502242.
[25] Kalyanmoy Deb, "Multi-Objective Optimization," in Search Methodologies, 2014.
[26] Donald R. Jones, Matthias Schonlau, and William J. Welch, "Efficient Global Optimization of Expensive Black-Box Functions," Journal of Global Optimization, 13(4):455–492, 1998.
[27] J. Bergstra, D. Yamins, and D. D. Cox, "Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures," in Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
[28] David E. Goldberg and John H. Holland, "Genetic Algorithms and Machine Learning," Machine Learning, 3:95–99, 1988. https://doi.org/10.1023/A:1022602019183
[29] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, et al., "Continuous Control with Deep Reinforcement Learning," arXiv preprint arXiv:1509.02971, 2015.
[30] Suraj Srinivas and R. Venkatesh Babu, "Data-Free Parameter Pruning for Deep Neural Networks," in Proceedings of the British Machine Vision Conference (BMVC), 2015.
[31] Xin Li, Shuai Zhang, Bolan Jiang, et al., "DAC: Data-Free Automatic Acceleration of Convolutional Networks," in Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), 2019.
[32] Markus Nagel, Mart van Baalen, Tijmen Blankevoort, et al., "Data-Free Quantization Through Weight Equalization and Bias Correction," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.
[33] Raphael Gontijo Lopes, Stefano Fenu, and Thad Starner, "Data-Free Knowledge Distillation for Deep Neural Networks," in Proceedings of the Conference on Neural Information Processing Systems (NIPS), 2017.
[34] Matan Haroush, Itay Hubara, Elad Hoffer, et al., "The Knowledge Within: Methods for Data-Free Model Compression," arXiv preprint arXiv:1912.01274, 2019.
[35] Yonathan Aflalo, Asaf Noy, Ming Lin, et al., "Knapsack Pruning with Inner Distillation," arXiv preprint arXiv:2002.08258, 2020.
[36] NVIDIA Nsight Compute Tool. URL: https://developer.nvidia.com/nsight-compute
[37] Alex Krizhevsky, "Learning Multiple Layers of Features from Tiny Images," Technical Report TR-2009, University of Toronto, Toronto, 2009.
[38] Kaiming He, Xiangyu Zhang, Shaoqing Ren, et al., "Deep Residual Learning for Image Recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[39] NVIDIA TensorRT. URL: https://developer.nvidia.com/tensorrt
[40] NVIDIA cuDNN. URL: https://developer.nvidia.com/cudnn
[41] NVIDIA cuBLAS. URL: https://developer.nvidia.com/cublas