[1] A. Zela, T. Elsken, T. Saikia, Y. Marrakchi, T. Brox, and F. Hutter, "Understanding and robustifying differentiable architecture search," in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=H1gDNyrKDS
[2] G. Li, G. Qian, I. C. Delgadillo, M. Müller, A. Thabet, and B. Ghanem, "SGAS: Sequential greedy architecture search," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[4] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[5] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[6] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[7] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated residual transformations for deep neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1492–1500.
[8] J. Bergstra, D. Yamins, and D. Cox, "Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures," in Proceedings of the 30th International Conference on Machine Learning, PMLR, 2013, pp. 115–123.
[9] H. Mendoza, A. Klein, M. Feurer, J. T. Springenberg, and F. Hutter, "Towards automatically-tuned neural networks," in Proceedings of the Workshop on Automatic Machine Learning, PMLR, 2016.
[10] B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning," in International Conference on Learning Representations (ICLR), 2017.
[11] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[12] E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, "Regularized evolution for image classifier architecture search," in Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
[13] H. Liu, K. Simonyan, and Y. Yang, "DARTS: Differentiable architecture search," arXiv preprint arXiv:1806.09055, 2018.
[14] A. Yang, P. M. Esperança, and F. M. Carlucci, "NAS evaluation is frustratingly hard," in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=HygrdpVKvr
[15] L. Xie, X. Chen, K. Bi, L. Wei, Y. Xu, L. Wang, Z. Chen, A. Xiao, J. Chang, X. Zhang, and Q. Tian, "Weight-sharing neural architecture search: A battle to shrink the optimization gap," ACM Computing Surveys, 2022.
[16] X. Chen, L. Xie, J. Wu, and Q. Tian, "Progressive differentiable architecture search: Bridging the depth gap between search and evaluation," in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 1294–1303.
[17] H. Liang, S. Zhang, J. Sun, X. He, W. Huang, K. Zhuang, and Z. Li, "DARTS+: Improved differentiable architecture search with early stopping," arXiv preprint arXiv:1909.06035, 2019.
[18] X. Chen and C.-J. Hsieh, "Stabilizing differentiable architecture search via perturbation-based regularization," arXiv preprint arXiv:2002.05283, 2021.
[19] Y. Xu, L. Xie, X. Zhang, X. Chen, G.-J. Qi, Q. Tian, and H. Xiong, "PC-DARTS: Partial channel connections for memory-efficient architecture search," in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=BJlS634tPr
[20] X. Dong and Y. Yang, "NAS-Bench-201: Extending the scope of reproducible neural architecture search," in International Conference on Learning Representations (ICLR), 2020. [Online]. Available: https://openreview.net/forum?id=HJxyZkBKDr
[21] X. Dong, L. Liu, K. Musial, and B. Gabrys, "NATS-Bench: Benchmarking NAS algorithms for architecture topology and size," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021.
[22] N. S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy, and P. T. P. Tang, "On large-batch training for deep learning: Generalization gap and sharp minima," arXiv preprint arXiv:1609.04836, 2016.
[23] T. Elsken, J. H. Metzen, and F. Hutter, "Neural architecture search: A survey," Journal of Machine Learning Research, 2019.
[24] K. Bi, C. Hu, L. Xie, X. Chen, L. Wei, and Q. Tian, "Stabilizing DARTS with amended gradient estimation on architectural parameters," arXiv preprint arXiv:1910.11831, 2019.
[25] G. Li, X. Zhang, Z. Wang, Z. Li, and T. Zhang, "StacNAS: Towards stable and consistent optimization for differentiable neural architecture search," 2019.
[26] K. Bi, L. Xie, X. Chen, L. Wei, and Q. Tian, "GOLD-NAS: Gradual, one-level, differentiable," arXiv preprint arXiv:2007.03331, 2020.
[27] P. Hou, Y. Jin, and Y. Chen, "Single-DARTS: Towards stable architecture search," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 373–382.
[28] X. Chu, T. Zhou, B. Zhang, and J. Li, "Fair DARTS: Eliminating unfair advantages in differentiable architecture search," in 16th European Conference on Computer Vision (ECCV), 2020. [Online]. Available: https://arxiv.org/abs/1911.12126.pdf
[29] A. Noy, N. Nayman, T. Ridnik, N. Zamir, S. Doveh, I. Friedman, R. Giryes, and L. Zelnik, "ASAP: Architecture search, anneal and prune," in International Conference on Artificial Intelligence and Statistics, PMLR, 2020, pp. 493–503.
[30] S. Hochreiter and J. Schmidhuber, "Flat minima," Neural Computation, vol. 9, no. 1, pp. 1–42, 1997.
[31] P. Foret, A. Kleiner, H. Mobahi, and B. Neyshabur, "Sharpness-aware minimization for efficiently improving generalization," arXiv preprint arXiv:2010.01412, 2020.
[32] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[33] X. Gastaldi, "Shake-shake regularization," arXiv preprint arXiv:1705.07485, 2017.
[34] Y. Mao, G. Zhong, Y. Wang, and Z. Deng, "Differentiable light-weight architecture search," in 2021 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2021, pp. 1–6.
[35] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[36] F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv preprint arXiv:1511.07122, 2015.
[37] C. Finn, P. Abbeel, and S. Levine, "Model-agnostic meta-learning for fast adaptation of deep networks," in Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 1126–1135.
[38] A. Krizhevsky, G. Hinton et al., "Learning multiple layers of features from tiny images," 2009.
[39] P. Chrabaszcz, I. Loshchilov, and F. Hutter, "A downsampled variant of ImageNet as an alternative to the CIFAR datasets," arXiv preprint arXiv:1707.08819, 2017.
[40] H. Xiao, K. Rasul, and R. Vollgraf, "Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms," 2017.
[41] Y. LeCun, C. Cortes, and C. J. Burges, "The MNIST database of handwritten digits." [Online]. Available: http://yann.lecun.com/exdb/mnist/
[42] Z. Zhong, L. Zheng, G. Kang, S. Li, and Y. Yang, "Random erasing data augmentation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, 2020, pp. 13001–13008.
[43] J. Rajasegaran, V. Jayasundara, S. Jayasekara, H. Jayasekara, S. Seneviratne, and R. Rodrigo, "DeepCaps: Going deeper with capsule networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10725–10733.
[44] A. Nøkland and L. H. Eidnes, "Training neural networks with local error signals," in International Conference on Machine Learning, PMLR, 2019, pp. 4839–4850.
[45] G. Cohen, S. Afshar, J. Tapson, and A. van Schaik, "EMNIST: Extending MNIST to handwritten letters," in 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, 2017, pp. 2921–2926.
[46] P. Jeevan and A. Sethi, "WaveMix: Resource-efficient token mixing for images," arXiv preprint arXiv:2203.03689, 2022.
[47] H. Kabir, M. Abdar, S. M. J. Jalali, A. Khosravi, A. F. Atiya, S. Nahavandi, and D. Srinivasan, "SpinalNet: Deep neural network with gradual input," arXiv preprint arXiv:2007.03347, 2020.
[48] V. Jayasundara, S. Jayasekara, H. Jayasekara, J. Rajasegaran, S. Seneviratne, and R. Rodrigo, "TextCaps: Handwritten character recognition with very small datasets," in 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2019, pp. 254–262.