Author: 江玟萱
Author (English): Wen-Hsuan Chiang
Title (English): Neural Network Architecture Optimization Based on Virtual Reward Reinforcement Learning
Advisor: 陳以錚
Advisor (English): Yi-Cheng Chen
Degree: Master's
Institution: National Central University (國立中央大學)
Department: Information Management (資訊管理學系)
Discipline: Computing
Academic Field: General Computer Science
Year of Publication: 2020
Graduation Academic Year: 108 (ROC calendar; 2019–2020)
Language: English
Number of Pages: 41
Keywords: Neural Architecture Search, Reinforcement Learning, Proximal Policy Optimization, Neural Network Optimization, Machine Learning
Abstract: In recent years, machine learning has become more and more popular, and more and more scholars, practitioners, and engineers conduct related research and applications. If they do not understand the data well, the features they extract by hand may lead to misinterpretation of the information or bias in the model, because those features are what the model learns from. To avoid these problems with manual feature extraction, we can let a machine construct the neural network itself. Our research uses a predictor to build a virtual map, and uses this virtual map to train an agent to find good neural network architectures. Because we vary the reward function, we propose four models in this research. In the experiments, we analyze the results of every parameter setting for the four models and observe the importance of model stability: if a model is unstable, the gap between the accuracies it obtains may be too large. Our model shows good performance in both accuracy and stability.
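The abstract describes a surrogate-reward loop: a predictor learns to map an architecture encoding to its expected validation accuracy (the "virtual map"), and the reinforcement-learning agent is rewarded with the predictor's output instead of the result of a costly real training run, with PPO as the policy-optimization algorithm. The sketch below only illustrates that idea under assumed interfaces; `AccuracyPredictor`, `virtual_reward`, `ppo_clip_loss`, and the 16-dimensional encoding are hypothetical names and choices, not the thesis's actual VR-PPO implementation.

```python
# Minimal sketch of the virtual-reward idea, assuming PyTorch and an
# illustrative fixed-length architecture encoding. All names here are
# hypothetical and not taken from the thesis.
import torch
import torch.nn as nn


class AccuracyPredictor(nn.Module):
    """Maps an architecture encoding to a predicted validation accuracy.

    Fitted on a small sample of architectures whose real accuracy has been
    measured, it then acts as the "virtual map" that answers reward queries
    without training the candidate network.
    """

    def __init__(self, encoding_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(encoding_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # predicted accuracy in [0, 1]
        )

    def forward(self, arch_encoding: torch.Tensor) -> torch.Tensor:
        return self.net(arch_encoding).squeeze(-1)


def virtual_reward(predictor: AccuracyPredictor,
                   arch_encoding: torch.Tensor) -> float:
    """Virtual reward: query the predictor instead of training the network."""
    with torch.no_grad():
        return predictor(arch_encoding).item()


def ppo_clip_loss(new_logp: torch.Tensor,
                  old_logp: torch.Tensor,
                  advantage: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Standard PPO clipped surrogate loss; here the advantage would be
    derived from virtual rewards rather than real training results."""
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()


if __name__ == "__main__":
    # Query the (untrained) predictor with a random 16-dimensional encoding.
    predictor = AccuracyPredictor(encoding_dim=16)
    print("virtual reward:", virtual_reward(predictor, torch.rand(16)))
```

In a full pipeline of this kind, the predictor would first be fitted on a small set of sampled architectures with measured accuracy (the data sampling and map construction step listed in the table of contents below), after which the agent's rollouts only query the predictor.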
Table of Contents:
Chinese Abstract ii
Abstract iii
Table of contents iv
List of Figures v
List of Tables v
1. Introduction 1
2. Related Work 5
2.1 Neural Architecture Search 5
2.2 Reinforcement Learning 7
3. Methodology 9
3.1 Model Architecture 10
3.2 Data Sampling and Map Construction 11
3.3 Predictor 12
3.4 VR-PPO 13
4. Performance Evaluation 16
4.1 Accuracy Discussion 17
4.2 Parameter Setting of Predictor 18
4.3 Parameter Setting of VR-PPO 20
4.4 Reward Discussion (Virtual versus Real) 24
5. Conclusion 31
References 32
Electronic Full Text: available online from 2023-07-31