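Author: Chia-Hung Lai (賴家弘)
Title: A Reinforcement Learning Control System for Financial Speculation (加強式學習控制系統應用於金融市場操作)
Advisor: Chi-Cheng Jou (周志成)
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Electrical and Control Engineering (電機與控制工程系)
Discipline: Engineering
Subject category: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87
Language: Chinese
Number of pages: 51
Keywords: reinforcement learning (加強式學習)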
This thesis treats stock market speculation as a control problem. Stock price models are difficult to build and are time-varying, whereas conventional control theory is largely built on a complete mathematical model of the dynamic behavior of the controlled plant. Most conventional methods are therefore poorly suited to the stock market problem. To overcome this, we seek a control method that requires no model of the plant and that adapts as the time-varying plant changes. Reinforcement learning meets these requirements: learning needs neither a model of the environment (the market) nor guidance from an expert teacher, the controller improves its own performance by adjusting its internal parameters as it interacts with the changing environment, and the learning process eventually induces a corresponding trading strategy automatically. In the empirical part, the method is applied to the Taiwan stock market. Because stock price signals have long consolidation periods and large variations in frequency, the experimental results show a delayed trading response, but the long-term returns are still satisfactory. We conclude that reinforcement learning shows preliminary success for financial market speculation.
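To make the model-free idea concrete, the following is a minimal sketch, not the thesis's implementation. It applies tabular Q-learning (the table of contents lists Q-learning alongside a neural-network implementation, which this sketch swaps for a plain table) to a synthetic sine-wave price, echoing the sinusoidal test signal of Chapter 4. The state feature (`trend_state`), the two-action set, the reward definition, and all parameter values are assumptions chosen for illustration.

```python
import math
import random

ACTIONS = (0, 1)  # hypothetical action set: 0 = stay out of the market, 1 = hold one unit


def sine_prices(n=400, period=40, base=100.0, amplitude=10.0):
    """Synthetic price series: a pure sine wave around `base` (an assumption)."""
    return [base + amplitude * math.sin(2 * math.pi * t / period) for t in range(n)]


def trend_state(prices, t):
    """Crude hypothetical feature: 1 if the most recent move was up, else 0."""
    return 1 if prices[t] > prices[t - 1] else 0


def train_q_table(prices, episodes=30, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; no price model is used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    for _ in range(episodes):
        for t in range(1, len(prices) - 1):
            s = trend_state(prices, t)
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                       # explore
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])   # exploit
            # Reinforcement signal: profit or loss of the chosen position over one step.
            reward = a * (prices[t + 1] - prices[t])
            s_next = trend_state(prices, t + 1)
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


def greedy_policy(q):
    """Best action for each state according to the learned table."""
    return [max(ACTIONS, key=lambda act: q[s][act]) for s in range(len(q))]


def total_profit(prices, policy):
    """Profit from following `policy` over the series, ignoring transaction costs."""
    profit = 0.0
    for t in range(1, len(prices) - 1):
        profit += policy[trend_state(prices, t)] * (prices[t + 1] - prices[t])
    return profit


prices = sine_prices()
q_table = train_q_table(prices)
policy = greedy_policy(q_table)
profit = total_profit(prices, policy)
```

Because the state only looks at the last price move, the learned policy reacts one step after each turning point. This is a toy analogue of the delayed trading response reported in the abstract, not a reproduction of the thesis's experiments.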

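Table of Contents
Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
List of Symbols
Chapter 1 Introduction
1.1 Research Motivation and Problem Statement
1.2 Thesis Organization
Chapter 2 Reinforcement Learning Control Systems
2.1 Introduction
2.2 Reinforcement Learning Control Systems
2.2.1 Reinforcement Signal
2.2.2 Associative and Non-associative Learning
2.2.3 Trial and Error
2.2.4 LRI Learning
2.3 Q-Learning
2.4 Neural Network Implementation of the Learning Rules
2.4.1 Control Search Unit
2.4.2 State Evaluation Unit
Chapter 3 Applying the Reinforcement Learning Control System to Financial Trading
3.1 Overall Architecture
3.2 Feature Extraction (Technical Indicators)
3.3 Learning Procedure
Chapter 4 Experimental Results
4.1 Sinusoidal Signal
4.2 Stock Price Signal
4.3 Comparison and Discussion of Results
Chapter 5 Conclusion
References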

