National Digital Library of Theses and Dissertations in Taiwan

Author: Yi-Ting Tsao (曹怡亭)
Title: Laplacian Based State-value Function Transfer in Discrete Reinforcement Learning
Title (Chinese): 在增強式學習中以拉普拉斯運算為基礎做離散狀態值函式轉換
Advisor: Von-Wun Soo (蘇豐文)
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2008
Language: English
Keywords: Reinforcement Learning, Transfer Learning, Laplacian

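Abstract:

Learning two similar problems separately can waste time, because the same subproblems end up being learned more than once. Transfer learning aims to shorten the time needed to learn a similar problem by reusing the knowledge acquired from another problem. Much previous work has focused on using transfer functions to carry knowledge from one problem to another, but designing a transfer function requires a thorough understanding of the problem's characteristics, which is difficult even for an expert in that problem. We therefore propose a Laplacian-based transfer method. We modify the original Laplacian so that it reflects not only the topological properties of the problem but also the rewards required in reinforcement learning. We also briefly describe the properties of this method and explain why it can speed up learning. With this method, knowledge can be transferred directly between two reinforcement learning problems without a transfer function. In this thesis we study three types of transfer, and the experimental results show that this kind of transfer helps reduce learning time.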
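For readers unfamiliar with the operator the abstract refers to, the following is a minimal sketch of the standard combinatorial graph Laplacian, L = D - W, built over the state graph of a small discrete problem. It is not the thesis's modified Laplacian, whose definition is not given in this record. The function names `graph_laplacian` and `apply_laplacian`, the four-state chain, and the unit edge weights are all illustrative assumptions.

```python
def graph_laplacian(n, edges):
    """Return the combinatorial Laplacian L = D - W as a list of lists.

    n     -- number of states (graph nodes)
    edges -- iterable of (i, j, w): an undirected edge between distinct
             states i and j with positive weight w (hypothetical format)
    """
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        L[i][i] += w   # degree contribution for state i
        L[j][j] += w   # degree contribution for state j
        L[i][j] -= w   # off-diagonal entries hold negative edge weights
        L[j][i] -= w
    return L


def apply_laplacian(L, f):
    """Apply L to a function f defined on the states (a list of values)."""
    n = len(f)
    return [sum(L[i][j] * f[j] for j in range(n)) for i in range(n)]


# A four-state chain 0 - 1 - 2 - 3 with unit weights (assumed example).
chain = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
L = graph_laplacian(4, chain)

# A constant function lies in the null space of L.
constant = apply_laplacian(L, [1.0, 1.0, 1.0, 1.0])

# A linear ramp is "flat" in the interior and nonzero only at the ends.
ramp = apply_laplacian(L, [0.0, 1.0, 2.0, 3.0])
```

With unit weights the operator depends only on topology. One plausible place for reward information to enter would be the edge weights `w`, but that is an assumption made here for illustration and should not be read as the method proposed in the thesis.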

Table of contents:

Abstract
Acknowledgement
Contents
Chapter 1 Introduction
1.1 Problem Statement
1.2 Related Work
1.3 The Transfer Types
Chapter 2 Background
2.1 Markov Decision Process
2.2 Reinforcement Learning
2.3 Laplacian
2.4 Example
Chapter 3 Methodology
3.1 The Modified Laplacian
3.2 The Property
3.3 The Transfer Method
Chapter 4 Experiments
4.1 The Scaling Cases
4.2 The Topological Transfer Cases
4.3 The Reward Change Case
4.4 The Maze Case
Chapter 5 Discussions
Chapter 6 Future Work
Reference
