Thesis title (English): Problem Solving Recommendation Model based on Machine Learning for an Issue Tracking System
Advisor (English): Hei-Chia Wang
Keywords (English): Issue Tracking System; Issue Classification; Text Summarization
The global spread of digitalization has increased reliance on software across all industries, and the demand for customized software keeps growing. Faced with change requests from many different users, project managers rely on a management tool, the Issue Tracking System (ITS), to ensure that every item is accurately tracked and executed. However, the items recorded in an ITS are wide-ranging, and filtering or classifying them manually is cumbersome, time-consuming, and unproductive. In addition, each issue report contains a large amount of descriptive natural language, and with the limited query conditions an ITS offers, similar issues are hard to find. As a result, the same problem is often reported by several people, which increases both the developers' workload and the cost of the whole project.
Many earlier studies have proposed automated methods to classify or group issue reports, but most of them focus on categorizing issues by severity or on finding relations between issues. The reply records of an issue report, however, usually contain the processing history and the solution, which are useful information for the assignee.
This paper proposes a solution recommendation model that classifies issues and automatically summarizes their solutions. The experimental results indicate that the model helps the assignee obtain the solutions to similar issues, thereby improving processing efficiency.
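1. Introduction
1.1. Research Background and Motivation
1.2. Research Objectives
1.3. Research Scope and Limitations
1.4. Research Process
1.5. Thesis Organization
2. Literature Review
2.1. Issue Tracking System
2.2. Feature Selection
2.2.1. Term Frequency
2.2.2. Term Frequency-Inverse Document Frequency
2.2.3. N-gram Model
2.3. Classification Methods
2.3.1. K-Nearest Neighbors
2.3.2. Naive Bayes Classifier
2.3.3. Support Vector Machine
2.3.4. Random Forest
2.3.5. Cross-Validation
2.4. Clustering Methods
2.4.1. K-means
2.4.2. Agglomerative Hierarchical Clustering
2.4.3. Average Silhouette Method
2.5. Automatic Extractive Summarization Methods
2.5.1. TextRank
2.6. Summary
3. Research Method
3.1. Research Framework
3.2. Data Preprocessing
3.2.1. Text Processing
3.3. Function Classification
3.3.1. Random Forest Model Training
3.4. Issue Clustering
3.5. Automatic Extraction of Issue Note Summaries
4. System Implementation and Verification
4.1. System Environment
4.2. Experimental Design
4.2.1. Experimental Dataset
4.2.2. Evaluation Metrics
4.2.3. Experiment 1: Function Classification
4.2.4. Experiment 2: Issue Clustering
4.2.5. Experiment 3: Automatic Summarization
4.3. Discussion of Experimental Results
5. Conclusion and Future Research
5.1. Research Contributions and Conclusion
5.2. Future Research Directions
References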
