Graduate student (English name): Yu-Hsin Lin
Thesis title (English): The Extraction of Process Information from Chinese Text Document and to Construct Process Model
Advisor (English name): Chao Ou-Yang
Keywords (English): Chinese Word Segmentation; Information Extraction
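Therefore, this research establishes a set of rules for extracting the required information from a process description of any length in a process document, and uses Event-driven Process Chains (EPC) as the tool for expressing business process diagrams. A Chinese word segmentation system is used to simplify the process description. Rules for extracting the required information, together with a logic algorithm for the order of process events, are then used to identify the process information and build a complete process model. Finally, performance measures such as precision and recall are used to determine whether the information extracted after segmentation is indeed the information the process requires. The process diagram constructed from the process text is then compared with the original process diagram in terms of process similarity, so that users can see how much the process text differs from the original diagram.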
Due to the rapid development of enterprises globally, the documents used to describe business processes have become very tedious, and it can be difficult to find the required information in them. In addition, the process diagrams embedded in the documents are often not properly mapped to the text description. This may be due to either a very detailed text description paired with a simplified process diagram, or vice versa. In both situations, the inconsistent information between the text and the process diagram may mislead readers.
This research proposes a method to extract process-related information from Chinese text documents and to construct the corresponding process diagram. An approach using word and sentence segmentation is developed. The segmented words and sentences are extracted and tagged. An algorithm is developed to compute the precedence and logical relationships among the tagged words and sentences and to construct the related process models. The constructed models are compared with the process diagrams embedded in the original document using indices such as precision and recall. The computed values can be used as a reference to analyze the differences between the constructed and original process models.
This research will address both English and Chinese documents over a two-year period.
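The precision and recall evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the extracted process information and the reference information are both available as sets of function labels. The function name `precision_recall` and the sample labels are hypothetical and are not taken from the thesis.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) for extracted items against a reference set."""
    extracted, reference = set(extracted), set(reference)
    # Items that were extracted and also appear in the reference.
    true_positives = len(extracted & reference)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference items that were found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical function labels for illustration only.
extracted = {"receive order", "check inventory", "print invoice", "ship goods"}
reference = {"receive order", "check inventory", "ship goods",
             "notify customer", "close order"}

p, r = precision_recall(extracted, reference)
print(f"precision={p:.2f}, recall={r:.2f}")
```

In this sketch three of the four extracted labels appear in the reference set, and three of the five reference labels were found. Exact string matching is a simplifying assumption; the thesis may match extracted items differently.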
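Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Definition of the Research Problem
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Information Processing
2.1.1 Information Extraction
2.1.2 Chinese Word Segmentation Methods
2.2 Knowledge Analysis
2.2.1 Word Segmentation and Part-of-Speech Tagging
2.2.2 Temporal Knowledge Representation
2.2.3 Temporal Relation Inference
2.3 Performance Evaluation
2.3.1 Precision and Recall
2.3.2 Process Similarity
Chapter 3 Research Method
3.1 Conceptual Phase
3.1.1 Problem Domain Analysis
3.1.2 Explanation of the Conceptual-Phase Process
3.2 Design Phase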
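3.2.1 Information Extraction
  Sentence Segmentation
  Finding Verbs
  Extracting Functions
  Precedence and Logic Judgment Rules
  Verification of Functions, Temporal Order, and Logic
3.2.3 Drawing the Process Diagram
3.2.4 Comparison and Analysis
  Organizing Process Information
  Process Diagram Similarity Calculation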
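3.3 Implementation Phase
Chapter 4 Empirical Study
4.1 System Architecture
4.1.1 Word Segmentation System
4.1.2 User Interface
4.1.3 EPC Tools Software
4.1.4 Implementation Environment
4.1.5 Implementation Limitations
4.2 Case Validation
4.2.1 Process Text "Finished Goods Shipment"
4.2.2 Process Text "Material Requirements Planning"
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Research Contributions
5.3 Suggestions for Future Research
References
Appendix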
