跳到主要內容

臺灣博碩士論文加值系統

(54.224.117.125) 您好!臺灣時間:2022/01/23 19:43
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:洪健哲
研究生(外文):Chien-Che Hung
論文名稱:XML文件漸進式結構化索引更新與路徑表示式檢索之研究
論文名稱(外文):Incremental Structured Index Update Mechanism and Path Expression Query for XML Documents
指導教授:柯皓仁柯皓仁引用關係楊維邦楊維邦引用關係
指導教授(外文):Hao-Ren KeWei-Pang Yang
學位類別:碩士
校院名稱:國立交通大學
系所名稱:資訊科學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2002
畢業學年度:90
語文別:中文
論文頁數:59
中文關鍵詞:漸進式結構化索引更新k-ary樹狀結構路徑表示式檢索分裂方法可擴展標示語言
外文關鍵詞:Incremental Structured Index Updatek-ary TreePath Expression QuerySplit MethodXML
相關次數:
  • 被引用被引用:2
  • 點閱點閱:182
  • 評分評分:
  • 下載下載:19
  • 收藏至我的研究室書目清單書目收藏:2
XML文件中隱含了豐富的結構化資訊,為了能利用這些結構化資訊提供結構化檢索(Structured Query),本論文利用k-ary 樹狀結構的特性來建立結構化資訊索引(Structured Index)。基於此索引結構,本論文提出一套漸進式結構化索引更新機制來避免文件結構更新時需重新建置結構化資訊索引,此機制稱為Split Method。此外,本論文實作了一套SQL-like的路徑表示式檢索(Path Expression Query),可讓使用者在結構資訊不規則或不明確的情況下透過結構化資訊索引找出滿足檢索路徑條件的資料,並將檢索結果以XML呈現。
在結構化文件索引更新中,若加入資料的節點分支數超出索引結構之最大分支數k時,傳統的方法係將結構索引重建,當結構資料時常變更時,此索引結構重建的步驟相當耗時。有鑑於此,本論文乃提出一個漸進式結構化索引更新的機制來避免索引重新建置。此方法主要利用資料分屬不同文件的概念,即分裂的節點所屬之文件編號(Document Identifier, DID)與原始資料之DID的不同。當加入資料的節點超出索引結構之最大分支數時,從該節點分裂出一個節點,而分裂的節點另屬於不同的DID,並定義一個對等關係來記錄該節點與分裂節點間的關連性。
從效益評估結果顯示,雖然本論文提出之Split Method需要較多的索引空間花費,以本論文中的實驗為例,索引空間花費的Overhead約介於2.38%與17.03%之間,然而在索引更新的效能上大大優於索引重建的方法,且對於實際檢索的效能,我們的方法與原來索引重建的方法相較之下並未相差太大。
XML documents contain abundant structural information that can be employed to better understand the XML documents. Structured queries are queries that search the structural information of XML documents. In order to support structured queries, k-ary trees are used in this thesis as an index structure to store structural information. Based on this index structure, this thesis proposes an incremental structured index update mechanism and implements a few functions of a SQL-like path expression query language, called Lorel.
K-ary trees are a well-known index structure can be used to store structural information of XML documents, where k is a predefined maximum number of branches that a tree node can have. When an XML document is modified, its corresponding k-ary tree has to be updated as well for reflecting the modification. However, when a modification makes the number of branches of a tree node exceeds k, the k-ary tree is no longer workable and the whole k-ary tree has to be reconstructed. Reconstructing the whole k-ary tree is time-consuming when the structure information changes frequently. This thesis proposes an incremental structured index update mechanism called the Split Method to avoid reconstructing the k-ary tree when the number of branches of a node exceeds k. Through splitting nodes and defining an equivalent relationship, this mechanism can update k-ary trees efficiently. Several evaluations were conducted to assess the Split Method and the k-ary tree reconstruction. The evaluation shows that the Split Method only wastes a little more space than tree reconstruction, but it is more efficient than tree reconstruction in updating k-ary trees.
In addition, based on the k-ary tree structured index, this thesis implements a few functions of a SQL-like path expression query language, called Lorel. Path expression queries facilitate user to issue structural queries when the document structure is irregular or unknown in advance. The results of queries are formatted as XML documents.
英文摘要 I
中文摘要 III
誌謝 V
目錄 VI
表目錄 VIII
圖目錄 IX
方程式目錄 X
第一章 緒論 1
第一節 研究動機 1
第二節 研究目的 3
1.2.1 避免結構索引重建 3
1.2.2 實作XML檢索語言 4
第三節 本論文內容與架構 4
第二章 相關研究 5
第一節 XML技術背景介紹 5
2.1.1 概要 5
2.1.2 資料為主(Data-Centric)與文件為主(Document-Centric)的XML文件 7
2.1.3 XML文件物件模型(DOM) 8
第二節 XML檢索語言 9
第三節 結構化文件索引 11
第三章 漸進式結構化索引更新方法之設計 14
第一節 XML文件檢索系統架構 14
第二節 XML結構化資訊索引建置流程 15
第三節 漸進式結構化索引更新(Incremental Structured Index Update) 19
3.3.1 更新節點內容(Update Element Content) 20
3.3.2 刪除節點(Delete Element) 20
3.3.3 新增節點(Insert Element) 22
3.3.4 漸進式索引更新方法 ─ Split Method 23
第四章 路徑表示式檢索實作說明 26
第一節 簡易路徑檢索(Simple Path Query) 28
第二節 一般路徑檢索(General Path Query) 29
第三節 路徑條件檢查 33
第四節 XML輸出結果 35
第五章 XML檢索系統實作與結果分析 37
第一節 實作系統展示與說明 37
第二節 效益評估方法 40
5.2.1 索引空間花費 40
5.2.2 索引更新時間花費 41
5.2.3 檢索時間花費 42
第三節 索引更新方法之效益評估 44
5.3.1 索引空間花費比較 45
5.3.2 索引更新時間花費比較 48
5.3.3 檢索時間花費比較 51
5.3.4 索引重建 54
第六章 結論與未來研究方向 56
第一節 結論 56
第二節 未來研究方向 57
參考文獻 58
1. [Abiteboul97] S. Abiteboul, D. Quass, J. McHugh, J. Widom, J. L. Wiener. “The Lorel query language for semistructured data,” International Journal on Digital Libraries, Volume 1, Issue 1, 1997, pages 68-88.
2. [Bertino01] E. Bertino, B. Catania. “Integrating XML and Databases,” IEEE Internet Computing, Volume 5, Issue 4, July-Aug. 2001, pages 84-88.
3. [Bonifati02] A. Bonifati, D. Braga, A. Campi, and S. Ceri. “Active XQuery,” Data Engineering, 2002. Proceedings. 18th International Conference on, 2002, pages 403-412.
4. [Bonifati00] A. Bonifati and S. Ceri. “Comparative Analysis of Five XML Query Languages,” SIGMOD RECORD, Volume 29, Number 1, 2000, pages 68-79.
5. [Bryan92] M. Bryan. “An Introduction to the Standard Generalized Markup Language (SGML),” 1992, http://www.personal.u-net.com/~sgml/sgml.html
6. [Ceri99] S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi, L. Tanca. “XML-GL: A Graphical Language for Querying and Restructuring XML Documents,” Proc. WWW8, Toronto, Canada, May 1999.
7. [Chamberlin01] D. D.Chamberlin. “Query languages and XML,” Database Engineering & Applications, 2001 International Symposium on, 2001, pages 297-300
8. [CML] Chemical Markup Language (CMLTM) http://www.xml-cml.org/
9. [Deutsch98] A. Deutsch, M. Fernandez, D. Florescu, Alon Levy and D. Suciu. “XML-QL: A Query Language for XML,” In Proc. of the Query Languages workshop (QL98), Cambridge, Mass., December 1998, http://www.w3.org/TR/NOTE-xml-ql
10. [DOM98] Document Object Model (DOM). http://www.w3.org/DOM/
11. [Flesca02] S. Flesca, F. Furfaro, and S. Greco. “A Graphical XML Query Language,” Data Engineering, 2002. Proceedings. 18th International Conference on, 2002, pages: 264.
12. [Goldman99] R. Goldman, J. McHugh, and J. Widom. “From Semistructured Data to XML: Migrating the Lore Data Model and Query Language,” In 2nd ACM SIGMOD Int. Workshop on The Web and Databases, 1999, pages 25-30.
13. [Jang99] H. Jang, Y. Kim, and D. Shin. "An effective mechanism for index update in structured documents," Proceedings of the eighth international conference on Information knowledge management November 1999, pages 383-390.
14. [Kanemoto98] H. Kanemoto, H. Kato, H. Kinutani, and M. Yoshikawa. “An efficiently updatable index scheme for structured documents,” Database and Expert Systems Applications, 1998. Proceedings. IEEE Computer Society, Ninth International Workshop on, 1998, pages 991-996.
15. [Larson01] P. Larson. “XML Data Management Go Native or Spruce up Relational Systems?,” ACM SIGMOD 2001 Santa Barbara, California, May 21-24, 2001 (panel abstract)
16. [Lee96] Y. K. Lee, S. J. Yoo, and K. Yoon, “Index Structures for Structured Documents,” Proceedings of the 1st ACM international conference on Digital libraries, 1996, pages 91-99.
17. [MathML] Mathematical Markup Language (MathML™) http://www.w3.org/TR/REC-MathML/
18. [Robie98] J. Robie, J. Lapp and D. Schach. “XML Query Language (XQL),” In Proc. of the Query Languages workshop, Cambridge, Mass., Dec. 1998, http://www.w3.org/TandS/QL/QL98/pp/xql.html
19. [Roy00] J. Roy, A. Ramanujan. ”XML: data''s universal language,” IT Professional, Volume 2, Issue 3, May-June 2000
pages 32-36.
20. [XML98] Extensible Markup Language (XML) 1.0 http://www.w3.org/XML
21. [XML00] Extensible Markup Language (XML) 1.0 (Second Edition) W3C Recommendation 6 October 2000 http://www.w3.org/TR/2000/REC-xml-20001006
22. [XSL01] The Extensible Stylesheet Language (XSL) Version 1.0, Oct. 2001, http://www.w3.org/TR/xsl
23. [XSLT99] XSL Transformations (XSLT) Version 1.0, Nov. 1999, http://www.w3.org/TR/xmlt
24. [Zisman00] A. Zisman. "An overview of XML," Computing and Control Engin. J., Volume 11, Aug. 2000, pages 165-167.
25. [黃中杰00] 黃中杰、王天利,「XML新網頁語言開發手冊」,知城數位,2000年十二月。
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top