



Graduate student (English): Su-Hsien Huang
Thesis title (English): Metadata Architecture for Digital Library Integration
Advisors (English): Wei-Pang Yang, Hao-Ren Ke
Keywords (English): digital library integration; metadata model; data extraction; semantic query; semantic inference
Metadata plays an essential role in integrating heterogeneous digital libraries (DLs). However, conventional metadata architectures are insufficient to achieve interoperability among DLs because of semantic heterogeneity and the lack of structural consideration in metadata formats. This dissertation proposes a novel metadata architecture, called M-Architecture@DL, that integrates DLs seamlessly from the perspective of metadata. M-Architecture@DL follows the Model-Extraction-Query (MEQ) model to obtain more permanent and explicit knowledge in the process of DL querying. It contains three layers: the metadata modeling layer, the data extraction layer, and the semantic query layer. Separating M-Architecture@DL into three layers achieves format, protocol, and semantic interoperability, one in each layer.
The metadata modeling layer uses the Metadata Modeling Language (MML) to describe real-world entities. MML adopts XML as its syntax and extends the Resource Description Framework (RDF) by adding name hierarchy references. MML provides two constructors, the tuple constructor and the set constructor, to represent structures. With these two constructors, metadata can be translated by applying operations to its attributes. This layer achieves format interoperability.
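As a rough illustration of translating metadata by operating on its attributes, the following Python sketch models a tuple-like structure as a dict and a set-like structure as a list. This encoding, the function name `rename_attributes`, the sample record, and the attribute mapping are all assumptions made for illustration; they are not MML syntax and are not taken from the dissertation.

```python
def rename_attributes(node, mapping):
    """Recursively rename attribute names in a nested structure.

    Assumption: a dict stands in for a tuple-constructed value (named
    attributes) and a list stands in for a set-constructed value (a
    collection of members). Names missing from the mapping are kept.
    """
    if isinstance(node, dict):
        return {mapping.get(key, key): rename_attributes(value, mapping)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_attributes(member, mapping) for member in node]
    return node  # atomic value: nothing to rename


# Hypothetical source record and hypothetical target attribute names.
book = {
    "title": "Digital Libraries",
    "authors": [{"name": "A. Author"}, {"name": "B. Writer"}],
}
mapping = {"title": "dc:title", "authors": "dc:creator"}

translated = rename_attributes(book, mapping)
print(translated)
```

Renaming is only one possible operation; restructuring or merging attributes could be sketched in the same recursive style.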
The data extraction layer collects data from distributed DLs and encapsulates the results as MML metadata. Data from DL services with similar structure can be extracted into metadata automatically by means of their common structure. The first step of extraction is to assign level IDs to a sample document and determine the common part to be extracted. Then an extraction algorithm called Metadata Extractor extracts the documents according to the common structure. This layer provides a transparent way to collect information over HTTP without prior arrangement with the distributed DLs, which saves much effort. This layer therefore achieves protocol interoperability.
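The two steps described above can be sketched in Python as follows. This is a simplified stand-in, not the Metadata Extractor algorithm itself: it assumes that a level ID is the nesting depth of a text node in well-formed HTML and that the common structure is given as a set of tag paths chosen from the sample document. The class `LevelLabeler`, the helpers `label` and `extract`, and the sample pages are hypothetical.

```python
from html.parser import HTMLParser


class LevelLabeler(HTMLParser):
    """Records (level, tag path, text) for every non-empty text node.

    Assumption: the markup is properly nested and has no void tags
    such as <br>, which would need special handling.
    """

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.nodes = []   # collected (level, path, text) triples

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append((len(self.stack), "/".join(self.stack), text))


def label(html):
    """Step 1: assign a level to each text node of a document."""
    parser = LevelLabeler()
    parser.feed(html)
    parser.close()
    return parser.nodes


def extract(html, common_paths):
    """Step 2: keep only the text found at the chosen common paths."""
    record = {}
    for _level, path, text in label(html):
        if path in common_paths:
            record.setdefault(path, []).append(text)
    return record


# Hypothetical result pages that share one structure.
sample = "<html><body><h1>Catalog</h1><div><b>Title A</b><i>Author A</i></div></body></html>"
other = "<html><body><h1>Catalog</h1><div><b>Title B</b><i>Author B</i></div></body></html>"

labels = label(sample)
common_paths = {"html/body/div/b", "html/body/div/i"}  # picked by hand from the sample
result = extract(other, common_paths)
print(labels)
print(result)
```

In this sketch the common part is selected manually from the labeled sample; determining it automatically is outside what the example covers.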
The semantic query layer retrieves metadata semantically by adding relationships to query statements. A Content and Service Inference Model (CSIM) is proposed to derive 15 relationships from two essential aspects of a DL: content and services. These 15 structural relationships yield operations that manipulate metadata in a query predicate and let a query carry as much semantics as possible. Both content and service queries are presented to derive more semantic answers in a DL search. This layer achieves semantic interoperability.
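The general idea of adding a relationship to a query can be illustrated with a small Python sketch. It is not CSIM: the 15 relationships are not listed here, so the relationship names `uses` and `returns`, the triples, and the function names are invented purely for illustration.

```python
# Hypothetical (subject, relationship, object) triples describing
# DL services and the content they deliver.
TRIPLES = [
    ("union_catalog", "uses", "opac_search"),
    ("opac_search", "returns", "bibliographic_record"),
    ("thesis_search", "returns", "thesis_record"),
]


def related(subject, relationship, triples):
    """Objects linked to a subject by the given relationship."""
    return [o for s, r, o in triples if s == subject and r == relationship]


def query(keyword, triples, relationship=None):
    """Keyword match on subjects, optionally followed by one relationship.

    Without a relationship this behaves like a plain keyword search;
    with one, the answer is what the matching subjects are related to.
    """
    hits = sorted({s for s, _r, _o in triples if keyword in s})
    if relationship is None:
        return hits
    return sorted({o for s in hits for o in related(s, relationship, triples)})


print(query("search", TRIPLES))             # keyword-only answer
print(query("search", TRIPLES, "returns"))  # answer extended by a relationship
```

The second call shows how a relationship in the predicate changes the answer from the matching services to the content those services return.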
Experiments indicate that M-Architecture@DL performs well in DL integration. The results show that both accuracy and coverage improve over a conventional keyword-based approach. Adopting M-Architecture@DL can also alleviate the administrative load. When developing novel DL services, such as library resource planning and a virtual union catalog system, librarians are offered alternative answers for combining existing DL components. The reuse of DL services and metadata is the future trend in DL integration.
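The abstract does not define accuracy and coverage. One common reading, assumed here and not confirmed by the source, treats accuracy as the share of retrieved answers that are relevant and coverage as the share of relevant answers that are retrieved. Under that assumption the two metrics could be computed as in this Python sketch, where the function name and the sample answer sets are hypothetical.

```python
def accuracy_and_coverage(retrieved, relevant):
    """Return (accuracy, coverage) under the assumed definitions.

    accuracy = |retrieved and relevant| / |retrieved|
    coverage = |retrieved and relevant| / |relevant|
    An empty denominator yields 0.0 instead of raising an error.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    accuracy = len(hits) / len(retrieved) if retrieved else 0.0
    coverage = len(hits) / len(relevant) if relevant else 0.0
    return accuracy, coverage


# Made-up answer sets, used only to exercise the function.
acc, cov = accuracy_and_coverage(["a", "b", "c", "d"], ["a", "b", "e"])
print(acc, cov)
```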
Chapter 1 Introduction
1.1 Background
1.2 Related work
1.2.1 Digital library architecture
1.2.2 Metadata model
1.2.3 Metadata architecture
1.3 Goal
Chapter 2 Digital Library Integration Architecture
2.1 Conceptualization
2.2 Scenario
2.3 M-Architecture@DL
Chapter 3 Metadata Modeling
3.1 Data model
3.2 Structure expression
3.3 Translation service
3.4 Implementation (Metadata Modeling Language – MML)
3.5 Comparison
Chapter 4 Data Extraction
4.1 Structure hierarchy
4.1.1 Level-ID assignment
4.1.2 Auxiliary table
4.2 Common structure
4.3 Implementation
4.3.1 Pre-processing phase
4.3.2 Structure labeling phase
4.3.3 Data extraction phase
Chapter 5 Semantic Query
5.1 Content and service inference model (CSIM)
5.1.1 Relationships between content and services
5.1.2 Basic definitions
5.1.3 Definitions of content and service relationships
5.1.4 Manipulating operations
5.2 Semantic digital library query
5.2.1 Query language
5.2.2 EXACT and AMBIGUOUS query
5.2.3 Ranking function
5.3 Implementation
Chapter 6 Experiments
6.1 Experimental set-up and approach
6.2 Experimental metrics
6.3 Experiment results
Chapter 7 Concluding Remarks
Bibliography
Appendix