|
[1] Cascading Style Sheets (CSS). http://www.w3.org/Style/CSS/
[2] Document Object Model (DOM). http://www.w3.org/DOM/
[3] Bergman, M.K., The Deep Web: Surfacing Hidden Value. Journal of Electronic Publishing, 2001. 7(1): p. 07-01.
[4] Buyukkokten, O., H. Garcia-Molina, and A. Paepcke, Accordion summarization for end-game browsing on PDAs and cellular phones. Proceedings of the SIGCHI conference on Human factors in computing systems, 2001: p. 213-220.
[5] Buyukkokten, O., H. Garcia-Molina, and A. Paepcke, Seeing the whole in parts: text summarization for web browsing on handheld devices. Proceedings of the 10th international conference on World Wide Web, 2001: p. 652-662.
[6] Cai, D., X. He, J.R. Wen, and W.Y. Ma, Block-level link analysis. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, 2004: p. 440-447.
[7] Cai, D., S. Yu, J.R. Wen, and W.Y. Ma, Block-based web search. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, 2004: p. 456-463.
[8] Cai, D., S. Yu, J.R. Wen, and W.Y. Ma, Extracting content structure for web pages based on visual representation. Proc. 5th Asia Pacific Web Conference, 2003.
[9] Chang, C.-H. and S.-C. Lui, IEPAD: information extraction based on pattern discovery. Proceedings of the 10th international conference on World Wide Web, 2001: p. 681-688.
[10] Fernandes, D., E.S.d. Moura, B. Ribeiro-Neto, A.S.d. Silva, and M.A. Gonçalves, Computing Block Importance for Searching on Web Sites, in CIKM. 2007.
[11] Gupta, S., G. Kaiser, D. Neistadt, and P. Grimm, DOM-based content extraction of HTML documents. Proceedings of the 12th international conference on World Wide Web, 2003: p. 207-214.
[12] Hearst, M.A., Multi-paragraph segmentation of expository text. Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994: p. 9-16.
[13] Kao, H.Y., M.S. Chen, S.H. Lin, and J.M. Ho, Entropy-based link analysis for mining web informative structures. Proceedings of the eleventh international conference on Information and knowledge management, 2002: p. 574-581.
[14] Lin, S.H. and J.M. Ho, Discovering informative content blocks from Web documents. Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, 2002: p. 588-593.
[15] Liu, B., R. Grossman, and Y. Zhai, Mining data records in Web pages. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, 2003: p. 601-606.
[16] Ponte, J.M. and W.B. Croft, Text Segmentation by Topic. Proceedings of the First European Conference on Research and Advanced Technology for Digital Libraries, 1997: p. 113-125.
[17] Salton, G., A. Singhal, C. Buckley, and M. Mitra, Automatic text decomposition using text segments and text themes. Proceedings of the seventh ACM conference on Hypertext, 1996: p. 53-65.
[18] Shannon, C.E., A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 2001. 5(1): p. 3-55.
[19] Song, R., H. Liu, J.R. Wen, and W.Y. Ma, Learning block importance models for web pages. Proceedings of the 13th international conference on World Wide Web, 2004: p. 203-211.
[20] Tseng, Y.F. and H.Y. Kao, The Mining and Extraction of Primary Informative Blocks and Data Objects from Systematic Web Pages. Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, 2006: p. 370-373.
[21] Vadrevu, S., F. Gelgi, and H. Davulcu, Semantic partitioning of web pages. The 6th International Conference on Web Information Systems Engineering (WISE), 2005.
[22] Wang, J. and F.H. Lochovsky, Data extraction and label assignment for web databases. Proceedings of the 12th international conference on World Wide Web, 2003: p. 187-196.
[23] Wong, W.-c. and A.W.-c. Fu, Finding structure and characteristic of web documents for classification. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), 2000.
[24] Yi, L. and B. Liu, Web Page Cleaning for Web Mining through Feature Weighting. Proceedings of Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), Aug, 2003: p. 9-15.
[25] Yi, L., B. Liu, and X. Li, Eliminating noisy information in Web pages for data mining. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, 2003: p. 296-305.
[26] Zhai, Y. and B. Liu, Web data extraction based on partial tree alignment. Proceedings of the 14th international conference on World Wide Web, 2005: p. 76-85.
[27] Zhao, H., W. Meng, Z. Wu, V. Raghavan, and C. Yu, Fully automatic wrapper generation for search engines. Proceedings of the 14th international conference on World Wide Web, 2005: p. 66-75.
|