National Digital Library of Theses and Dissertations in Taiwan

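Graduate student: 張國斌
Graduate student (romanized name): CHEONG KUOK PAN
Thesis title: 透過POI的過期驗證以持續維護POI資料庫
Thesis title (English): Sustainable POI Database Maintenance via Outdated POI Verification
Advisor: 張嘉惠
Degree: Master's
Institution: National Central University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (ROC calendar)
Language: Chinese
Number of pages: 37
Keywords (Chinese): 基於位置的服務; 興趣點; 監督式學習
Keywords (English): Location-based service; Point of interest; Supervised learning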
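As smart mobile devices spread rapidly, services for looking up POI (Point of Interest) information such as stores and places have become an everyday need, and such services depend on a large POI database. Over time, the POI data in these databases is no longer guaranteed to be current, and a user who receives wrong information wastes valuable time. Keeping a POI database up to date has therefore become a key problem. We aim to provide correct POI information by continuously updating the database and identifying POIs that have ceased operation.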
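Because a POI database sourced from the yellow pages is too large, verifying and updating it manually is impractical. The government, however, publishes a large amount of open data that is jointly maintained by many businesses, and two of these datasets are usable for our purpose: 「全國營業(稅籍)登記資料集」 (the national business (tax) registration dataset) and 「公司解散登記清冊」 (the register of dissolved companies). The format of these open datasets differs from that of a typical POI database, so they must be handled with care. In addition, the web offers a wealth of data. We use web information such as page update dates and the volume of online mentions to train a verification model that detects POIs in the database that may be outdated.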
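In this thesis, our system aims to detect outdated POIs in the database within a feasible amount of time. The method has two parts. The first uses government open data: it finds the POIs that appear in both the POI database and the open data and updates their status directly. The second uses web information to train an outdated-POI verification model that detects POIs in the database that are already outdated. Experiments show that using Google Maps information, the time elapsed since the POI was last mentioned, whether the POI still appears on its official website, and words that describe a POI as outdated achieves an F-measure of 0.758; combining features raises the F-measure to 0.91, an improvement of 0.201 over the model of Chuang et al.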
With the increasing use of mobile phones, searching for POI (Point of Interest) information, such as stores and addresses, has become part of people's daily lives. Providing such services requires a massive POI database. However, the POI information in such a database may change over time, and it is frustrating for users to receive wrong information. Keeping the POI database up to date by continuously identifying outdated POIs and updating the database has therefore become a key issue.
As the POI database grows, verifying the data manually becomes impractical. The government, however, publishes open data on businesses, e.g. “全國營業(稅籍)登記資料集” and “公司解散登記清冊”. This data must be used carefully, since the format of the open datasets differs from that of the general POI databases used in many services. On the other hand, rich information is readily available on the web. Using web information such as the date a web page was last updated and the volume of mentions of a POI on the web, we can train a verification model to detect POIs in the database that may be outdated.
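One way to picture the format mismatch between an open dataset and a POI database is a name-normalization step followed by a lookup. The Python sketch below is only an illustration under stated assumptions: the record fields (`name`, `status`), the suffix list, and the choice to match on names alone are hypothetical and are not taken from the thesis, which may also rely on addresses or other keys.

```python
import unicodedata

# Hypothetical list of company suffixes to ignore when comparing names.
# The longer suffix comes first so it is stripped in preference to the shorter one.
COMPANY_SUFFIXES = ("股份有限公司", "有限公司")


def normalize_name(name: str) -> str:
    """Fold full-width characters, remove whitespace, lowercase, drop a company suffix."""
    text = unicodedata.normalize("NFKC", name)
    text = "".join(text.split()).lower()
    for suffix in COMPANY_SUFFIXES:
        if text.endswith(suffix):
            text = text[: -len(suffix)]
            break
    return text


def mark_closed(pois, dissolved_names):
    """Return copies of the POI records, marking those found in the dissolved list."""
    closed_keys = {normalize_name(n) for n in dissolved_names}
    updated = []
    for poi in pois:
        record = dict(poi)
        if normalize_name(poi["name"]) in closed_keys:
            record["status"] = "closed"
        else:
            # Keep any existing status; otherwise leave the POI unverified.
            record.setdefault("status", "unknown")
        updated.append(record)
    return updated


if __name__ == "__main__":
    sample_pois = [{"name": "ＡＢＣ　有限公司"}, {"name": "Good Cafe"}]
    for row in mark_closed(sample_pois, ["abc股份有限公司"]):
        print(row)
```

POIs that find no match in the open data stay unverified in this sketch, which is where a model trained on web information would take over.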
In this thesis, our goal is to detect outdated POIs in the database within a feasible time. The approach is divided into two parts. The first part uses open government data. The second part uses web information to train a model that detects outdated POIs in the database. Experiments show that our approach achieves an F-measure of 0.758 using Google Maps information, the time elapsed between today and the most recent publishing date, whether the POI appears on its official website, and words that describe a POI as outdated. The best performance, an F-measure of 0.91, is reached by feature combination, which is 0.201 higher than the model of Chuang et al.
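For readers unfamiliar with the metric, the sketch below shows how an F-measure can be computed from true and predicted labels. It assumes the balanced F1 form (the harmonic mean of precision and recall) with "outdated" as the positive class; the abstract does not state which variant was used, and the labels in the example are made up for illustration, not thesis data.

```python
def f_measure(y_true, y_pred, positive=1):
    """Balanced F-measure (F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical labels: 1 = outdated POI, 0 = still operating.
    truth = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 1, 0]
    print(round(f_measure(truth, predicted), 3))
```

Because the score combines precision and recall, a detector that flags every POI as outdated cannot score well by recall alone.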
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
2. Related Work
2.1 POI Data Matching
2.2 Methods for Detecting Outdated POIs
3. System Architecture and Methods
3.1 Use of Government Open Data
3.2 Building the POI Verification Model
4. Experiments
4.1 Datasets
4.2 Evaluation
5. Conclusion
6. Future Work
References
[1] Chuang, H. M., & Chang, C. H. (2015, May). Verification of POI and location pairs via weakly labeled web data. In Proceedings of the 24th International Conference on World Wide Web (pp. 743-748).
[2] Al-Bahadili, H., Qtishat, H., & Naoum, R. S. (2013). Speeding up the web crawling process on a multi-core processor using virtualization. International Journal on Web Service Computing, 4(1), 19.
[3] Chuang, H. M., Chang, C. H., & Kao, T. Y. (2014, September). Effective web crawling for Chinese addresses and associated information. In International Conference on Electronic Commerce and Web Technologies (pp. 13-25). Springer International Publishing.
[4] Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992, July). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (pp. 144-152). ACM.
[5] Chuang, H. M., & Chang, C. H. (2016). POI Extraction and Relation Verification from the Web [Chuang, NCU, PhD thesis].
[6] Platt, J. (1998). Sequential minimal optimization: A fast algorithm for training support vector machines.
[7] Tran, T., & Cao, T. H. (2013). Automatic detection of outdated information in Wikipedia infoboxes. Research in Computing Science, 70, 211-222.
[8] Hu, Y., Janowicz, K., & Prasad, S. (2014, November). Improving Wikipedia-based place name disambiguation in short texts using structured data from DBpedia. In Proceedings of the 8th Workshop on Geographic Information Retrieval (p. 8). ACM.
[9] Lin, Y. Y., & Chang, C. H. (2014). Store Name Extraction and Name-Address Matching for Geographic Information Retrieval [Lin, NCU, master's thesis].