National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

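Graduate student: Hung-Ta Hsu (許鴻達)
Thesis title: Study of Realization and Assessment of NoSQL Database Secondary Indexing
Thesis title (Chinese): NoSQL資料庫二級索引實現及評價之研究
Advisor: Bao-rong Chang (張保榮)
Degree: Master's
Institution: National University of Kaohsiung (國立高雄大學)
Department: Master's Program, Department of Computer Science and Information Engineering (資訊工程學系碩士班)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar)
Language: English
Number of pages: 80
Keywords (Chinese): Hadoop, HBase, Solr, 二級索引, 雲端運算, 巨量資料
Keywords (English): HBase, Solr, Secondary Indexing, Cloud Computing, Big Data, Hadoop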
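This study builds a secondary index by combining a NoSQL database with a search engine, and uses that index to perform reverse lookups on the contents of the database. Among Apache's many open-source projects, the Hadoop family is a top-level project. The study uses a distributed file system, a database, and a search engine. It retains the schemaless database's advantage of flexible, variable data types, while using the search engine to compensate for the schemaless database's lack of indexing and to speed up content searches. Users query through a relatively friendly interface, obtain the matching index entries, and then run a complete query against the database. This improves the overall applicability and convenience of working with a schemaless database.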
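The two-step lookup described in the abstract (find matching row keys in an index, then fetch the full rows from the database) can be sketched as follows. This is only an illustration under stated assumptions: a plain Python dictionary stands in for the HBase table and an in-memory inverted index stands in for Solr, and the names `primary_store`, `build_secondary_index`, and `query_by_value` are hypothetical and do not come from the thesis.

```python
from collections import defaultdict

# Hypothetical stand-in for an HBase table: row key -> {column: value}.
primary_store = {
    "row-001": {"name": "Alice", "city": "Kaohsiung"},
    "row-002": {"name": "Bob", "city": "Taipei"},
    "row-003": {"name": "Carol", "city": "Kaohsiung"},
}


def build_secondary_index(store, column):
    """Map each value of `column` to the row keys that hold it.

    This plays the role the abstract assigns to the search engine.
    """
    index = defaultdict(list)
    for row_key, columns in store.items():
        if column in columns:
            index[columns[column]].append(row_key)
    return index


def query_by_value(store, index, value):
    """Step 1: look up row keys in the index. Step 2: fetch full rows by key."""
    return [store[row_key] for row_key in index.get(value, [])]


city_index = build_secondary_index(primary_store, "city")
rows = query_by_value(primary_store, city_index, "Kaohsiung")
```

Without the index, answering "which rows have city = Kaohsiung" would require scanning every row; with it, the query touches only the index entry and the matching rows.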
Abstract (in Chinese)
ABSTRACT
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Background and Related Work
2.1 Hadoop
2.2 HBase and Hive
2.3 Solr
2.4 Cassandra
2.5 Solandra
2.6 Lily Project
2.7 Comparison
Chapter 3. Research Method
3.1 Deploy Hadoop and HBase Clusters
3.2 Data Transfer for Database
3.2.1 Operating System Installation
3.2.2 Data Transfer to Hadoop
3.2.3 Data Transfer to HBase
3.3 Apache Solr Installation
3.4 Data Transfer to Solr
3.5 Secondary Indexing and Applicable Tools
Chapter 4. Experimental Results and Discussion
4.1 Hardware/Software Specification and Experiment
4.2 Experimental Results
4.2.1 Time of Transferring Data from HDFS to HBase
4.2.2 Table Scan in Database
4.2.3 Time of Transferring Data from HBase to Solr
4.2.4 Query Performance of Secondary Indexing
4.2.5 Stress Test of HBase together with Solr
4.2.6 Power Dissipation Correlation
4.2.7 Performance and Cost Evaluation
Chapter 5. Conclusion
References
[1] S. Ghemawat, H. Gobioff, and S.T. Leung, “The Google file system,” in Proceedings of the nineteenth ACM symposium on Operating systems principles, Pages 29-43, 2003.
[2] D.W. McDonald, and M.S. Ackerman, “Expertise recommender: a flexible recommendation system and architecture,” in Proceedings of the ACM conference on Computer supported cooperative work, Pages 231-240, 2000.
[3] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Communications of the ACM - 50th Anniversary Issue, vol. 51, no. 1, Pages 107-113, 2008.
[4] S. Perera, and T. Gunarathne, Hadoop MapReduce Cookbook, Packt Publishing Co, Birmingham, UK, 2008.
[5] Wikipedia, Google GFS, Big-Table, Map-reduce, NoSQL Database, RDBMS, column-oriented definition.
[6] D.J. Abadi, P.A. Boncz, and S. Harizopoulos, “Column-oriented database systems,” Journal in Proceedings of the VLDB Endowment, Vol. 2 Issue 2, Pages 1664-1665, 2009.
[7] Wikipedia, Hadoop, HBase, Hive, Cassandra definition.
[8] R. Cattell, Scalable SQL and NoSQL data stores, Newsletter ACM SIGMOD Record, Vol. 39 Issue 4, Pages 12-27, 2010.
[9] J. Xie, S. Yin, X. Ruan, and Z. Ding, “Improving MapReduce performance through data placement in heterogeneous Hadoop clusters,” in Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW) IEEE International Symposium on, Pages 1-9, 2010.
[10] D. Borthakur, J. Gray, J.S. Sarma, K. Muthukkaruppan, N. Spiegelberg, H. Kuang, K. Ranganathan, D. Molkov, A. Menon, S. Rash, R. Schmidt, and A. Aiyer, “Apache hadoop goes realtime at Facebook,” in Proceedings of the ACM SIGMOD International Conference on Management of data, Pages 1071-1080, 2011.
[11] J. Lin, and A. Kolcz, “Large-scale machine learning at twitter,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, Pages 793-804, 2012.
[12] G. Juve, E. Deelman, G. B. Berriman, B. P. Berman, and P. Maechling, “An Evaluation of the Cost and Performance of Scientific Workflows on Amazon EC2,” in Journal of Grid Computing, Vol. 10, Issue 1, Pages 5-21, March 2012.
[13] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop Distributed File System,” in Mass Storage Systems and Technologies (MSST), IEEE 26th Symposium on, Pages 1-10, 2010.
[14] K. Kambatla, A. Pathak, H. Pucha, Towards Optimizing Hadoop Provisioning in the Cloud, Purdue University, USA.
[15] T. Koponen, and V. Hotti, “Open source software maintenance process framework,” in Proceedings of the fifth workshop on Open source software engineering, Pages 1-5, 2005.
[16] C. Franke, S. Morin, A. Chebotko, and J. Abraham, “Distributed Semantic Web Data Management in HBase and MySQL Cluster,” in Cloud Computing (CLOUD), IEEE International Conference on, Pages 105 - 112, 2011.
[17] N. Dimiduk, HBase in Action, Manning Publishing Co, NY, USA, 2012.
[18] Y. Jiang, HBase Administration Cookbook, Packt Publishing Co, Birmingham, UK, 2012.
[19] J. Bai, “Feasibility analysis of big log data real time search based on HBase and Elastic Search,” in Natural Computation (ICNC) Ninth International Conference on, Pages 1166-1170, 2013.
[20] Y. Xu, S. Hu, “QMapper: a tool for SQL optimization on hive using query rewriting,” in Companion Proceedings of the 22nd international conference on World Wide Web companion, Pages 211-212, 2013.
[21] R. Kuc, Apache Solr 4 Cookbook, Packt Publishing Co, Birmingham, UK, 2013.
[22] D. Carstoiu, A. Cernian, and A. Olteanu, “Hadoop Hbase-0.20.2 performance evaluation,” in New Trends in Information Science and Service Science (NISS) 4th International Conference on, Pages 84-87, 2010.
[23] K.Y. Cheng, Y.L. Pan, C.H. Wu, H.E. Yu, H.S. Chen, and W. Huang, “Ezilla Cloud Service with Cassandra Database for Sensor Observation System,” in World Academy of Science, Engineering & Technology, Issue 70, Pages 132, 2012.
[24] E. Capriolo, Cassandra High Performance Cookbook, Packt Publishing Co, Birmingham, UK, 2013.
[25] DataStax, Solandra Scaling Solr with Cassandra, USA, 2012.
[26] Chimpler, Playing with Apache Hive and SOLR, USA, 2013.
[27] A. S. John, HBase – Secondary Index, Huawei Technologies Co. Ltd., China, 2013.
[28] L. Minas, and B. Ellison, “Power Consumption in Servers,” In Dr. Dobb’s Journal, 2009.
[29] E. Capriolo, Cassandra High Performance Cookbook, Packt Publishing Co, Birmingham, UK, 2013.
[30] T. Grainger, and T. Potter, Solr in Action, Manning Publications, NY, USA, 2014.
Electronic full text: access is limited to the on-campus systems and IP range of the graduate student's university.