National Digital Library of Theses and Dissertations in Taiwan

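Author: 宋立淳
Author (English): Li-Chun Song
Thesis title: HSQL: 具高擴展性的雲端線上交易處理資料庫系統
Thesis title (English): HSQL: A Highly Scalable Cloud Database for OLTP Query Processing
Advisor: 劉邦鋒
Advisor (English): Pangfeng Liu
Oral defense committee member: 洪士灝
Oral defense committee member (English): Shih-Hao Hung
Oral defense date: 2013-07-30
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Number of pages: 27
Keywords (Chinese): 雲端運算; SQL在HBase; OLTP資料處理; 行索引; 平行指令處理
Keywords (English): Cloud Computing; SQL on HBase; OLTP Data Processing; Column Indexing; Parallel Query Processing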
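Traditional relational database systems run into scalability problems when faced with massive amounts of data. Many NoSQL databases have been developed to process such data efficiently. However, many features, such as an SQL interface, multi-row transactions, and secondary indexes, are not supported in NoSQL databases. Providing these features in NoSQL databases has therefore become an important research topic.

In this thesis, we develop HSQL, a highly scalable OLTP database system. HSQL uses HBase as its underlying distributed storage system and therefore has the high scalability of HBase. To handle OLTP workloads, we propose a novel method for implementing multi-row transactions on HBase. We also design a distributed secondary index on HBase. Our experimental results show that our database system performs better than MySQL when processing massive amounts of data.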

With ever-increasing amounts of data, traditional relational database management systems may suffer from limited scalability. To process large amounts of data, many NoSQL databases have been developed. However, many features, such as an SQL interface, multi-row transactions, and secondary index support, are unavailable in NoSQL databases. Providing these features for NoSQL databases has become an important research issue.

In this thesis, we develop HSQL, a highly scalable database for OLTP applications. HSQL uses HBase as the underlying distributed data store and inherits the scalability of HBase. To process OLTP workloads, we propose a novel approach to support multi-row transactions on HBase. In addition, we devise a distributed secondary index scheme for HBase. The experimental results show that HSQL performs better than MySQL on large-scale data.
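
For readers unfamiliar with the term, the following is a minimal sketch of what a secondary index over a key-ordered store does in general. It is not HSQL's implementation: the abstract does not describe the index layout, and everything here (the class name `ToyStore`, its methods, and the `(column value, row key)` entry layout) is a hypothetical illustration, with an in-memory sorted list standing in for an HBase table.

```python
import bisect


class ToyStore:
    """Hypothetical in-memory stand-in for a key-ordered store with one indexed column."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}    # row key -> dict of column values
        self.index = []   # sorted list of (column value, row key) entries

    def put(self, row_key, columns):
        # Drop the stale index entry if the row is being overwritten.
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            entry = (old[self.indexed_column], row_key)
            pos = bisect.bisect_left(self.index, entry)
            if pos < len(self.index) and self.index[pos] == entry:
                self.index.pop(pos)
        self.rows[row_key] = dict(columns)
        if self.indexed_column in columns:
            bisect.insort(self.index, (columns[self.indexed_column], row_key))

    def lookup(self, value):
        """Return row keys whose indexed column equals value, without scanning all rows."""
        # (value,) sorts before every (value, row_key) entry, so this finds the first match.
        start = bisect.bisect_left(self.index, (value,))
        keys = []
        for col_value, row_key in self.index[start:]:
            if col_value != value:
                break
            keys.append(row_key)
        return keys


store = ToyStore(indexed_column="last_name")
store.put("c1", {"last_name": "Chen", "city": "Taipei"})
store.put("c2", {"last_name": "Lin", "city": "Tainan"})
store.put("c3", {"last_name": "Chen", "city": "Hsinchu"})
print(store.lookup("Chen"))  # prints ['c1', 'c3']
```

The point of the sketch is only that a query on a non-key column becomes a short range read over sorted index entries instead of a full table scan. A distributed index, as the thesis proposes, would additionally have to partition these entries across servers and keep them consistent with the base table under concurrent transactions.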

Acknowledgement
Chinese Abstract
Abstract
1 Introduction
2 Related Work
2.1 Secondary Index Schemes on HBase
2.2 Transaction Management Systems on Distributed Data Stores
2.3 NoSQL Data Management Systems with SQL-like Interface
3 Overview of HBase
4 System Architecture of HSQL
4.1 System Components
4.1.1 HSQL Table Manager
4.1.2 HSQL Transaction
4.1.3 Local Transaction Manager
4.1.4 Secondary Index
4.1.5 HBase
4.2 System Workflow
5 Transaction Management
6 Distributed B-Tree Column Indexing
7 Experiment Results
7.1 TPC-C Benchmark
7.2 Experiment Environment
7.3 Experiment Results
7.3.1 Performance on Different Database Sizes
7.3.2 Performance on Different Numbers of Concurrent Users
7.3.3 Speed Up of Secondary Index
8 Conclusion

