Author: Han-Fang Cheng
Thesis title: Trend Analysis of Theranostics in Big Data Derived from the National Health Insurance Research Database
Advisor: Jau-Min Wong
Oral defense committee member: Chung-Ming Chen
Keywords: Big Data; NoSQL database; MongoDB; Shard Key
Unlike traditional relational databases, which rely on JOIN operations to query across tables, non-relational (NoSQL, Not Only SQL) databases offer schema-free data storage and a sharding mechanism, which makes them well suited to handling big data. Sharding partitions the data into small chunks according to the configured shard key, which speeds up queries. Even so, to our knowledge there is not yet a mature and systematic approach to managing (including retrieving and visualizing) the big data derived from the National Health Insurance Research Database. This study therefore stores the health insurance data in a document-oriented format, with the records collated per patient, and examines the field attributes provided by the database to identify 12 fields that are important for diagnosis and treatment. These fields are selected as shard keys for sharding, and targeted queries are executed against them to improve retrieval efficiency. A query-time performance test on an example case shows that the proposed data processing method substantially reduces the query time on the big health insurance data.
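The ideas in the abstract (per-patient documents, sharding by a shard key, and targeted queries) can be illustrated with a minimal, self-contained Python sketch. It is not the thesis implementation and does not use MongoDB: the class `RangeShardedCollection`, the function `to_patient_documents`, the field names (`patient_id`, `birth_year`, `dx`, `drug`), the sample rows, and the split points are all hypothetical. For simplicity the sketch shards on `patient_id`, whereas the thesis selects 12 diagnosis and treatment fields that the abstract does not enumerate.

```python
from bisect import bisect_right


def to_patient_documents(rows):
    """Fold flat claim rows into one document per patient (hypothetical schema)."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(
            row["patient_id"],
            {
                "patient_id": row["patient_id"],
                "birth_year": row["birth_year"],
                "visits": [],
            },
        )
        doc["visits"].append({"dx": row["dx"], "drug": row["drug"]})
    return list(docs.values())


class RangeShardedCollection:
    """Toy range-sharded store: split points partition the shard-key space."""

    def __init__(self, shard_key, split_points):
        self.shard_key = shard_key
        self.split_points = sorted(split_points)
        # N split points define N + 1 key ranges, one list of documents each.
        self.shards = [[] for _ in range(len(self.split_points) + 1)]

    def _shard_for(self, key_value):
        # Index of the range that contains key_value.
        return bisect_right(self.split_points, key_value)

    def insert(self, doc):
        self.shards[self._shard_for(doc[self.shard_key])].append(doc)

    def find(self, query):
        """Return (matching documents, number of shards scanned)."""
        if self.shard_key in query:
            # Targeted query: the shard key value identifies a single shard.
            targets = [self._shard_for(query[self.shard_key])]
        else:
            # No shard key in the query: every shard has to be scanned.
            targets = list(range(len(self.shards)))
        matches = [
            doc
            for i in targets
            for doc in self.shards[i]
            if all(doc.get(k) == v for k, v in query.items())
        ]
        return matches, len(targets)


# Hypothetical sample rows, invented purely for illustration.
rows = [
    {"patient_id": "P001", "birth_year": 1950, "dx": "250", "drug": "metformin"},
    {"patient_id": "P001", "birth_year": 1950, "dx": "401", "drug": "amlodipine"},
    {"patient_id": "P150", "birth_year": 1962, "dx": "250", "drug": "sitagliptin"},
    {"patient_id": "P250", "birth_year": 1950, "dx": "577", "drug": "metformin"},
]

docs = to_patient_documents(rows)
coll = RangeShardedCollection("patient_id", ["P100", "P200"])
for d in docs:
    coll.insert(d)

targeted, targeted_scanned = coll.find({"patient_id": "P150"})
broadcast, broadcast_scanned = coll.find({"birth_year": 1950})
print(targeted_scanned, broadcast_scanned)
```

In this sketch the query that includes the shard key scans one shard, while the query on `birth_year` has to scan all three. That difference in shards visited is the effect a targeted query relies on to shorten retrieval time.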

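Acknowledgements
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Process
Chapter 2 Literature Review
2.1 NoSQL Databases
2.2 MongoDB
2.2.1 BSON Format
2.3 Related Research Based on Health Insurance Data
Chapter 3 Research Design
3.1 System Architecture
3.2 Research Materials
3.3 Research Methods
3.3.1 Data Reformulation
3.3.2 Data Preprocessing
3.3.3 Storage Format for Big Health Insurance Data
3.3.4 Data Sharding
3.3.5 Choosing a Suitable Shard Key
3.3.6 Targeted Query
Chapter 4 System Implementation and Demonstration
4.1 System Implementation Environment
4.2 System Functions
4.3 Query Performance Testing
4.4 System Demonstration
4.4.1 Scenario Description
4.4.2 Scenario Search
Chapter 5 Results and Discussion
References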

