臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

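Author: 張維元
Author (English): Chang, Wei Yuan
Thesis title: 探討空氣污染指標與癌症統計資料關聯之資料驅動分析框架
Thesis title (English): A Data-driven Framework on Correlating Air Pollution Indices and Cancer Statistics
Advisor: 陳良弼
Advisor (English): Arbee L.P. Chen
Degree: Master's
Institution: 國立清華大學 (National Tsing Hua University)
Department: 資訊工程學系 (Department of Computer Science)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2016
Academic year of graduation: 104
Language: English
Pages: 40
Keywords (Chinese): 空氣污染; 癌症統計; 健康醫療; 資料分析; 資料驅動; 資料即服務
Keywords (English): air pollution; cancer statistics; health care; data analysis; data driven; data as a service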
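According to the Global Health Risks report published by the World Health Organization, environmental issues are among the problems the world most urgently needs to solve. Air pollution in particular harms human health. In this study, we build an end-to-end analysis framework, from data collection to knowledge generation, to investigate the relationship between air pollution indices and cancer statistics. The framework consists of two parts: a data access flow and a data analysis flow. The data access flow improves the accessibility of raw (open) data and releases it through an API. During data processing, we use time and location information to map cancer statistics to the nearest air quality monitoring station, restricted to an effective range of influence. The data analysis flow follows a data-driven approach and uses data exploration and data mining methods to discover relationships in the data. The data exploration methods use statistical, clustering, and serialization analysis techniques to make an initial discovery of associations in the data. Classifiers are then introduced to further analyze the relationship between air pollution indices and cancer statistics. The experimental results show that different air pollution indicators have a significant influence on specific cancers. Evaluated against papers from the medical field, our findings are consistent with those of traditional statistical methods and can cover the results of several studies at once. Overall, besides air pollution and cancer data, the framework can also be applied to other datasets that have both spatial and temporal dimensions.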
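The mapping of cancer statistics to the nearest monitoring station within an effective range can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the station names, coordinates, pollutant values, the 20 km radius, and the helper names (`haversine_km`, `nearest_station`, `join_by_station_and_year`) are all hypothetical placeholders, and a simple linear scan stands in for whatever spatial search the thesis actually uses.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_station(lat, lon, stations, max_km):
    """Return (name, distance_km) of the closest station within max_km, else None."""
    best = None
    for name, (s_lat, s_lon) in stations.items():
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d <= max_km and (best is None or d < best[1]):
            best = (name, d)
    return best

def join_by_station_and_year(cancer_rows, stations, readings, max_km):
    """Attach each cancer record to the pollution reading of its nearest
    in-range station for the same year (location + time join)."""
    joined = []
    for row in cancer_rows:
        match = nearest_station(row["lat"], row["lon"], stations, max_km)
        if match is None:
            continue  # no station inside the effective range
        name, dist = match
        pollution = readings.get((name, row["year"]))
        if pollution is None:
            continue  # station has no reading for that year
        joined.append({**row, "station": name,
                       "distance_km": round(dist, 2), **pollution})
    return joined

# Placeholder data, invented purely for illustration.
STATIONS = {"station_a": (25.0, 121.5), "station_b": (22.6, 120.3)}
READINGS = {("station_a", 2010): {"pollutant_x": 50.0}}
CANCER_ROWS = [
    {"area": "area_1", "lat": 25.05, "lon": 121.52, "year": 2010, "cases": 12},
    {"area": "area_2", "lat": 23.80, "lon": 121.00, "year": 2010, "cases": 7},
]

joined = join_by_station_and_year(CANCER_ROWS, STATIONS, READINGS, max_km=20.0)
for record in joined:
    print(record["area"], record["station"], record["distance_km"])
```

Records with no station inside the radius are dropped rather than assigned to a distant station, which is one way to read the "effective range" restriction. For a large number of stations, a spatial index such as a k-d tree could replace the linear scan.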

According to the Global Health Risks report published by the WHO, environmental issues urgently need to be addressed worldwide. Air pollution in particular causes great damage to human health. In this work, we build an analysis framework for finding relationships between air pollution indices and cancer statistics. The framework consists of two parts: a data access flow and an analytics flow. The data access flow processes raw (open) data so that it can be accessed through APIs. We map the cancer statistics to the air pollution data from the nearest monitoring stations using time and location information. The analytics flow derives insights using data exploration and data mining methods. The exploration methods use statistics, clustering, and series mining techniques to interpret the data at hand. Classifiers are then applied to find the relationships between air quality and cancers, treating air pollution indices as features and cancer statistics as labels. The experiments show which air pollutants have a significant influence on specific cancers. In addition, the results are consistent with those obtained by traditional statistical methods, and they cover the findings of several separate studies. In summary, the framework is flexible and can be applied more broadly to other spatiotemporal data.
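The idea of treating air pollution indices as features and cancer statistics as labels can be illustrated with a very small classifier. The abstract does not say which classifiers the thesis uses, so the sketch below substitutes a one-threshold decision stump per pollutant and ranks pollutants by how well each one alone separates the labels. The pollutant names, the binary `high_incidence` label, and all values are hypothetical placeholders.

```python
def best_stump(values, labels):
    """Best accuracy of a single-threshold rule on one feature.

    Tries every midpoint between consecutive distinct values, in both
    polarities (high value -> 1, or high value -> 0).
    Returns (accuracy, threshold); (0.0, None) if the feature is constant.
    """
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    n = len(labels)
    best_acc, best_thr = 0.0, None
    for thr in thresholds:
        hits = sum((v > thr) == bool(y) for v, y in zip(values, labels))
        acc = max(hits, n - hits) / n  # take the better polarity
        if acc > best_acc:
            best_acc, best_thr = acc, thr
    return best_acc, best_thr

def rank_features(rows, feature_names, label_name):
    """Rank features by the training accuracy of their best stump."""
    scores = []
    for name in feature_names:
        acc, thr = best_stump([r[name] for r in rows],
                              [r[label_name] for r in rows])
        scores.append((name, acc, thr))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Placeholder rows: one per (area, year), invented purely for illustration.
ROWS = [
    {"pollutant_a": 10.0, "pollutant_b": 5.0, "high_incidence": 0},
    {"pollutant_a": 12.0, "pollutant_b": 9.0, "high_incidence": 0},
    {"pollutant_a": 30.0, "pollutant_b": 6.0, "high_incidence": 1},
    {"pollutant_a": 35.0, "pollutant_b": 8.0, "high_incidence": 1},
]

ranked = rank_features(ROWS, ["pollutant_a", "pollutant_b"], "high_incidence")
for name, acc, thr in ranked:
    print(name, acc, thr)
```

Training accuracy on four rows says nothing about real significance; in practice the ranking would be computed on held-out data, and a fuller classifier would replace the stump.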
Acknowledgement
Abstract
摘要 (Chinese abstract)
Contents
List of Figures
List of Tables
1. Introduction
2. Related Work
3. System Infrastructure and Analytic Flowchart
4. Data Access
4.1 Data-as-a-Service (DaaS)
4.2 Data Description
4.2.1 Air Quality Monitoring Data
4.2.2 Cancer Occurrence Statistical Data
4.3 Data Preprocessing
5. Data Exploration
5.1 Statistical Analysis
5.2 Row-wise Analysis
5.2.1 Clustering Method
5.2.2 Clustering Representation
5.3 Column-wise Analysis
5.3.1 Time Serialization
5.3.2 Correlation
5.3.3 Correlation Matrix
5.4 Results and Observations
5.4.1 Statistical Analysis
5.4.2 Row-wise Analysis
5.4.3 Column-wise Analysis
5.4.4 Observations
6. Data Mining
6.1 Classification
6.1.1 Classification
6.1.2 Interpretation
6.1.3 Application
6.2 Results and Evaluations
6.2.1 Results
6.2.2 Evaluations
7. Conclusion
8. References