跳到主要內容

臺灣博碩士論文加值系統

(3.231.230.177) 您好!臺灣時間:2021/07/27 12:33
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:黃柏傑
研究生(外文):Pai-Chieh Huang
論文名稱:以群集式基因演算法為基礎的時間序列切割法與型樣之發掘
論文名稱(外文):A Cluster-based Genetic Approach for Segmentation of Time Series and Pattern Discovery
指導教授:曾新穆曾新穆引用關係
指導教授(外文):Shin-Mu Tseng
學位類別:碩士
校院名稱:國立成功大學
系所名稱:資訊工程學系碩博士班
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2008
畢業學年度:96
語文別:中文
論文頁數:68
中文關鍵詞:時間序列切割遺傳演算法離散小波轉換叢集
外文關鍵詞:time series segmentationgenetic algorithmclusteringdiscrete wavelet transformation
相關次數:
  • 被引用被引用:2
  • 點閱點閱:228
  • 評分評分:
  • 下載下載:58
  • 收藏至我的研究室書目清單書目收藏:0
時間序列資料指每隔一定時間將所觀察的目標記錄而成的資料。時間序列之應用層面廣泛,如經濟中的股市交易日收盤價、醫學常用的各種生理訊號、多媒體的影音資料等。因此,時間序列之研究更顯的有趣而重要;針對時間序列資料,現有異常偵測、象徵符號代表方法、相似度測量比對、維度化簡、切割等相關研究。本論文對於切割這個研究方向,結合叢集技術、離散小波轉換與遺傳演算法得以將時間序列自動切割及型樣發掘,進行時間序列切割法的改良,並降低之前的方法在利用離散小波轉換所帶來失真之問題與減少發掘出較無意義型樣。本方法首先將時間序列切割的位置編成染色體,其中兩個基因間所代表的為一線段(segment)。在評估函數中,先對染色體所切割形成的線段利用k-means叢集技術進行分群,之後搭配離散小波轉換計算該切割結果之相似程度與計算該切割結果所形成之合適度(suitability)進行適合度的計算。其中染色體的合適度是用來降低單線段成為一種型樣之機會與離散小波轉換所造成失真問題。因此,本論文所提的方法不僅能有同時進行時間序列切割與自動搜尋型樣之優點,而能更進一步提升型樣之可靠性與減少壓縮失真所帶來的誤差。為了驗證本方法之可行性,我們使用台灣股市中的鋼鐵類股與半導體類股收盤價之時間序列資料進行實驗。實驗結果顯示,本論文所提之方法,能從時間序列中有效的自動切割出已定義之型樣與未被定義之型樣的線段。
A time series is composed of lots of data points, each of which represents a value at a certain time. Many phenomena can be represented by time series, such as electrocardiograms in medical science, gene expressions in biology and video data in multimedia. Time series have thus been an important and interesting research field due to their frequent appearance in different applications. It is related to many research topics, including anomaly detection, similarity measurement, dimension reduction and segmentation, among others. In this thesis, we proposed a time series segmentation approach by combining the clustering technique, the discrete wavelet transformation and the genetic algorithm to automatically find segments and patterns from a time series and reduce the raised problems in previous approach. The first one is that it may cause distortion of segments when using the discrete wavelet transformation (DWT) to adjust the length of the subsequences. The second one is that if a group contains only one segment then it may result in a less meaningful pattern. The proposed approach first divides the segments in a chromosome into k groups according to their slopes by using clustering techniques. In order to deal with these problems, two factors, namely the density factor and the distortion factor, are used to solve them. The distortion factor is used to avoid the distortion of the segments and the density factor is used to avoid generation of meaningless patterns. The fitness value of a chromosome is then evaluated by the distances of segments and these two factors. Experimental results on real financial datasets in Taiwan also show the effectiveness of the proposed approach.
英文摘要 I
中文摘要 II
誌謝 III
目錄 IV
表目錄 VI
圖目錄 VII
第1章 簡介 1
1.1 研究背景 1
1.2 研究動機 2
1.3 問題描述 3
1.4 研究方法 4
1.5 論文架構 5
第2章 文獻探討 6
2.1 時間序列切割法 6
2.1.1 迴歸式切割法 6
2.1.2 型樣式切割法 9
2.1.3 其他切割法 12
2.2 叢集方法 12
2.3 相似度量測方法 15
2.3.1 距離量測 15
2.3.2 相關係數 16
2.4 小波轉換 17
2.5 遺傳演算法 18
2.5.1 遺傳演算法的意涵 19
2.5.2 遺傳演算法的架構 21
第3章 研究方法 23
3.1 群集式遺傳演算法為基礎的時間序列切割法架構 23
3.2 遺傳演算法之設置 24
3.2.1 染色體表示法 25
3.2.2 初始化母體 26
3.2.3 線段分群 27
3.2.4 評估函數與選擇運算子 28
3.2.5 基因運算子 32
3.3 所提之演算法 34
3.4 範例 35
第4章 實驗分析 38
4.1 實驗資料說明 38
4.2 實驗設計 39
4.3 實驗結果 40
4.3.1 時間序列切割結果觀察 40
4.3.1.1 評估值探討 41
4.3.1.2 切割結果探討 43
4.3.2 參數值設定的影響 45
4.3.2.1 線段斜率表示數(numSlopes) 46
4.3.2.2 平均初始區間長度(InitialDesired) 48
4.3.2.3 密度評估因子與失真度評估因子權重 51
4.3.3 類股的型樣分析 54
4.3.3.1 CAST(Cluster Affinity Search Technique)之前置作業 55
4.3.3.2 CAST之參數設置與結果 56
4.4 實驗總結 61
第5章 結論與未來研究方向 62
5.1 結論 62
5.2 應用與研究方向 63
參考文獻 64
作者自述 68
[1]J. Aach and G. Church, “Aligning gene expression time series with time warping algorithms,” Bioinformatics, Vol. 17, pp 495-508, 2001.
[2]B. D. Amir, R. Shamir & Z. Yakhini,Clustering gene expression patterns, Journal of Computational Biology, 1999
[3]R. Bellman, "On the approximation of curves by line segments using dynamic programming," Communications of the ACM, Vol. 4, No. 6, pp. 284, 1961.
[4]J. R. Chen, “Making subsequence time series clustering meaningful,” The IEEE International Conference on Data Mining, pp. 114-121, 2005.
[5]P. Cohen and N. Adams, ”Unsupervised Segmentation of Categorical Time Series into Episodes,” Proceeding of IEEE International Conference on Data Mining, pp. 99-106, 2002.
[6]F. L. Chung, T. C. Fu, R. Luk and V. Ng, “Evolutionary time series segmentation for stock data mining,” The IEEE International Conference on Data Mining, pp. 83 - 90, 2002.
[7]F. L. Chung, T. C. Fu, V. Ng and R. W. P. Luk, “An evolutionary approach to pattern-based time series segmentation,” IEEE Transactions on Evolutionary Computation, Vol. 8, No. 5, pp. 471- 489, 2004.
[8]S.R. Duncan and G. F. Bryant,” A new algorithm for segmenting data from time series,” Proc. 35th IEEE Conf. Decsion Control, 1996.
[9]S. Erdal, O. Ozturk, D. Armbruster, H. Ferhatosmanoglu and W.C. Ray, “A time series analysis of microarray data,” The IEEE Symposium on Bioinformatics and Bioengineering, pp. 366-375, 2004.
[10]T. C. Fu, F. L. Chung, V. Ng and R. Luk, “Evolutionary segmentation of financial time series into subsequences,” The Congress on Evolutionary Computation, Vol. 1, pp. 426 - 430, 2001.
[11]L. Feng, K. Ju and K. H. Chon, "A method for segmentation of switching dynamic modes in time series," IEEE Transactions on Systems, Man and Cybernetics, Part B, Vol. 35, No. 5, pp. 1058-1064, 2005.
[12]C. L. Fancoua and J. C. Principe, “A Neighborhood Map of Competing One Step Predictors for Piecewise Segmentation and Identification of Time Series,” IEEE Internal Conference on Neural Networks, Vol. 4, pp.1906-1911, 1996.
[13]V. Guralnik and J. Srivastava, "Event detection from time series data," Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 33-42, 1999.
[14]J. Han, G. Dong and Y. Yin, “Efficient Mining of Partial Periodic Patterns in Time Series Database,” The Internal Conference on Data Engineering, pp. 106-115, 1999.
[15]J. Himberg, K. Korpiaho, H. Mannila,J. Tikanmaki and H.T.T. Toivonen, "Time series segmentation for context recognition in mobile devices," The IEEE International Conference on Data Mining, pp. 203-210, 2001.
[16]Y. W. Huang and P. S. Yu, “Adaptive Query Processing for Time-Series Data,” in Proceeding of the 5th Int’l Conference on Knowledge Discovery and Data Mining, pp. 282-286, San Diego, CA, Aug 15-18, 1999.
[17]Jian Jiang, ZHE Zhang, and Huaiqing Wang, “A New Segmentation Algorithm to Stock Time Series based on PIP Approach,” Wireless Communications, Networking and Mobile Computing, 2007. WiCom 2007. International Conference
[18]Anil K. Jain and Richard C. Dubes, ”Algorithms for Clustering Data,” Prentice Hall, 1988.
[19]E. Keogh, “http://www.cs.ucr.edu/~eamonn/discords/”, 2005.
[20]E. Keogh, S. Chu, D. Hart and M. Pazzani, “An online algorithm for segmenting time series,” The IEEE Internal Conference Data Mining, pp. 289-296, 2001.
[21]E. Keogh, J. Lin and A. Fu, “Hot SAX: efficiently finding the most unusual time series subsequence,” The IEEE International Conference on Data Mining, pp. 27-30, 2005.
[22]E. Keogh, J. Lin and W. Truppel, “Clustering of time series subsequences is meaningless: implications for previous and future research,” The IEEE International Conference on Data Mining, pp. 115 – 122, 2003.
[23]A. Lendasse, J. Lee, E. de Bodt, V. Wertz and M. Verleysen, "Dimension reduction of technical indicators for the prediction of financial time series - Application to the BEL20 Market Index," European Journal of Economic and Social Systems, Vol. 15, No. 2, pp. 31-48, 2001.
[24]J. B. McQueen, “Some methods of classification and analysis of mutivariate observations,” The Symposium on Mathematical Satistics and Probability, 1967, pp. 281-297.
[25]Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, 1994.
[26]J. J. Oliver, R. A. Baxter and C. S. Wallace, “Minimum Message Length Segmentation,” The. Pacific-Asia Conference Knowledge Discovery Data Mining, pp. 222-233, 1998.
[27]W. Penny and S. Roberts, “Dynamic models for nonstationary signal segmentation,” Computers and Biomedical Research, Vol. 32, No. 6, pp. 483-502, 1999.
[28]Donald B. Percival and Andrew T. Walden. “Wavelet Methods for Time Series Analysis,” Published by Cambridge University Press,24 Jul 2000.
[29]V. S. Tseng, C. H. Chen, C. H. Chen and T. P. Hong, “Segmentation of time series by the clustering and genetic algorithms,” The Workshop on Foundation of Data Mining and Novel Techniques in High Dimensional Structural and Unstructured Data in The Fifth IEEE International Conference on Data Mining, pp. 443-447, 2006.
[30]J. P. C. Valente and I. L. Chavarrias, “Discovering similar patterns in time series,” Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 497-505, 2000.
[31]X. Y. Wang and Z. O. Wang, “A Structure-Adaptive Piece-Wise Linear Segments Representation for Time Series,” Proceeding of IEEE International Conference on Information Reuse and Integration, pp. 433 - 437, 2004.
[32]http://tw.stock.yahoo.com/
連結至畢業學校之論文網頁點我開啟連結
註: 此連結為研究生畢業學校所提供,不一定有電子全文可供下載,若連結有誤,請點選上方之〝勘誤回報〞功能,我們會盡快修正,謝謝!
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊