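Graduate student (English name): CHEN, YU-ZHEN
Thesis title: Application of Spark's Social Network Analysis Tools to Trading Strategies
Thesis title (English): The Application of Spark and its Social Network Analysis in Trading Strategy
Advisor (English name): CHANG, YI-PING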
Keywords (English): Pair Trading; Spark; Social Network Analysis; Community Detection
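Pair trading, also known as spread trading, exploits the mean-reversion property of stationary time series: two stocks whose prices move in a highly correlated way are combined into a pair. When the value of the paired portfolio deviates from its historical mean, arbitrage is carried out through a combined trade that is long one stock and short the other. This thesis uses the 2016 daily adjusted closing prices of stocks listed in Taiwan, provided by the Taiwan Economic Journal (TEJ). The augmented Dickey-Fuller test, as modified by Said and Dickey (1984), is used to test whether an autoregressive model has a unit root, and the cointegration test of Engle and Granger (1987) is used to test whether any two stocks are cointegrated and can therefore be considered as a pair for pair trading. This yields a large number of stock pairs, which together form a large, complex network. In practical applications, the network formed by paired stocks may be even more complex and the data volume even larger. Because Spark provides effective tools for running algorithms on complex graphs, this thesis uses GraphFrames, Spark's system for social network analysis, and adopts the label propagation algorithm, which is efficient and quickly yields high-quality groupings, to perform community detection on this large, complex network. Finally, based on the communities found among the paired stocks, the thesis further analyzes the relationship between the 2016 community structure and individual stocks.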
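The pair-selection step described in the abstract can be illustrated with a minimal sketch of the two-step Engle-Granger idea: regress one price series on the other, then check the residual spread for a unit root. This is not the thesis code. It assumes synthetic prices, uses a plain Dickey-Fuller regression without the lagged-difference terms of the augmented test, and takes -3.34 as an approximate asymptotic 5% critical value for two series with a constant. All function and variable names are hypothetical.

```python
import random


def ols(y, x):
    """Fit y = a + b*x by least squares and return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def df_statistic(e):
    """t-statistic of rho in: e[t] - e[t-1] = rho * e[t-1] + u[t].

    Simplification: no constant and no lagged differences, so this is
    the plain Dickey-Fuller regression, not the augmented test.
    """
    lagged = e[:-1]
    diffs = [e[t] - e[t - 1] for t in range(1, len(e))]
    sll = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / sll
    resid = [d - rho * l for l, d in zip(lagged, diffs)]
    s2 = sum(r * r for r in resid) / (len(diffs) - 1)
    return rho / (s2 / sll) ** 0.5


def engle_granger_statistic(y, x):
    """Step 1: estimate the hedge ratio. Step 2: test the spread."""
    a, b = ols(y, x)
    spread = [yi - a - b * xi for xi, yi in zip(x, y)]
    return b, df_statistic(spread)


# Synthetic data (an assumption, not TEJ prices): x is a random walk and
# y tracks 2*x plus stationary noise, so the two series are cointegrated.
rng = random.Random(0)
x = [100.0]
for _ in range(299):
    x.append(x[-1] + rng.gauss(0, 1))
y = [5.0 + 2.0 * xi + rng.gauss(0, 0.5) for xi in x]

hedge_ratio, stat = engle_granger_statistic(y, x)

# Assumed approximate 5% critical value for the residual-based test.
CRITICAL_5PCT = -3.34
is_pair = stat < CRITICAL_5PCT
```

In practice a statistics library's cointegration test would replace the hand-written regression. The sketch only shows why a strongly negative statistic on the spread marks two stocks as a candidate pair.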
Pair trading, also called spread trading, is a strategy that pairs two stocks whose prices are highly correlated, based on the mean-reversion property of stationary time series. When the value of the paired portfolio deviates from its historical mean, the deviation can be arbitraged through a portfolio that holds a long position in one stock and a short position in the other. This thesis uses the 2016 daily adjusted closing prices of publicly traded companies, obtained from TEJ. It applies the augmented Dickey-Fuller test of Said and Dickey (1984) to test whether an autoregressive model has a unit root, and then the cointegration test of Engle and Granger (1987) to test whether any two stocks are cointegrated. Two stocks that are cointegrated are treated as a candidate pair. The many pairs obtained in this way constitute a large and complex network. Because Spark provides effective algorithms for handling complex graphs, the thesis uses its GraphFrames tool to analyze this network through social network analysis. Finally, the thesis applies the label propagation algorithm to detect communities in the network and then further analyzes the result.
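To make the community-detection step concrete, the following is a minimal plain-Python sketch of label propagation on a graph whose nodes are stocks and whose edges are cointegrated pairs. The thesis runs this step with GraphFrames on Spark (GraphFrames exposes it as `labelPropagation`). The version below is only an illustration that runs without Spark, and it may differ from GraphFrames in update order and tie-breaking. The tickers "A" to "F" and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def label_propagation(edges, max_iter=10):
    """Assign each node the most common label among its neighbours.

    Assumptions of this sketch: nodes are visited in sorted order,
    labels are updated in place, and ties go to the smallest label
    so that the result is deterministic.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Every node starts in its own community.
    labels = {node: node for node in neighbours}

    for _ in range(max_iter):
        changed = False
        for node in sorted(neighbours):
            counts = Counter(labels[n] for n in neighbours[node])
            best = max(counts.values())
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:  # stop early once no label changes
            break
    return labels


def communities(labels):
    """Group nodes by their final label, as sorted lists of nodes."""
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical cointegrated pairs: two groups of three stocks each.
pairs = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F")]

result = communities(label_propagation(pairs))
# result == [["A", "B", "C"], ["D", "E", "F"]]
```

Each resulting community is a set of stocks that are densely linked by cointegration relationships, which is the grouping the thesis then relates back to individual stocks.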
