跳到主要內容

臺灣博碩士論文加值系統

(2600:1f28:365:80b0:8005:376a:2d98:48cd) 您好!臺灣時間:2025/01/18 09:01
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:朱洵
研究生(外文):Hsun Chu
論文名稱:使用GPGPU改善禁忌搜尋法解決工作排程問題之研究
論文名稱(外文):Using GPGPU to Improve the Tabu Search for the Permutation Flowshop Scheduling Problem
指導教授:伍朝欽伍朝欽引用關係
指導教授(外文):Chao-Chin Wu
學位類別:碩士
校院名稱:國立彰化師範大學
系所名稱:資訊工程學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
畢業學年度:103
語文別:中文
論文頁數:57
中文關鍵詞:禁忌搜尋法靜態確定性之序列流線型工廠排程問題排程時間GPGPUCUDA
外文關鍵詞:Tabu SearchPermutation flowshop scheduling Problemflowshop scheduling ProblemGPGPUCUDA
相關次數:
  • 被引用被引用:4
  • 點閱點閱:411
  • 評分評分:
  • 下載下載:58
  • 收藏至我的研究室書目清單書目收藏:1
本篇論文的研究主要是使用通用圖形處理器(General-purpose computing on graphics processing units, GPGPU),設計演算法的平行方式來改善禁忌搜尋演算法(Tabu Search)解決靜態確定性之序列流線型工廠排程問題(Permutation Flowshop Scheduling Problem, PFSP)。PFSP主要是要找出哪一種工作排列,也就是工作在所有機器的處理順序,可以得到最短的處理時間,讓工廠的生產線擁有高效率的產能。並且利用禁忌搜尋法有效地跳脫局部最佳解,而更快地找到全域最佳解。而禁忌搜尋法解決PFSP的特性是一種大型資料量的運算工作,十分適合GPGPU的多核心運算架構,在GPGPU上執行能大幅地縮短執行時間。本研究根據Michał Czapiński等人所提出的方法進行改善,主要著重於優化GPGPU的記憶體使用,將問題內的資料,進行化簡,減少各個執行緒讀取全域記憶體的次數,藉此達到提升效能之目的。根據實驗的結果顯示,其效能是有達到部分的改善,藉此證明此實驗為有效果的。比起Michał Czapiński以及Stuart Barnes所提出的方法,在不同種的輸入條件下最好可以提升100%效能。因此,藉由GPGPU,配合不同演算法的性質,來產生比在一般處理器上執行演算法更高的效能,處理更大量的數據資料,在面對未來大數據的時代,能有更好更快速的資料處理方式。
The purpose of this study aims at efficiently solving the Permutation Flowshop Scheduling Problem (PFSP), on the emerging General-Purpose computing on Graphics Procesing Units (GPGPU). PFSP is to find a job permutation to shorten the whole processing time of all the jobs on multiple ordered machines of different functions. Because PFSP is an NP-hard problem, it is hard to have the optimal solution within a reasonable time interval. Therefore, previous projects proposed to use Tabu search, one kind of metaheuristic algorithm, to find a nearly optimal solution for PFSP in much shorter execution time.
Recently, Michal Czapinski et al. proposed a method to solve PFSP by Tabu search on GPU. They stored all the permutations, required by one iteration of Tabu search, on global memory. Although their method could access global memory coalescingly without any branch divergence, they required large global memory space and high number of global memory accesses.
To address these problem, we propose a novel method to reduce the memory space as well as the amount of global memory access. The best performance improvement we could obtain is about 100%, compared with Czapinski’s method.

中文摘要 I
英文摘要 II
致謝 III
圖目錄 V
表目錄 VII
第一章 緒論 1
第二章 相關資訊 5
第一節 CUDA架構 5
第二節 記憶體架構特性 7
第三節 靜態確定性之序列流線型工廠排程問題 12
第四節 禁忌搜尋法 15
第五節 禁忌搜尋法解 PFSP問題 17
第六節 在GPGPU上使用禁忌搜尋法解PFSP問題 18
第七節 其它相關研究 24
第三章 改善方式 27
第一節 避免BRANCH DIVERGENCE的發生 27
第二節 改善架構推導 31
第三節 主要架構方法 38
第四節 架構驗證與還原 42
第五節 演算法特性 46
第四章 實驗結果與討論 48
第一節 工作數量的影響 49
第二節 機器數量的影響 50
第五章 結論與未來展望 53
參考文獻 54

圖目錄
圖一.簡單的GPGPU架構 6
圖二. CUDA的記憶體層次模型圖 6
圖三. Warp執行在CUDA中的架構 7
圖四. Thread存取全域記憶體的架構圖 8
圖五. 共享記憶體與bank的架構 9
圖六. 共享記憶體傳送資料的架構 9
圖七. Thread存取紋理記憶體的架構圖 10
圖八. 常數記憶體在GPU中的架構 10
圖九. Thread存取常數記憶體的架構圖 11
圖十. 每個工作在每台機器上實例的處理時間 13
圖十一. PFPS處理時間的流程實例圖 13
圖十二. PFSP的資料相依性 13
圖十三. PFPS實例的資料計算表 14
圖十四. 禁忌搜尋法的流程圖 16
圖十五. 禁忌表的實例 16
圖十六. 禁忌搜尋法應用解PFSP的流程圖 17
圖十七. 鄰近解的產生架構 18
圖十八. 鄰近解的運算架構 19
圖十九. 鄰近解的運算中共享記憶體使用方式 20
圖二十. 計算PFSP總處理時間Cmax的實例 21
圖二十一. 選擇局部最佳解的架構 22
圖二十二. 更新禁忌表與禁忌值的架構 23
圖二十三. 更新初始解的流程圖 23
圖二十四.CPU與GPGPU的Multistart禁忌搜尋法架構 24
圖二十五.以swap陣列存取C2N種不同交換index實例 27
圖二十六.利用分支架構產生工作排列的演算法 28
圖二十七. 在分支架構下,實際執行的情形 29
圖二十八. 所有工作排列存放到全域記憶體之實例 30
圖二十九. 從全域記憶體中讀取工作排列的演算法實例 30
圖三十. 從全域記憶體中讀取工作排列的實際執行狀況 31
圖三十一. 根據warp大小分割所有工作排列的實例 32
圖三十二. 如何將case 1:中的permutation分割化簡為small_permutation的實例 33
圖三十三. 在case 1中如何以small_permutation陣列協助產生2塊排列位置編號的演算法 33
圖三十四. 如何將case 2:中的permutation分割化簡為small_permutation的實例 34
圖三十五. 在case 2中如何以small_permutation陣列協助產生4塊排列位置編號的演算法 35
圖三十六. 如何將case 3:中的permutation分割化簡為small_permutation的實例 36
圖三十七. 在case 3中如何以small_permutation陣列協助產生3塊排列位置編號的演算法 36
圖三十八. 如何將case 4:中的permutation分割化簡為small_permutation的實例 37
圖三十九. 在case 4中如何以small_permutation陣列協助產生5塊排列位置編號的演算法 38
圖四十. 根據case 4分割劃分出X1、X2、X3以及X4的index實例 39
圖四十一. X1、X2、X3以及X4的index實例 40
圖四十二.根據X1、X2、X3以及X4劃分出所有工作排列的實例 41
圖四十三. 每個Thread執行工作排列的演算法 42
圖四十四. case 1.工作排列演算法對應圖 43
圖四十五. case 2.工作排列演算法對應圖 43
圖四十六. case 3.工作排列演算法對應圖 44
圖四十七. case 4.工作排列演算法對應圖 44
圖四十八. 改善更新初始解的實例 45
圖四十九. 使用GPGPU改善禁忌搜尋法解決工作排程問題之整體流程 46
圖五十 : 避免Branch Divergence的情況以及減少全域記憶體讀取次數的架構圖實例 47
圖五十一: C2050上以不同的工作個數且Thread為64的效能改善 49
圖五十二: k20上以不同的工作個數且Thread為64的效能改善 50
圖五十三 : C2050上以不同的機器個數且Thread為64的效能改善 51
圖五十四 : K20上以不同的機器個數且Thread為64的效能改善 52
圖五十五 : C2050上Thread為32的禁忌搜尋法解決PFSP六個步驟的執行時間 52

表目錄
表格一. 實驗環境C2050工作站 48
表格二. 實驗環境K20工作站 48

參考文獻
[1] http://developer.amd.com
[2] http://www.top500.org
[3] R. Vuduc, K. Czechowski, “What GPU Computing Means for High-End Systems,” IEEE Micro, Vol.31, No.4, pp. 74 – 78, 2011.
[4] F. Feinbube, et al., “Joint Forces: From Multithreaded Programming to GPU Computing,” IEEE Software, Vol.28, No.1, pp. 51 – 57, 2011.
[5] GTX 680 Kepler Whitepaper, http://www.geforce.com/Active/en_US/en_US/pdf/GeForce-GTX-680-Whitepaper-FINAL.pdf.
[6] A.R. Brodtkorb, et al., “Graphics processing unit (GPU) programming strategies and trends in GPU computing”JPDC, Vol.73, No.1, pp. 4-13, 2013.
[7] J. Cohen, M. Garland, “Novel Architectures: Solving Computational Problems with GPU Computing,” Computing in Science &; Engineering, Vol.11, No.5, pp.58 – 63, 2009.
[8] 張舒, 趙開勇, 褚豔利, 張鈺勃, GPU高效能運算之CUDA, 中國水利水電出版社, CHN, 2009.
[9] http://zh.wikipedia.org/wiki/CUDA
[10] http://developer.nvidia.com/object/gpucomputing.html
[11] http://developer.amd.com/zones/openclzone/programming/pages/default.as
[12] http://www.gputechconf.com
[13] J. Sanders, E. Kandrot, CUDA By Example an Introduction to General-Purpose GPU Programming, Pearson Education, USA, 2010
[14] S.M. Johnson, “Optimal two- and three-stage production schedules with setup times included,” Naval Research Logistics Quarterly, Vol.1, No.1, pp.61–68, 1954.
[15] J.C. Ho, “Flowshop sequencing with mean flow time objective,” European Journal of Operational Research, Vol.81, pp.571 - 578, 1995.
[16] M.R. Garey, D. Johnson, R. Sethi, “The complexity of flowshop and jobshop scheduling,” Mathematics of Operations Research, Vol.1, No.2, pp.117–129, 1976.
[17] E.K. Burke, T. Curtois, G. Post, R. Qu, B. Veltman, “A Hybrid Heuristic Ordering and Variable Neighbourhood Search for the Nurse Rostering Problem,” European Journal of Operational Research, Vol.188, pp.330–341, 2008.
[18] A. Allahverdi, T. Aldowaisan, “New heuristics to minimize total completion time in m-machine flowshops,” International Journal of Production Economics, Vol.77, pp.71-83, 2002.
[19] J.M. Framinan, R. Leisten, “An efficient constructive heuristic for flowtime minimisation in permutation flow shop,” OMEGA, Vol.31, pp.311 - 317, 2003.
[20] J.M. Framinan, R. Leisten, R. Ruiz-Usano, “Comparison of heuristics for minimisation in permutation flowshops,” Computers &; Operations Research, Vol.32, pp.1237 - 1254, 2005.
[21] C. Rajendran, H. Ziegler, “An efficient heuristic for scheduling in a flowshop to minimize total weighted flowtime of jobs,” European Journal of Operational Research, Vol.103, pp.129 - 138, 1997.
[22] E. Taillard, “Some efficient heuristic methods for the flow shop sequencing problem,” European Journal of Operations Research, Vol.47, pp.65 - 74, 1990.
[23] D.S. Woo, H.S. Yim, “A heuristic algorithm for mean flowtime objective in flowshop scheduling,” Computers and Operational Research, Vol.25, pp.175 - 182, 1998.
[24] C. Wang, C. Chu, J.-M. Proth, “Heuristic approaches for //// ∑CiFmn scheduling problems,” European Journal of Operational Research, Vol.96, pp.636-644, 1997.
[25] E.K. Burke, P. De Causmaecker and G. Vanden Berghe, “A Hybrid Tabu Search Algorithm for the Nurse Rostering Problem,”Springer Lecture Notes in Artificial Intelligence, vol.1585, Springer, pp.187–194, 1999.
[26] F. Bellanti, G. Carello, F.D. Croce, R. Tadei, “A Greedy Based Neighborhood Search Approach to a Nurse Rostering Problem,” European Journal of Operational Research, Vol.153, pp.28–40, 2004.
[27] E.K. Burke, P. De Causmaecker, S. Petrovic, G. Vanden Berghe, “Variable Neighborhood Search for Nurse Rostering Problems,” J.P. de Sousa (Eds.), Metaheuristics: Computer Decision-Making, Kluwer, pp.153–172, 2004.
[28] M. Czapinski, S. Barnes, “Tabu Search with two approaches to parallel flowshop evaluation on CUDA platform,” J.Parallel Distrib.Comput., Vol.71, pp.802 – 811, 2011.
[29] C.-S. Chung, J. Flynn, O. Kirca, “A branch and bound algorithm to minimize the total flow time for m-machine permutation flowshop problems,” International Journal of Production Economics, Vol.79, No.3, pp.185–196, 2002.
[30] X. Dong, H. Huang, P. Chen, “An iterated local search algorithm for the permutation flowshop problem with total flowtime criterion,” Computers &; Operations Research, Vol.36, No.5, pp.1664–1669, 2009.
[31] M. Ishubuchi, S. Masaki, H. Tanaka, “Modified simulated annealing for the flow shop sequencing problems,” European Journal of Operational Research, Vol.81, pp.388–398, 1995.
[32] C. Rajendran, H. Ziegler, “Two ant-colony algorithms for minimizing total flowtime in permutation flowshops,” Computers &; Industrial Engineering, Vol.48, No.4, pp.789–797, 2005.
[33] C. Rajendran, H. Ziegler, “Ant-colony algorithms for permutation flowshop scheduling to minimize makespan/total flowtime of jobs,” European Journal of Operational Research, Vol.155, No.2, pp.426 - 438, 2004.
[34] L.-Y. Tseng, Y.-T. Lin, “A hybrid genetic local search algorithm for the permutation flowshop scheduling problem,” European Journal of Operational Research, Vol.198, No.1, pp.84–92, 2009.
[35] B. Jarboui, M. Eddaly, P. Siarry, “An estimation of distribution algorithm for minimizing the total flowtime in permutation flowshop scheduling problems,” Computers &; Operations Research, Vol.36, No.9, pp.2638–2646, 2009.
[36] E. Nowicki, C. Smutnicki, “A fast tabu search algorithm for the permutation flow-shop problem,” European Journal of Operational Research, Vol,91, No.1, pp.160–175, 1996.
[37] M. Nawaz, E.E. Enscore Jr., I. Ham, “A heuristic algorithm for the m-machine, n-job flow-shop sequencing problem,” Omega, Vol.11, No.1, pp.91–95, 1983.
[38] J. Liu, C.R. Reeves, “Constructive and composite heuristic solutions to the P‖ΣCi scheduling problem,” European Journal of Operational Research, Vol.132, No.2, pp.439–452, 2001.
[39] X. Li, Q. Wang, C. Wu, “Efficient composite heuristics for total flowtime minimization in permutation flow shops,” Omega, Vol.37, No.1, pp.155–164, 2009.
[40] J. Grabowski, M. Wodecki, “A very fast tabu search algorithm for the permutation flow shop problem with makespan criterion,” Computers &; Operations Research, Vol.31, No.11, pp.1891–1909, 2004.
[41] T. Yamada, C.R. Reeves, “Solving the Csum permutation flowshop scheduling problem by genetic local search,” IEEE International Conference on Evolutionary Computation, pp. 230–234, 1998.
[42] M.F. Tasgetiren, Y.-C. Liang, M. Sevkli, G. Gencyilmaz, “A particle swarm optimization algorithm for makespan and total flowtime minimization in the permutation flowshop sequencing problem,” European Journal of Operational Research, Vol.177, No.3, pp.1930–1947, 2007.
[43] Y. Zhang, X. Li, Q. Wang, “Hybrid genetic algorithm for permutation flowshop scheduling problems with total flowtime minimization,” European Journal of Operational Research, Vol.196, No.3, pp.869–876, 2009.
[44] W. Bożejko, “Solving the flow shop problem by parallel programming,” Journal of Parallel and Distributed Computing, Vol.69, No.5, pp.470–481, 2009.
[45] W. Bożejko, M. Wodecki, “Parallel scatter search algorithm for the flow shop sequencing problem,” Lecture Notes in Computer Science, Vol.4967, pp.180–188, 2008.
[46] W. Bożejko, M. Wodecki, “Parallel genetic algorithm for the flow shop scheduling problem,” Lecture Notes in Computer Science, Vol.3019, pp.566–571, 2004.
[47] U. Aickelin, K. Dowsland, “Exploiting Problem Structure in a Genetic Algorithm Approach to a Nurse Rostering Problem,” Journal of Scheduling, Vol.3, No.3, pp. 139–153, 2000.
[48] M. Czapiński, “Parallel simulated annealing with genetic enhancement for flowshop problem with Csum,” Computers &; Industrial Engineering, Vol.59, No.4, pp.778–785, 2010.
[49] M. Czapinski, “An effective Parallel Multistart Tabu Search for Quadratic Assignment Problem on CUDA platform,” J.Parallel Distrib.Comput, Vol.73, pp. 1461–1468, 2013
[50] R.E. Burkard, S. Karisch, F. Rendl, “QAPLIB—a Quadratic Assignment Problem library,” European Journal of Operational Research, Vol.55, No.1, pp.115–119, 1991.
[51] D.T. Connolly, “An improved annealing scheme for the QAP,” European Journal of Operational Research, Vol.46, No.1, pp.93–100, 1990.
[52] V.-D. Cung, T. Mautor, P. Michelon, A. Tavares, “A scatter search based approach for the Quadratic Assignment Problem,” ICEC, Vol.97, pp.165–170, 1997.
[53] S. de Cravalho Jr., S. Rahman, “Microarray layout as Quadratic Assignment Problem,” German conference on bioinformatics, Tübingen, Germany, pp. 11–20, 2006.
[54] Z. Drezner, “A new genetic algorithm for the Quadratic Assignment Problem,” INFORMS Journal on Computing, Vol.15, No.3, pp.320–330, 2003.
[55] A.N. Elshafei, “Hospital layout as a Quadratic Assignment Problem,” Operations Research Quarterly, Vol.28, pp.167–179, 1977.
[56] L.M. Gambardella, E.D. Taillard, M. Dorigo, “Ant colonies for the Quadratic Assignment Problem,” Journal of the Operational Research Society, Vol.50, No.2, pp.167–176, 1999.
[57] E.M. Loiola, N.M.M. de Abreu, P.O. Boaventura-Netto, P. Hahn, T. Querido, “A survey for the Quadratic Assignment Problem,” European Journal of Operational Research, Vol.176, No.2, pp.657–690, 2007.
[58] P. Merz, B. Freisleben, “Fitness landscape analysis and memetic algorithms for the Quadratic Assignment Problem,” IEEE Transactions on Evolutionary Computation, Vol.4, No.4, pp.337–352, 2000.
[59] T. Stützle, “Iterated local search for the Quadratic Assignment Problem,” European Journal of Operational Research, Vol.174, No.3, pp.1519–1539, 2006.
[60] T. Stützle, M. Dorigo, “ACO algorithms for the Quadratic Assignment Problem,” in: D. Corne, M. Dorigo, F. Glover (Eds.), New Ideas in Optimization, McGraw- Hill, pp.33–50, 1999.
[61] E. Taillard, “Robust taboo search for the Quadratic Assignment Problem,” Parallel Computing, Vol.17, No.4–5, pp.443–455, 1991.
[62] W. Zhu, J. Curry, A. Marquez, “SIMD tabu search for the Quadratic Assignment Problem with graphics hardware acceleration,” International Journal of Production Research, Vol.48, No.4, pp.1035–1047, 2010.
[63] J. Tabithas, R. César, F. Glover,“Multistart Tabu Search and Diversification Strategies for the Quadratic Assignment Problem,” IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, Vol.39, No.3, MAY 2009
[64] J. Tabitha, V. Tech, R. Cesar, “Path Relinking with Multi-Start Tabu Search for the Quadratic Assignment Problem,” Journal International Journal of Swarm Intelligence Research archive, Vol.2, Issue 2, pp.52 - 70, April 2011.
[65] F. Glover, “A template for scatter search and path relinking,” Third European Conference on Artificial Evolution, AE’97, Springer- Verlag, London, UK, pp.3–54, 1998.
[66] J. Szymon, Z. Dominik, “Solving Multi-criteria Vehicle Routing Problem by Parallel Tabu Search on GPU,” International Conference on Computational Science, Vol.18, pp.2529 – 2532, 2013.
[67] L. Bukata, P. Šůcha, Z. Hanzálek, “Solving the Resource Constrained Project Scheduling Problem using the parallel Tabu Search designed for the CUDA platform,” J. Parallel Distrib. Comput., 2014.
I. Chakroun, N. Melab, M. Mezmaz, D. Tuyttens, “Combining multi-core and GPU computing for solving combinatorial optimization problems,” J. Parallel Distrib. Comput., Vol.73, pp1563 – 1577, 2013.

連結至畢業學校之論文網頁點我開啟連結
註: 此連結為研究生畢業學校所提供,不一定有電子全文可供下載,若連結有誤,請點選上方之〝勘誤回報〞功能,我們會盡快修正,謝謝!
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊