跳到主要內容

臺灣博碩士論文加值系統

(98.82.120.188) 您好!臺灣時間:2024/09/15 16:17
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:蕭佑哲
研究生(外文):YU-CHEHSIAO
論文名稱:低延遲雙通道排序電路設計與實現
論文名稱(外文):Design and implementation of a low latency dual path sorter
指導教授:陳培殷陳培殷引用關係
指導教授(外文):Pei-Yin Chen
學位類別:碩士
校院名稱:國立成功大學
系所名稱:資訊工程學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2019
畢業學年度:107
語文別:中文
論文頁數:31
中文關鍵詞:90-nm TSMCcomparison freesorting algorithmsspeed complexity O(N)
外文關鍵詞:90-nm TSMCcomparison freesorting algorithmsspeed complexity O(N)
相關次數:
  • 被引用被引用:0
  • 點閱點閱:116
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
由於排序在應用上頻繁的使用到,所以快速,面積小的硬體排序一直以來都是重要的研究議題,但是先前的作品其可擴展性都不佳,也就是當排序的輸入數字大幅上升時必須要面對到需要大幅修改架構、成本與排序時間非線性成長。因此本論文提出成本與速度為O(N)的非比較型排序演算法,使其較前人做人成本更低,速度更快。特別是在速度上,本論文不只做到工作頻率更快,同時在整體排序所需的cycle(latency)上做大幅的優化。在TSMC 90-nm實作排序1024個10-bit數字可以達到0.9GHz,transistor count 小於620000,排序時間在1.69μ~2.816μ之間。本論文提出新的硬體演算法使其在成本與頻率上較前人佳,同時利用(1)記錄輸入值的最大與最小值(2)使用QUEUE來過濾不需要輸出的數字(3)雙向輸出(dual path)來使得排序所需要的cycle(low latency)。
Fast、small area sorting hardware is always an important topic since sorting is used frequently in application. However prior work the scalability of prior work is poor. That is, the architecture, area and latency nonlinearly need to be modified largely as sorting input highly grow. Therefore, this paper propose a non-compare sorting algorithm with O(N) in area and speed, especially in speed. This paper not only makes the frequency faster, but also optimizes the cycle needed by sort(latency) substantially. We evaluate an application-specified integrated circuit design of our sorting algorithm for a sample sorting of N = 1024 elements of size K = 10-bit using 90-nm Taiwan Semiconductor Manufacturing Company (TSMC) technology with a 1 V power supply. Results verify that our sorting requires approximately 1.69–2.816 μs to sort the 1024 elements with a clock cycle time of 0.9 GHz, and has a total transistor count in less than 620 000. This paper which propose a new hardware algorithm making area and frequency better than prior work. In the meanwhile, this paper can also use (1) recording maximum and minimum of input(2) queue to filter number which is not needed to output(3) bidirectional output(dual path) to make the cycle needed by sort(latency) lower.
摘要 I
Abstract II
誌謝 III
Contents IV
Figure Captions VI
Table VII
Chapter 1. Introduction and Motivation 1
Chapter 2. Related Work 2
Chapter 3. Proposed Method 3
3.1 Compare-Free Sorting Algorithm 3
3.1.1 Algorithm Description 3
3.1.2 Improve Algorithm 5
3.2 Hardware Algorithm 8
3.3 Hardware Implement 15
Chapter 4. Experiments and Comparisons 19
4.1 Area and Timing 19
4.1.1 Area and Timing in sorting 1024 10-bit Numbers 19
4.1.2 Area in sorting N K-bit Numbers 19
4.2 Output Cycle 22
4.2.1 Mathematical Analysis 22
4.2.2 Queue Analysis 23
4.2.3 Random Analysis 24
4.2.4 Gaussian Distribution Analysis 25
Chapter 5. Conclusion and Future Work 26
References 27
[1]D. E. Knuth, The Art of Computer Programming. Reading, MA, USA: Addison-Wesley, Mar. 2011.
[2] Y. Bang and S. Q. Zheng, “A simple and efficient VLSI sorting architecture, in Proc. 37th Midwest Symp. Circuits Syst., vol. 1. 1994, pp. 70–73.
[3]T. Leighton, Y. Ma, and C. G. Plaxton, “Breaking the(n log2n) barrier for sorting with faults, J. Comput. Syst. Sci., vol. 54, no. 2, pp. 265–304, 1997.
[4]Y. Han, “Deterministic sorting in O(n log log n) time and linear space, J. Algorithms, vol. 50, no. 1, pp. 96–105, 2004.
[5]C. Canaan, M. S. Garai, and M. Daya, “Popular sorting algorithms, World Appl. Programm., vol. 1, no. 1, pp. 62–71, Apr. 2011.
[6]L. M. Busse, M. H. Chehreghani, and J. M. Buhmann, “The information content in sorting algorithms, in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jul. 2012, pp. 2746–2750.
[7]R. Zhang, X. Wei, and T. Watanabe, “A sorting-based IO connection assignment for flip-chip designs, in Proc. IEEE 10th Int. Conf. ASIC (ASICON), Oct. 2013, pp. 1–4.
[8] D. Fuguo, “Several incomplete sort algorithms for getting the median value, Int. J. Digital Content Technol. Appl., vol. 4, no. 8, pp. 193–198, Nov. 2010.
[9]W. Jianping, Y. Yutang, L. Lin, H. Bingquan, and G. Tao, “Highspeed FPGA-based SOPC application for currency sorting system, in Proc. 10th Int. Conf. Electron. Meas. Instrum. (ICEMI), Aug. 2011, pp. 85–89.
[10] R. Meolic, “Demonstration of sorting algorithms on mobile platforms, in Proc. CSEDU, 2013, pp. 136–141.
[11] F.-C. Leu, Y.-T. Tsai, and C. Y. Tang, “An efficient external sorting algorithm, Inf. Process. Lett., vol. 75, pp. 159–163, Sep. 2000.
[12] J. L. Bentley and R. Sedgewick, “Fast algorithms for sorting and searching strings, in Proc. 8th Annu. ACM-SIAM Symp. Discrete Algorithms (SODA), Jan. 1997, pp. 360–369.
[13] L. Xiao, X. Zhang, and S. A. Kubricht, “Improving memory performance of sorting algorithms, J. Experim. Algorithmic, vol. 5, no. 3, pp. 1–20, 2000.
[14] P. Sareen, “Comparison of sorting algorithms (on the basis of average case), Int. J. Adv. Res. Comput. Sci. Softw. Eng., vol. 3, no. 3, pp. 522–532, Mar. 2013.
[15] H. Inoue, T. Moriyama, H. Komatsu, and T. Nakatani, “AA-SORT: A new parallel sorting algorithm for multi-core SIMD processors, in Proc. 16th Int. Conf. Parallel Archit. Compil. Techn. (PACT), 2007, pp. 189–198.
[16] V. Kundeti and S. Rajasekaran, “Efficient out-of-core sorting algorithms for the parallel disks model, J. Parallel Distrib. Comput., vol. 71, no. 11, pp. 1427–1433, 2011.
[17] G. Capannini, F. Silvestri, and R. Baraglia, “Sorting on GPUs for large scale datasets: A thorough comparison, Int. Process. Manage., vol. 48, no. 5, pp. 903–917, 2012.
[18] D. Cederman and P. Tsigas, “GPU-Quicksort: A practical quicksort algorithm for graphics processors, ACM J. Experim. Algorithmics (JEA), vol. 14, Dec. 2009, Art. no. 4.
[19] B. Jan, B. Montrucchio, C. Ragusa, F. G. Ghan, and O. Khan, “Fast parallel sorting algorithms on GPUs, Int. J. Distrib. Parallel Syst., vol. 3, no. 6, pp. 107–118, Nov. 2012.
[20] N. Satish, M. Harris, and M. Garland, “Designing efficient sorting algorithms for manycore GPUs, in Proc. 23rd IEEE Int. Symp. Parallel Distrib. Process., May 2009, pp. 1–10.
[21] C. Bunse, H. Höpfner, S. Roychoudhury, and E. Mansour, “Choosing the ‘best’ sorting algorithm from optimal energy consumption, in Proc. ICSOFT, vol. 2. 2009, pp. 199–206.
[22] A. D. Mishra and D. Garg, “Selection of best sorting algorithm, Int. J. Intell. Inf. Process., vol. 2, no. 2, pp. 363–368, Jul./Dec. 2008.
[23] T.-C. Lin, C.-C. Kuo, Y.-H. Hsieh, and B.-F. Wang, “Efficient algorithms for the inverse sorting problem with bound constraints under the l∞-norm and the Hamming distance, J. Comput. Syst. Sci., vol. 75, no. 8, pp. 451–464, 2009.
[24] F. Henglein, “What is a sorting function? J. Logic Algebraic Programm., vol. 78, no. 7, pp. 552–572, Aug./Sep. 2009.
[25] E. Mumolo, G. Capello, and M. Nolich, “VHDL design of a scalable VLSI sorting device based on pipelined computation, J. Comput. Inf. Technol., vol. 12, no. 1, pp. 1–14, 2004.
[26] E. Herruzo, G. Ruiz, J. I. Benavides, and O. Plata, “A new parallel sorting algorithm based on odd-even mergesort, in Proc. 15th EUROMICRO Int. Conf. Parallel, Distrib. Netw.-Based Process. (PDP), Feb. 2007, pp. 18–22.
[27] M. Thorup, “Randomized sorting in O(n log log n) time and linear space using addition, shift, and bit-wise Boolean operations, J. Algorithms, vol. 42, no. 2, pp. 205–230, Feb. 2002.
[28]M. Afghahi, “A 512 16-b bit-serial sorter chip, IEEE J. Solid-State Circuits, vol. 26, no. 10, pp. 1452–1457, Oct. 1991.
[29]J.-T. Yan, “An improved optimal algorithm for bubble-sorting-based nonManhattan channel routing, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 18, no. 2, pp. 163–171, Feb. 1999.
[30] L. Skliarova, D. Mihhailov, V. Sklyarov, and A. Sudnitson, “Implementation of sorting algorithms in reconfigurable hardware, in Proc. 16th IEEE Medit. Electrotech. Conf. (MELECON), Mar. 2012, pp. 107–110.
[31] N. Tabrizi and N. Bagherzadeh, “An ASIC design of a novel pipelined and parallel sorting accelerator for a multiprocessor-on-a-chip, in Proc. IEEE 6th Int. Conf. ASIC (ASICON), Oct. 2005, pp. 46–49.
[32] H. Schröder, “VLSI-sorting evaluated under the linear model, J. Complex., vol. 4, no. 4, pp. 330–355, Dec. 1988.
[33] H.-S. Yu, J.-Y. Lee, and J.-D. Cho, “A fast VLSI implementation of sorting algorithm for standard median filters, in Proc. 12th Annu. IEEE Int. ASIC/SOC Conf., Sep. 1999, pp. 387–390.
[34] G. Campobello and M. Russo, “A scalable VLSI speed/area tunable sorting network, J. Syst. Archit., vol. 52, no. 10, pp. 589–602, Oct. 2006
[35]W. Zhou, Z. Cai, R. Ding, C. Gong, and D. Liu, “Efficient sorting design on a novel embedded parallel computing architecture with unique memory access, Comput. Elect. Eng., vol. 39, no. 7, pp. 2100–2111, Oct. 2013.
[36]V. Sklyarov, “FPGA-based implementation of recursive algorithms, Microprocess. Microsyst., vol. 28, nos. 5–6, pp. 197–211, Aug. 2004.
[37]R. Lin and S. Olariu, “Efficient VLSI architectures for Columnsort, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7, no. 1, pp. 135–138, Mar. 1999.
[38]S. W. Moore and B. T. Graham, “Tagged up/down sorter—A hardware priority queue, Comput. J., vol. 38, no. 9, pp. 695–703, Sep. 1995.
[39]G. V. Russo and M. Russo, “A novel class of sorting networks, IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 43, no. 7, pp. 544–552, Jul. 1996.
[40]S. Dong, X. Wang, and X. Wang, “A novel high-speed parallel scheme for data sorting algorithm based on FPGA, in Proc. IEEE 2nd Int. Congr. Image Signal Process. (CISP), Oct. 2009, pp. 1–4.
[41]A. Széll and B. Fehér, “Efficient sorting architectures in FPGA, in Proc. Int. Carpathian Control Conf. (ICCC), May 2006, pp. 1–4.
[42]A. A. Colavita, A. Cicuttin, F. Fratnik, and G. Capello, “SORTCHIP: A VLSI implementation of a hardware algorithm for continuous data sorting, IEEE J. Solid-State Circuits, vol. 38, no. 6, pp. 1076–1079, Jun. 2003.
[43]T. Demirci, I. Hatirnaz, and Y. Leblebici, “Full-custom CMOS realization of a high-performance binary sorting engine with linear area-time complexity, in Proc. Int. Symp. Circuits Syst. (ISCAS), vol. 5. May 2003, pp. V453–V456.
[44]K. Ratnayake and A. Amer, “An FPGA architecture of stable-sorting on a large data volume: Application to video signals, in Proc. 41st Annu. Conf. Inf. Sci. Syst., Mar. 2007, pp. 431–436.
[45]S. Alaparthi, K. Gulati, and S. P. Khatri, “Sorting binary numbers in hardware—A novel algorithm and its implementation, in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2009, pp. 2225–2228.
[46]J. F. Hughes et al., Computer Graphics: Principles and Practice, 3rd ed. Reading, MA, USA: Addison-Wesley, 2014.
[47]R. Chen, S. Siriyal, and V. Prasanna, “Energy and memory efficient mapping of bitonic sorting on FPGA, in Proc. ACM/SIGDA Int. Symp. Field Program. Gate (FPGA), Monterey, CA, USA, Feb. 2015, pp. 240–249
[48]Saleh Abdel-Hafeez,Ann Gordon-Ross.An Efficient O(N) Comparison-Free Sorting Algorithm.IEEE Transactions on Very Large Scale Integration (VLSI) Systems.Pages: 1 - 13 Volume: PP Issue: 99. 2017
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top