跳到主要內容

臺灣博碩士論文加值系統

(18.97.9.170) 您好!臺灣時間:2024/12/07 19:25
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:陳鴻奎
研究生(外文):Hong-Kuei Chen
論文名稱:可輔助平行化剖析之編譯器前端架構
論文名稱(外文):Compiler Front-End Framework for Parallelism Profiling
指導教授:陳少傑陳少傑引用關係
指導教授(外文):Sao-Jie Chen
學位類別:碩士
校院名稱:國立臺灣大學
系所名稱:電子工程學研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2008
畢業學年度:96
語文別:英文
論文頁數:67
中文關鍵詞:編譯器前端平行化粗顆粒剖析多核
外文關鍵詞:compilerfront-endparallelismcoarse-grained profilingmulti-core
相關次數:
  • 被引用被引用:0
  • 點閱點閱:595
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
當多核處理器(chip multiprocessor)變成處理器的主流時,如何使得程式在多核處理器上能夠更有效率的執行的研究問題顯得更為重要。不幸的是,傳統編譯器注重單核處理器以及循序的執行方式缺乏攫取程式平行的能力。本論文將發展一個可輔助平行化剖析的編譯器前端架構,這個架構提出三個不同的觀點的資訊對平行能力的攫取很大的幫助,這三個觀點分別是:靜態觀點(static view),動態觀點(dynamic view),機率觀點(probabilistic view)。
靜態觀點使用程式相依圖(program dependence graph)及中間碼(intermediate representation)於我們的架構以求得程式碼與程式碼之間是否存在可平行執行的關係。動態觀點使用粗糙剖析器(coarse-grained profiler)收集程式執行過程的資訊,其中包含程式熱點(program hot spots)和記憶體存取與時間的關係。機率觀點使用Valgrind剖析器收集快取記憶體失誤比率(cache miss rate)。這些剖析的資訊可以幫助發展應用於多核處理器的嵌入式系統使之更有效率。
Embedded systems on multi-core are nearly ubiquitous today such as handheld mobile phones and game consoles. When developing embedded systems on multi-cores, we need a profiler to analyze an application and find out whether it can be executed in parallel or which part of it can be executed in parallel, which will help a designer to decide how to use system resources well. In the Thesis, we developed a compiler front-end framework for parallelism profiling. Our proposed framework can extract the information on whether one application can be executed in parallel or not. These information are contributive to developing embedded systems on a multi-core environment.
ABSTRACT i
LIST OF FIGURES v
LIST OF TABLES vii
CHAPTER 1. INTRODUCTION 1
1.1 Motivation 1
1.2 Objectives 2
1.3 Thesis Organization 2
CHAPTER 2. PRELIMINARIES 3
2.1 Overview of a Traditional Compiler 3
2.1.1 Lexical Analysis 5
2.1.2 Syntax Analysis 6
2.1.3 Abstract Syntax 8
2.1.4 Semantic Analysis 8
2.1.5 Symbol-Table Management 9
2.1.6 Intermediate Representation 10
2.1.7 Code Optimization 11
2.1.8 Code Generation 13
2.2 Overview of Profiling 14
CHAPTER 3. RELATED WORK 17
3.1 Related Compiler and Compiler Front-End 17
3.1.1 GNU Compiler Collection 17
3.1.2 LANCE Compiler ………………....... 19
3.1.3 Edison Design Group 21
3.2 Related Intermediate Representation 21

3.2.1 Control and Data Flow Graph 21
3.2.2 Hierarchical Task Graph 23
3.2.3 Program Dependence Graph 25
3.3 Related Profilers 26
3.3.1 GNU Profilers: Gcov and Gprof 26
3.3.2 Valgrind Profiler 29
3.3.3 Micro-Profiler 30
CHAPTER 4. COMPILER FRONT-END FRAMEWORK FOR PARALLELISM PROFILING 33
4.1 Compiler Front-End Framework 33
4.1.1 Scanner and Parser 34
4.1.2 Abstract Syntax Tree 35
4.1.3 Symbol-Table Management 36
4.1.4 Three Address Code Generator 39
4.1.5 Control and Data Flow Graph Generator and Program Dependence Graph Generator 42
4.2 Coarse-Grained Profiling 49
4.2.1 Keep Information of Program Structures 51
4.2.2 Identify Programs Hot Spots and Categorization 51
4.2.3 Explore Program Semantics for Parallelism 55
4.2.4 Explore Program Semantics for Hardware/Software Co-design 57
CHAPTER 5. VALIDATION METHODOLOGY AND RESULTS 59
5.1 Background 59
5.2 Validation Methodology and Results 60
CHAPTER 6. CONCLUSION AND FUTURE WORK 63
6.1 Conclusion 63
6.2 Future Work 63
REFERENCE 65
[1]M. E. Wolf and M. S. Lam, “A Loop Transformation Theory and an Algorithm to Maximize Parallelism,” IEEE Trans. Parallel Distributed Systems, vol. 2 issue 4, pp. 452-471, Oct. 1991.
[2]K. Kennedy, and K. S. McKinley, “Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution,” in Proceedings of the Int. Workshop on Languages and Compilers for Parallel Computing, pp. 301-320, Aug. 1993.
[3]K. G. Kumar, D. Kulkarni, and A. Basu, “Deriving Good transformations for Mapping Nested Loops on Hierarchical Parallel Machines in Polynomial Time,” in Proceedings of the 6th international conference on Supercomputing, pp. 82-92, Jul. 1992.
[4]A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, 2nd ed., Addison Wesley, Oct. 2007.
[5]A. I. Holub, Compiler Design in C, Prentice Hall, Mar. 1990.
[6]J. Jones, “Abstract Syntax Tree Implementation Idioms,” in Proceedings of the 10th Conference on Pattern Languages of Programs, Sep. 2003.
[7]GNU Compiler Collection, URL: http://gcc.gnu.org/
[8]D. R. Wallace, “Low level scheduling using the hierarchical task graph,” in Proceedings of the 6th international conference on Supercomputing (ICS), pp. 72-81, Jul. 1992.
[9]SPARK 3-Layered Intermediate representation, URL: http://mesl.ucsd.edu/spark/methodology/HTGs.shtml/
[10]M. Girkar and C. D. Polychronopoulos, “The hierarchical task graph as a universal intermediate representation,” International Journal of Parallel Programming, vol. 22, issue 5, Oct. 1994.
[11]D. Gajski, N. Dutt, A. Wu, S. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic Publishers, Feb. 1992.
[12]J. Ferrante, K. J. Ottenstein, and J. D. Warren, “The program Dependence Graph and its Use in Optimization,” ACM Trans. Programming Languages and Systems, vol. 9, issue 3, pp. 319-349, Jul. 1987.
[13]M. J. Harrold, B. Malloy, and G. Rothermel, “Efficient Construction of Program
Dependence Graphs,” in Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pp. 160-170, Jun. 1993.
[14]D. Novillo, “GCC - An Architectural Overview, Current Status and Future Directions,” Ottawa Linux Symposium (OLS), Jul. 2006.
[15]LANCE Retargetable C Compiler, URL: http://www.lancecompiler.com/
[16]R. Leupers, “LANCE: A C Compiler Platform for Embedded Processors,” in Embedded Systems/Embedded Intelligence, Feb. 2001.
[17]K. Karuri, M. A. Al Faruque, S. Kraemer, R. Leupers, G. Ascheid, and H. Meyr, “Fine-grained Application Source Code Profiling for ASIP Design,” in Proceedings of the 42nd Design Automation Conference, pp. 329-334, Jun. 2005.
[18]K. Karuri, C. Huben, R. Leupers, G. Ascheid, H. Meyr, “Memory Access Micro-Profiling for ASIP Design,” in Proceedings of the 3rd IEEE International Workshop on Electronic Design, Test and Applications, pp. 255-262, Jan. 2006.
[19]Edison Design Group (EDG), URL: http://www.edg.com/
[20]Valgrind: URL: http://valgrind.org/
[21]J. Engblom, A. Ermedahl, M. Nolin, J. Gustafsson, and H. Hansson, “Worst-case execution-time analysis for embedded realtime systems,” International Journal on Software Tools for Technology Transfer, vol. 4, issue 4, pp. 437-455, Aug. 2003.
[22]M. I. Gordon, W. Thies, and S. P, Amarasinghe, "Exploiting coarse-grained task, data, and pipeline parallelism in stream programs,” in Proceedings of the 12th international conference on Architectural support for programming languages and operating systems (ASPLOS), pp. 151-162, Oct. 2006.
[23]S. Rul, H. Vandierendonck, and K. D. Bosschere, “Extracting Coarse-Grain Parallelism in General-Purpose Programs,” in Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 281-282, Feb. 2008.
[24]E. Ozer, S. Banerjia, and T. Conte, “Unified assign and schedule: A new approach to scheduling for clustered register file microarchitectures,” in Proceedings of the 31st Annual International Symposium on Microarchitecture, pp. 308-315, Dec. 1998.
[25]K. Kailas, K. Ebcioglu, and A. Agrawala, “CARS: A new code generation framework for clustered ILP processors,” in Proceedings of the 7th International Symposium on High-Performance Computer Architecture, pp. 133-142, Feb. 2001.
[26]M. Chu, K. Fan, and S. Mahlke, “Region-based hierarchical operation partitioning for multicluster processors,” in Proceedings of the SIGPLAN 2003 Conference on Programming Language Design and Implementation, pp. 300-311, Jun. 2003.
[27]V. Paxson, Flex: The Fast Lexical Generator, URL: http://www.gnu.org/software/flex/
[28]C. Donnelly and R. Stallman, Bison: GNU parser generator. URL: http://www.gnu.org/software/bison/
[29]I. Sommerville, Software Engineering, 8th ed., Addison Wesley, Jun. 2006.
[30]Gcov documentation. URL: http://gcc.gnu.org/onlinedocs/gcc/Gcov.html.
[31]J. Fenlason and R. Stallman. The GNU Profiler URL: http://www.gnu.org/software/binutils/manual/gprof-2.9.1/gprof.html.
[32]S. Horwitz, T. Reps, and D. Binkley, “Interprocedural Slicing Using Dependency Graphs,” ACM Trans. on Programming Languages and Systems, vol. 22 issue 1, pp. 26-60, Jan. 1990.
[33]R. Leupers, O. Wahlen, M. Hohenauer, T. Kogel, and P. Marwedel, “An Executable Intermediate Representation for Retargetable Compilation and High-Level Code Optimization,” in Proceedings of the Int. Workshop on Systems, Architectures, Modeling, and Simulation (SAMOS), pp. 120-125, Jul. 2003.
[34]M. Chu, R. Ravindran, and S. Mahlke, “Data Access Partitioning for Fine-grain Parallelism on Multicore Architectures,” in Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 369-380, Dec. 2007.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top