跳到主要內容

臺灣博碩士論文加值系統

(3.231.230.177) 您好!臺灣時間:2021/08/04 12:06
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:林俊杰
研究生(外文):Chun-Chieh Lin
論文名稱:適用於快取記憶體的封裝暨安置物件方法
論文名稱(外文):Cache Conscious Object Packing and Placement
指導教授:陳俊良陳俊良引用關係
指導教授(外文):Chuen-Liang Chen
學位類別:博士
校院名稱:國立臺灣大學
系所名稱:資訊工程學研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2009
畢業學年度:97
語文別:英文
論文頁數:184
中文關鍵詞:快取記憶體記憶體最佳化程式碼排列資料排列虛擬機器
外文關鍵詞:cache memorymemory optimizationcode layoutdata layoutvirtual machine
相關次數:
  • 被引用被引用:0
  • 點閱點閱:98
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
快取記憶體在分層式記憶體架構中扮演加速存取動作的角色。程式與資料物件在記憶體中的排列順序是影響快取錯失率的重要原因之一。先賢們的研究都集中在找出適用於直接映射式快取記憶體的方法,其方法是把這些物件錯置排列至各個快取關連組。錯置排列的方法有助於減少衝突性錯失。然而某些系統的記憶區塊大到可以容納許多程式或資料物件,而且被讀進快取記憶體的最小單位是記憶區塊,而非物件,因此所有物件只得互搶快取記憶體內有限的空間。
本論文提出一套方法論,旨於利用調整物件在記憶體中排列的方法來改善快取記憶體的效率。這套方法包含兩項重點:探索物件的組織,以及針對各種快取架構產生專屬的物件排列法。探索物件組織的方法涵蓋了解構資料或程式的成份,以及產生物件存取活動的軌跡。此外本文亦特別提出一個適用於虛擬機器(例如爪哇虛擬機器)的探索技術,其著眼點是基於此類程式具有特殊的軟體架構。
本論文的重點是尋求產生適用於各類快取記憶體的物件排列之法。前提是假定物件的尺寸小於記憶區塊。這意味著賦予物件地址編號必須統合兩項動作,一則是用於快取關連組錯置物件法,二則是將物件合併到記憶體區塊。本論文提出的辦法是藉由存取活動的軌跡來建立物件關連模型。在發展的過程中,文本分析了物件關連模型、快取架構、及快取錯失之間的因果關係。然而這些因果關係實際上十分困難而無法算出最佳解。因此本文也提出的頗具實用性的技術,用以產生適用於不同快取架構的物件排列法。本論文亦實作了相當繁複的實驗來驗證所提出來的方法。實驗的結果頗具有說服力,可以支持本文提出的技術的有效性。
The cache provides acceleration in access through the memory hierarchy. The order of arranging code and data objects in the main memory is an important factor that affects cache miss rates. Prior related researches focus on arranging objects interleaved between cache sets for the direct mapped cache. Interleaving the address of each items helps to resolve conflict misses. However, there are computer systems that a memory block can be large to collect a number of code and data objects, and the unit to be loaded to the cache is a memory block, not an object. Therefore, objects contend spaces within cache blocks as well.
This dissertation provides a methodology for optimizing cache memory utilization of applications in various fields by arranging their relocatable objects within the main memory. The methodology includes the exploration of object space and generation of object layouts for all kinds of cache organization. The object space exploration involves techniques in inspecting the data and program integrant and acquiring the profile of objects accesses. The exploration also contains a technique particular for the virtual machine, e.g., the Java virtual machine, because of its unique program structure.
Generating object layout adapted for cache memory is the keystone in this dissertation. The presumption is that objects are smaller than a memory block. That means assigning addresses to objects must incorporate two movements into one. The first is interleaving objects to cache sets. The second is gathering these objects to fit one cache block. Our study suggests creating the object affinity model by profile information. This study analyzes the relationship between the object affinity model, cache configurations, and cache misses. The packing and placement problem turns to be hard to find an optimal solution. Thereafter, this study proposes practical techniques of generating object layouts for different cache organizations. This dissertation also includes experiments to evaluate the proposed techniques. The experiments provide convincible results and support the effectiveness of the proposed approaches.
CHAPTER 1 INTRODUCTION 1
1.1 Motivation 1
1.2 Usefulness 4
1.3 Scope and Organization 6
CHAPTER 2 BACKGROUND 9
2.1 Memory Hierarchy 9
2.1.1 Cache Organization 10
2.1.2 XIP and NAND Flash 15
2.2 Graph and Combinatorial Algorithms 17
2.3 Related Works 21
2.3.1 Placements 21
2.3.2 XIP and NAND Flash 26
2.3.3 Locality 27
2.3.4 Other Related Topics 27
CHAPTER 3 PROBLEM MODELING 31
3.1 Object Access Trace 31
3.2 One Page Cache Model 38
3.3 Direct Mapped Cache 42
3.4 Fully Associative Cache 48
CHAPTER 4 PRACTICAL APPROACHES 57
4.1 Hardness of Packing and Placement for Direct Mapped Cache 57
4.2 Approaches for Direct Mapped Cache 63
4.2.1 Packing Followed by Placement 64
4.2.2 Placement Followed by Packing 69
4.3 Approaches for Fully Associative Cache 70
4.3.1 One-Page Cache Method 70
4.3.2 Two-Pass Partitioning Method 71
4.4 Approaches for Set Associative Cache 73
CHAPTER 5 EXPLORATIONS OF OBJECTS AND TRACES 75
5.1 Generic Data Objects 76
5.2 Generic Code Objects 77
5.2.1 Motivation 77
5.2.2 Control Flow Analysis and Basic Blocks 80
5.2.3 Benchmark Overview 85
5.3 Partial Arrangement on Performance Bottleneck 92
5.4 Virtual Machine Interpreters 94
5.4.1 KVM Internal 95
5.4.2 Analyzing Control Flow 100
5.4.2.1 Indirect Control Flow Graph 100
5.4.2.2 Tracing the Locality of the Interpreter 101
5.5 Discussion on Effectiveness and Impact of Profiling 105
CHAPTER 6 EVALUATIONS AND EXPERIMENTS 107
6.1 Experimental Setup 107
6.2 Direct Mapped Cache: Experimental Analysis 109
6.3 Fully Associative Cache: Experimental Analysis 129
6.4 Set Associative Cache: Experimental Analysis 139
6.5 Experiments on Partial Arrangement 148
6.5.1 Direct Mapped Cache Experiment 148
6.5.2 Fully Associative Cache Experiment 153
6.6 Virtual Machine Experiment 158
6.6.1 Evaluation Environment 158
6.6.2 Virtual Machine Modification Procedures 159
6.6.3 Experimental Result 163
CHAPTER 7 CONCLUSIONS 167
BIBLIOGRAPHY 171
Bibliography
[1]Raman E. and August D.I., “Chapter 5. Optimizations for memory hierarchies,” in the Handbook of Compiler Design Optimizations and Machine Code Generation, 2nd Ed. CRC Press, 2008.
[2]Gloy N., Blackwell T., Smith M.D., and Calder B., “Procedure placement using temporal ordering information,” in Proceedings of the 30th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO''97), pp. 303, IEEE, 1997.
[3]Bovet D. and Cesati M., Understanding the Linux Kernel, Third Edition, Chapter 18, O''Reilly, 2005.
[4]Micron Technology, Inc, “Design and Use Considerations for NAND Flash Memory,” Micron Technology, Inc, 2006.
[5]Hennessy J.L. and Patterson D.A., Computer Architecture: A quantitative Approach, 3rd Ed, Morgan Kaufmann Publishers, 2003.
[6]Silberschatz A., Galvin P.B., and Gagne G., Operating System Concepts, 7th Ed, Wiley, 2004.
[7]Burger D.C, Goodman J.R., and Sohi G.S., “Memory System”, in Computer Science Handbook, 2nd Ed., CRC Press, 2004.
[8]Hill M.D., "A case for direct-mapped caches," Computer, Vol.21, No.12, pp.25-40, IEEE, Dec 1988.
[9]Belady L.A., “A study of replacement algorithms for a virtual-storage computer,” IBM System Journal, Vol. 5, Number 2, pp. 78, IBM, 1966.
[10]Smith A. J., “Cache memories,” ACM Computing Survey. Vol. 14, 3, pp. 473-530, ACM, 1982.
[11]Coffman E.G. and Denning P.J., Operating System Theory, Prentice-Hall. 1973.
[12]Intel, Intel Architecture Optimization Manual, Intel Corporation, 1997.
[13]Goodman J.R., “Using cache memory to reduce processor-memory traffic,” in 25 Years of the International Symposia on Computer Architecture, ACM, 1998.
[14]Przybylski S.A., Cache and Memory Hierarchy Design: A Performance-Directed Approach. Morgan Kaufmann Publishers Inc., 1990.
[15]Al-Sayed H.S., "Cache memory application to microcomputers," Tech. Rep. 78-6, Dep. of Computer Science, Iowa State Umv, Ames, Iowa, 1978.
[16]Anacker W. and Wang C.P., “Performance evaluation of computing systems with memory hmrarchles,” IEEE Transactions on Computer. TC-16, 6 (Dec. 1967), pp. 764-773, IEEE, 1967.
[17]Gibson D.H., "Consideration in block oriented systems design," in Proceedings of 1967 Spring Joint Computer Conference, Vol. 30, pp. 75-80, Thompson Books 1967.
[18]Kaplan K.R. and Winder, R.O. “Cache-based computer systems,” IEEE Computer, Vol. 6, 3 (March 1973), pp. 30-36, IEEE, 1973.
[19]Mattson R.L., “Evaluation of multilevel memories,” IEEE Transactions on Magnetics, Vol. 7, 4 (Dec. 1971), pp. 814-819, IEEE, 1971.
[20]Meade R.M., "On memory system design," in Proceedings of Fall Joint Computer Conference, Vol. 37, pp. 33-43, AFIPS Press, 1970.
[21]Strecker W.D., “Cache memories for PDP-11 family computers,” in Proceedings of the 3rd annual symposium on Computer architecture, pp.155-158, ACM, 1976.
[22]Wolf M.E. and Lam M.S., “A data locality optimizing algorithm,” ACM SIGPLAN Notices, Vol. 26, 6 (June 1991), pp. 30 - 44, ACM, 1991.
[23]Mowry T.C., Lam M.S., and Gupta A., “Design and evaluation of a compiler algorithm for prefetching.” in Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-V), pp 62-73, ACM. 1992.
[24]Park J.S., Penner M., and Prasanna V.K., "Optimizing graph algorithms for improved cache performance," IEEE Transactions on Parallel and Distributed Systems, Vol. 15, 9, pp. 769-782, IEEE, 2004.
[25]Caceres R., Douglis F., Li K., and Marsh B., “Operating system implications of solid-state mobile computers,” in Proceedings of the Fourth Workshop on Workstation Operating Systems, pp. 21-27, IEEE, 1993.
[26]Verneer D., “eXecute-In-Place,” in Memory Card Magazine, March/April, Lippincott & Peto Inc., 1991.
[27]Santarini M., “NAND versus NOR-Which flash is best for bootin’ your next system?” EDN October 2005, pp. 41-48. Reed Business Information, 2005.
[28]Park C., Seo J., Bae S., Kim H., Kim S., and Kim B., “A low-cost memory architecture with NAND XIP for mobile embedded systems,” in ISSS+CODES 2003: First IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, pp. 138-143, ACM, 2003.
[29]Garey M.R. and Johnson D.S., Computer and Intractability - A Guide to the Theory of NP-Completeness. Bell Telephone Laboratories, 1979.
[30]Wang M., Lim S.K., Cong J., and Sarrafzadeh M., “Multi-way partitioning using bi-partition heuristics,” in Proceedings of the 2000 IEEE/ACM Asia South Pacific Design Automation Conference, pp. 667–672, ACM, 2000.
[31]Kernighan B.W. and Lin S., “An effective heuristic procedure for partitioning graphs”, Bell System Technology Journal, Vol. 49, pp. 291-307, 1970.
[32]Hendrickson B. and Leland R., “A multilevel algorithm for partitioning graphs,” in Proceedings of the 1995 ACM/IEEE Conference on Supercomputing, ACM, 1995.
[33]Pardalos P.M. and Hearn D., Aspects of Semidefinite Programming, Kluwer Academic Publishers, 2004.
[34]Papadimitriou C.H. and Yannakakis M., “Optimization, approximation and complexity classes,” Journal of Computer and System Science, Vol. 43, pp. 425-440, Elsevier, 1991.
[35]Karp R.M., “Reducibility among combinatorial problems,” in Complexity of Computer Computations, pp. 85–104, Plenum Press, 1972.
[36]Aho A.V., Lam M.S., Sethi R., and Ullman J.D., “Chapter 8. Code Generation,” in Compilers: Principles, Techniques, and Tools 2nd Ed., pp. 505-582, Addison-Wesley Longman Publishing Co., Inc., 2006.
[37]Goemans M.X. and Williamson D.P., “.879-approximation algorithms for MAX CUT and MAX 2SAT,” in Proceedings of the Twenty-Sixth Annual ACM Symposium on theory of Computing, pp. 422-431, ACM, 1994.
[38]Goemans M.X. and Williamson D.P., “Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming,” Journal of the ACM, Vol. 42, 6 (Nov. 1995), pp. 1115-1145, ACM, 1995.
[39]Frieze A.M. and Jerrum M., “Improved approximation algorithms for MAX k-CUT and MAX BISECTION”, in the Proceedings of 4th International IPCO Conference on Integer Programming and Combinatorial Optimization, Springer, 1995.
[40]Motwani R. and Raghavan P, Randomized algorithms. Cambridge University Press, 1995.
[41]Klerk E.D., Pasechnik D.V., and Warners J.P., “On approximate graph colouring and MAX-k-CUT algorithms based on the θ-function,” Journal of Combinatorial Optimization, Vol. 8, 3, pp. 267-294, Springer, 2004.
[42]Kann V., Khanna S., Lagergren J., and Panconesi A., “On the hardness of approximating MAX k-CUT and its dual,” Chicago Journal of Theoretical Computer Science, 1997.
[43]Kann V., Khanna S., Lagergren J., and Panconesi A., “On the hardness of approximating MAX k-CUT and its dual,” in Proceedings of Fourth Israel Symposium on Theory of Computing and Systems (ISTCS), pp. 61-67, IEEE Computer Society, 1996.
[44]Coja-Oghlan A., Moore C., and Sanwalani V., “MAX k-CUT and approximating the chromatic number of random graphs,” Random Structures and Algorithms, Vol. 28, 3, pp. 289-322, Wiley, 2005.
[45]Ghaddar B., Anjos M., and Liers F., “A branch-and-cut algorithm based on semidefinite programming for the minimum k-partition problem,” Optimization Online, 2007.
[46]Laurent M. and Rendl F., “Semidefinite programming and integer programming,” Report PNA-R0210, CWI, Amsterdam, April 2002.
[47]Kahruman S., Kolotoglu E., Butenko S., and Hicks I.V., “On greedy construction heuristics for the MAX-CUT problem,” International Journal on Computational Science and Engineering, Vol. 3, 3, pp. 211-218, Inderscience, 2007.
[48]Cho J.D., Raje S., and Sarrafzadeh M., "Fast approximation algorithms on MAXCUT, k-coloring, and k-color ordering for VLSI applications," IEEE Transactions on Computers, Vol.47, 11, pp.1253-1266, IEEE, 1998.
[49]Hwu W.W. and Chang P.P., “Achieving high instruction cache performance with an optimizing compiler,” in Proceedings of the 16th Annual International Symposium on Computer Architecture, pp. 242-251, IEEE Computer Society Press, 1989.
[50]Chang P. P. and Hwu W.W., “Trace selection for compiling large C application programs to microcode,” in the Proceedings of the 21st annual workshop on Microprogramming and microarchitecture, pp. 21-29, ACM, 1988.
[51]McFarling S., “Program optimization for instruction caches,” in Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 183–191, ACM, 1989.
[52]Pettis K. and Hansen R.C., “Profile-guided code positioning,” in Proceedings of the ACM SIGPLAN Conference on Programming Languages Design and Implementation, pp. 16–27, ACM, 1990.
[53]Gloy N. and Smith M.D., “Procedure placement using temporal ordering information,” ACM Transactions on Programming Languages and Systems (TOPLAS), Vol 21, 5, pp. 977–1027, ACM, 1999.
[54]Calder B., Krintz C., John S., and Austin T., “Cache-conscious data placement,” in Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 139-149, ACM, 1998.
[55]Sherwood T., Calder B., and Emer J., “Reducing cache misses using hardware and software page placement,” in Proceedings of the 13th International Conference on Supercomputing, pp. 155-164, ACM, 1999.
[56]Guillon C., Rastello F., Bidault T., and Bouchez F, “Procedure placement using temporal-ordering information: dealing with code size expansion,” in Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, pp. 268-279, ACM, 2004.
[57]Hashemi A.H., Kaeli D.R., and Calder B., “Efficient procedure mapping using cache line coloring,” ACM SIGPLAN Notices Vol. 32, 5, pp. 171-182, ACM, 1997.
[58]Kalamatianos J. and Kaeli D., “Temporal-based procedure reordering for improved instruction cache performance,” in the Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, pp. 244, ACM, 1998.
[59]Janapsatya A., Parameswaran S., and Henkel J., “REMcode: relocating embedded code for improving system efficiency,” In IEE Proceedings of Computers and Digital Techniques, Vol. 151, 6, pp. 457-465, IEE, 2004.
[60]Tomiyama H. and Yasuura H., "Optimal code placement of embedded software for instruction caches," in Proceedings of European Design and Test Conference 1996. pp.96-101, IEEE, 1996.
[61]Tomiyama H. and Yasuura H., “Code placement techniques for cache miss rate reduction,” ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 2, 4, pp. 410-429, ACM, 1997.
[62]Um J. and Kim T., “Code placement with selective cache activity minimization for embedded real-time software design,” in Proceedings of the International Conference on Computer Aided Design 2003 (ICCAD’03), pp. 197-200, ACM, 2003.
[63]Chilimbi T.M., Hill M.D., and Larus J.R., “Cache-conscious structure layout,” in Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation (PLDI), pp. 1 - 12, ACM, 1999.
[64]Panda P.R., Semeria L., and De Micheli G., “Cache-efficient memory layout of aggregate data structures,” in Proceedings of the 14th International symposium on Systems synthesis, pp. 101-106, ACM, 2001.
[65]Rabbah R.M. and Palem K.V., “Data remapping for design space optimization of embedded memory systems,” ACM Transactions on Embedded Computing Systems, Vol. 2, 2, pp. 186-218, ACM, 2003.
[66]Palem K.V., Rabbah R.M., Mooney V.J., Korkmaz P., and Puttaswamy K., “Design space optimization of embedded memory systems via data remapping,” in Proceedings of the Joint Conference on Languages, Compilers and Tools For Embedded Systems: Software and Compilers For Embedded Systems, LCTES/SCOPES ''02, pp. 28-37, ACM, 2002.
[67]Chilimbi T.M., Davidson B., and Larus J.R., “Cache-conscious structure definition,” in Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation (PLDI ''99), pp. 13-24, ACM, 1999.
[68]Petrank E. and Rawitz D., “The hardness of cache conscious data placement,” in Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp. 101 – 112, ACM, 2002.
[69]Petrank E. and Rawitz D., “The hardness of cache conscious data placement,” Nordic Journal of Computing, Vol. 12, 3, pp. 275 – 307, Publishing Association Nordic Journal of Computing, 2005.
[70]Panda P.R., Dutt, N.D., and Nicolau A., “Memory data organization for improved cache performance in embedded processor applications,” ACM Transactions on Design Automation of Electronic Systems, Vol. 2, 4, pp. 384–409, ACM 1997.
[71]Panda P.R., Catthoor F., Dutt N.D., Danckaert K., Brockmeyer E., Kulkarni C., Vandercappelle A., and Kjeldsberg P.G., “Data and memory optimization techniques for embedded systems,” ACM Transactions on Design Automation of Electronic Systems Vol. 6, 2, pp. 149-206, ACM, 2001.
[72]Parameswaran S. and Henkel J., "Instruction code mapping for performance increase and energy reduction in embedded computer systems," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol.13, 4, pp. 498-502, IEEE, 2005.
[73]Choi Y. and Kim T., “Memory layout techniques for variables utilizing efficient DRAM access modes in embedded system design,” in Proceedings of the 40th Conference on Design Automation, pp 881-886, ACM, 2003.
[74]Choi Y. and Kim T., "Memory access driven storage assignment for variables in embedded system design," in Proceedings of the Asia and South Pacific Design Automation Conference 2004, pp. 478-481, IEEE, 2004.
[75]Hettiaratchi S. and Cheung P.Y.K., "Mesh partitioning approach to energy efficient data layout," in Proceedings of the Design, Automation and Test in Europe Conference and Exhibition 2003, pp. 1076-1081, IEEE Computer Society, 2003.
[76]Kulkarni C., Ghez C., Miranda M., Catthoor F., and De Man H., "Cache conscious data layout organization for embedded multimedia applications," in Proceedings of Design, Automation and Test in Europe 2001, pp. 686-691, IEEE, 2001.
[77]Samsung Electronics, OneNAND Features & Performance, Samsung Electronics, November 4, 2005.
[78]Park C., Lim J., Kwon K., Lee J., and Sang L.M., “Compiler assisted demand paging for embedded systems with flash memory,” in Proceedings of the 4th ACM International Conference on Embedded software (EMSOFT’04), pp. 114-124 , ACM, 2004.
[79]Denning P.J., “The working set model for program behavior,” Communications of the ACM, Vol. 11, 5, pp. 323-333, ACM, 1968.
[80]Denning P.J., “The locality principle,” Communications of the ACM, Vol. 48, 7, pp. 19-24, ACM, 2005.
[81]Denning P.J., “Working sets past and present,” IEEE Transactions on Software Engineering, Vol. 6, 1, pp. 64–84, IEEE, 1980.
[82]Rubin S., Bodik R., and Chilimbi T.M., “An efficient profile-analysis framework for data-layout optimizations,” in Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of programming language, pp. 140 - 153, ACM, 2002.
[83]Nevill-Manning C. and Witten I., “Identifying hierarchical structure in sequences: A linear-time algorithm,” Journal of Artificial Intelligence Research, Vol. 7, pp. 67-82, 1997.
[84]Chilimbi T.M., “Efficient representations and abstractions for quantifying and exploiting data reference locality,” in Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, pp. 191–202, ACM, 2001.
[85]Chilimbi T.M. and Shaham R., “Cache-conscious coallocation of hot data streams,” ACM SIGPLAN Notices, Vol. 41, 6, pp. 252-262, ACM, 2006.
[86]Ryder K., “Optimizing program placement in virtual systems,” IBM Systems Journal, Vol. 13, 4, pp. 292, IBM, 1974.
[87]Hatfield D.J. and Gerald J., “Program restructuring for virtual memory,” IBM Systems Journal, Vol. 10, 3, pp. 168, IBM, 1971.
[88]Xu R. and Li Z., “Using cache mapping to improve memory performance handheld devices,” in Proceedings of 2004 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS''04), pp. 106-114, IEEE, 2004.
[89]Stamos J.W., “Static grouping of small objects to enhance performance of a paged virtual memory,” in ACM Transactions on Computer Systems (TOCS), Vol. 2, 2, pp. 155–180, ACM, 1984.
[90]Hirzel M., “Data layouts for object-oriented programs,” in Proceedings of SIGMETRICS ''07 Conference, pp. 265 – 276, ACM, 2007.
[91]Zhao Q., Rabbah R., and Wong W., “Dynamic memory optimization using pool allocation and prefetching”, ACM SIGARCH Computer Architecture News, Vol. 33, 5, pp. 27-32, ACM, 2005.
[92]Chilimbi T.M. and Larus J.R., "Using generational garbage collection to implement cache-conscious data placement," in Proceedings of the International Symposium on Memory Management (ISMM-98) of ACM SIGPLAN Notices, Vol. 34, 3, pp. 37–48, ACM, 1998.
[93]LaMarca A. and Ladner R., “The influence of caches on the performance of heaps,” Journal of Experimental Algorithmic, Vol. 1, ACM, 1996.
[94]Seidl M.L. and Zorn B.G., “Segregating heap objects by reference behavior and lifetime,” in Proceedings of the Eighth International Conference on Architectural Support For Programming Languages and Operating Systems, pp. 12-23, ACM, 1998.
[95]Truong D.N., Bodin F., and Seznec A., "Improving cache behavior of dynamically allocated data structures," in Proceedings of PACT’98, Conference on Parallel Architectures and Compilation Techniques, pp. 322–329, ACM, 1998.
[96]Allen R., and Kennedy K., Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 1st Ed., Morgan Kaufmann, 2001.
[97]Smith J¬.E. and Goodman J.R., “A study of instruction cache organizations and replacement policies,” in Proceedings of the 10th Annual International Symposium on Computer Architecture, pp. 132-137, IEEE Computer Society, 1983.
[98]Steinke S., Grunwald N., Wehmeyer L., Banakar R., Balakrishnan M., and Marwedel P., “Reducing energy consumption by dynamic copying of instructions onto onchip memory,” in Proceedings of the 15th International Symposium on System Synthesis, pp. 213-218, ACM, 2002.
[99]Blazewicz J., Kubiak W., Morzy T., and Rusinkiewicz M., Handbook on Data Management in Information Systems, Springer, 2003.
[100]METIS, Karypis Lab, Department of Computer Science & Engineering, University of Minnesota, http://glaros.dtc.umn.edu/gkhome/software.
[101]Govindarajan R., “Chapter 19. Instruction Scheduling,” in the Handbook of Compiler Design Optimizations and Machine Code Generation, 2nd Ed. CRC Press, 2008.
[102]Ball T. and Larus J. R., “Optimally profiling and tracing programs,” ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 16,4, pp. 1319-1360, ACM, 1994.
[103]Sun Microsystems, J2ME Building Blocks for Mobile Devices, Sun Microsystems, May 19, 2000.
[104]Lafond S. and Lilius J., “An energy consumption model for java virtual machine,” Turku Centre for Computer Science TUCS Technical Report No 597, TUCS, March 2004.
[105]CaffeineMark 3.0, Pendragon Software Corp, http://www.benchmarkhq.ru/cm30.
[106]Fuber S., ARM System-on-Chip Architecture, 2nd Ed., Addison-Wesley, 2000.
[107]Huang X., Lewis B.T., and McKinley K.S, “Dynamic code management: improving whole program code locality in managed runtimes,” in Proceedings of the 2nd International Conference on Virtual execution environments, pp. 133-143, ACM, 2006.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top