National Digital Library of Theses and Dissertations in Taiwan

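Author: 傅勝余 (Fu, Sheng-Yu)
Title: Improvement of SIMD Code Generation in a Dynamic Binary Translator
Title (Chinese): 在一個動態轉譯引擎中優化SIMD指令之生成
Advisor: 徐慰中 (Hsu, Wei-Chung)
Degree: Master's
Institution: National Chiao Tung University
Department: Institute of Computer Science and Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar; the 2013-2014 academic year)
Language: English
Number of pages: (not listed)
Keywords (Chinese): 模擬器 (emulator)
Keywords (English): QEMU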


Abstract

Modern processors are increasingly enhanced with SIMD instructions. For example, MMX, SSE, and AVX in the x86 architecture and NEON in the ARM architecture are all SIMD instruction sets. Using these instructions can significantly increase application performance, so application binaries are likely to contain a growing fraction of SIMD instructions. However, SIMD instruction translation has not attracted much attention in Dynamic Binary Translation (DBT). For example, in the popular QEMU system emulator, guest SIMD instructions are often emulated with a sequence of scalar instructions even when the host machine has SIMD instructions that could support such parallel computation, leaving large potential for performance improvement.
In this thesis, we propose two approaches to improve the performance of SIMD instruction translation in QEMU's DBT: one leverages the existing helper function implementation in QEMU, and the other uses a newly introduced vector IR (Intermediate Representation). Both approaches have been implemented in QEMU with an ARM frontend and an x86-64 backend. In our experiments, the vector IR version of QEMU is 1.01 to 5.55 times faster than the original QEMU on the SPEC2006 CFP benchmarks and 7.61 times faster than the original QEMU on the Linpack benchmark.
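
The gap described above can be illustrated with a toy model. The sketch below is not QEMU code and not the thesis's implementation: the instruction class `GuestVAdd`, the op names `add_f32` and `vadd_f32x4`, and the register names are all hypothetical, chosen only to show how a scalar lowering emits one operation per lane while a vector-aware lowering emits a single operation for the whole register.

```python
from dataclasses import dataclass

LANES = 4  # assumption: a 128-bit register viewed as four 32-bit float lanes


@dataclass(frozen=True)
class GuestVAdd:
    """Hypothetical guest SIMD instruction: dst = a + b, lane by lane."""
    dst: str
    a: str
    b: str


def lower_scalar(insn):
    """Scalar lowering: one toy 'add_f32' op per lane."""
    return [("add_f32", insn.dst, insn.a, insn.b, lane) for lane in range(LANES)]


def lower_vector(insn):
    """Vector lowering: a single toy 'vadd_f32x4' op covering every lane."""
    return [("vadd_f32x4", insn.dst, insn.a, insn.b)]


def execute(ops, regs):
    """Interpret the toy ops against a register file (name -> list of lanes)."""
    for op in ops:
        if op[0] == "add_f32":
            _, dst, a, b, lane = op
            regs[dst][lane] = regs[a][lane] + regs[b][lane]
        elif op[0] == "vadd_f32x4":
            _, dst, a, b = op
            regs[dst] = [x + y for x, y in zip(regs[a], regs[b])]
        else:
            raise ValueError(f"unknown op: {op[0]}")
    return regs


if __name__ == "__main__":
    insn = GuestVAdd("q0", "q1", "q2")
    print("scalar ops:", len(lower_scalar(insn)))
    print("vector ops:", len(lower_vector(insn)))
```

Both lowerings compute the same register contents; only the number of emitted operations differs. On a real host the single vector operation would correspond to one SIMD instruction, which is the source of the potential speedup, although the actual gain depends on the workload and on details this model ignores.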

Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Figures
Ⅰ. Introduction
Ⅱ. Background and Related Work
2.1 Binary Translator
2.1.1 Static Binary Translator
2.1.2 Dynamic Binary Translator
2.2 SIMD instructions
2.2.1 Intel’s SSE
2.2.2 ARM’s NEON
2.3 QEMU
Ⅲ. Design and Implementation
3.1 Observation and Objective
3.2 Design Issues
3.2.1 Approach 1: Modify the helper functions
3.2.2 Approach 2: Add vector IR to original TCG IR
3.2.3 Approach 1 vs. Approach 2
3.3 Original QEMU Internal Overview
3.4 Implementation Detail for Approach 1
3.4.1 Register a Helper Function
3.4.2 Implementing helper function body
3.4.3 Ask Tiny Code Generator to Generate Helper Function Call
3.5 Vector IR Version QEMU Internal Overview
3.6 Implementation Detail for Approach 2
3.6.1 Register a New TCG IR
3.6.2 The Frontend: From NEON instruction to TCG Vector IR
3.6.3 The Backend: From Vector IR to Host Instruction
3.6.4 Translation Examples
3.6.5 Prologue and Epilogue
Ⅳ. Experimental Result
4.1 Environment
4.2 NEON Instruction Influence
4.3 Performance
4.3.1 SPEC 2006
4.3.2 Single Precision SPEC 2006 CFP and Linpack
Ⅴ. Conclusion and Future Work
Reference

