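Graduate student: Chun-Yi Lin (林春益)
Thesis title: Task Scheduler and Resource Scheduler on Multi-Core GPUs for Mobile Devices
Thesis title (Chinese): 適用於可攜式裝置之多核心三維繪圖系統之工作排程器與資源排程器
Advisor: Shao-Yi Chien (簡韶逸)
Oral defense committee: 陳炳宇, 楊佳玲, 范倫達
Oral defense date: 2010-10-18
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Electronics Engineering (電子工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year: 99 (ROC calendar)
Language: Chinese
Number of pages: 96
Keywords (Chinese): multi-core 3D graphics system (多核心三維繪圖系統), task scheduler (工作排程器), resource scheduler (資源排程器)
Keywords (English): Multi-Core GPUs, Mobile Devices, Task Scheduler, Resource Scheduler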
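Abstract (translated from the Chinese):

In recent years, more and more multimedia applications have appeared in consumer electronics. Fast-growing handheld devices such as smartphones support, beyond basic voice calls, many multimedia applications such as cameras, video recording, internet access, GPS navigation, and MP3 playback, and they have greatly affected people's lives. It is also increasingly common for handheld devices to include a 3D graphics processing unit (GPU). Because the screen is the most important interface between user and device, users generally expect high image quality, which means the computations the GPU must handle are becoming more complex. Besides high image quality, a high-resolution screen is also indispensable. Both requirements directly imply that the number of processing cores in the GPU must grow as well. For handheld devices with very limited resources, this is a major challenge. The difficulty is how to reach high utilization across multiple cores under fixed resources, and how to optimize the use of the data each core needs for its computation.

In this thesis, we propose a multi-core mobile GPU. A dynamically allocating task scheduler assigns pending work to the processing cores to achieve load balance among them. A resource scheduler moves the data each core needs ahead of time, so that data are moved and used more efficiently, which in turn improves the pipeline efficiency of each shader processor.

Combining these techniques, we extend our original mobile GPU, which had only two processing cores, to eight processing cores, add the proposed task scheduler and resource scheduler, and implement the design as a system-on-chip platform. The chip is fabricated in TSMC 65 nm technology, with an area of 4.0 × 4.0 mm², an operating frequency of 200 MHz, and a maximum power consumption of 212.84 mW.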


In recent years, multimedia applications in consumer electronics have proliferated. Smartphones in particular are no longer just phones: they embed many multimedia applications, such as games, cameras, video, internet access, GPS, and MP3 playback. A 3D graphics processing unit (GPU) is usually embedded in a mobile device to enhance its multimedia processing capability. Since the screen is the most important interface between users and mobile devices, high-resolution, high-quality screens are indispensable to users. However, higher screen resolution and higher display quality both require more GPU processing power, which means the number of processing cores must increase. This is a challenge because mobile devices face many resource limitations. How to maximize utilization of the multi-core processor and how to use processing resources efficiently are the central design challenges.
In this thesis, we propose a multi-core GPU for mobile devices. We propose a dynamic task scheduler that assigns subtasks to the processors at run time so that the load across the whole system is better balanced. We also propose a resource scheduler that prefetches the resources each processor requires, making the whole processing pipeline more efficient.
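To illustrate why run-time assignment can balance load better than a fixed assignment, here is a minimal Python sketch. It is not the thesis's scheduler: the function names, the per-task cost list, and the "give the next task to the core that becomes free first" policy are all assumptions chosen for illustration, and the abstract does not describe the actual hardware policy.

```python
import heapq


def static_round_robin(task_costs, num_cores):
    """Assign task i to core i % num_cores; return each core's total busy time."""
    loads = [0] * num_cores
    for i, cost in enumerate(task_costs):
        loads[i % num_cores] += cost
    return loads


def dynamic_least_loaded(task_costs, num_cores):
    """Hand each task, in arrival order, to the core that becomes free first."""
    # Heap entries are (time at which the core becomes free, core id).
    heap = [(0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    loads = [0] * num_cores
    for cost in task_costs:
        free_at, core = heapq.heappop(heap)
        loads[core] = free_at + cost
        heapq.heappush(heap, (loads[core], core))
    return loads


if __name__ == "__main__":
    # Hypothetical per-task cycle counts: one expensive task followed by three cheap ones.
    costs = [8, 1, 1, 1] * 3
    print("static :", static_round_robin(costs, 4))
    print("dynamic:", dynamic_least_loaded(costs, 4))
```

With this hypothetical workload, the static assignment sends every expensive task to the same core, while the dynamic assignment spreads them out. The finishing time of the busiest core is the quantity a load-balancing scheduler tries to reduce.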
Based on these techniques, we extend our original GPU, which had two shader cores, to an eight-core unified-shader GPU with the task scheduler and resource scheduler added. We also realize the design as an SoC platform. The chip is fabricated in TSMC 65 nm technology. The chip area is 4 mm × 4 mm, and the operating frequency is 200 MHz. The maximum power consumption is 210 mW. The maximum processing capability is 800 MVtx/s (million vertices per second) and 1600 MPxl/s (million pixels per second).
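The quoted peak rates can be converted to per-cycle figures with simple arithmetic. The short Python sketch below does so, assuming that the peak rates are aggregate figures for all eight cores and that every core runs at the 200 MHz chip clock. The abstract does not state either point explicitly, and the constant and function names are illustrative.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 200_000_000               # 200 MHz operating frequency
NUM_CORES = 8                        # eight-core unified-shader GPU
PEAK_VERTICES_PER_S = 800_000_000    # 800 MVtx/s
PEAK_PIXELS_PER_S = 1_600_000_000    # 1600 MPxl/s


def per_cycle(rate_per_s, clock_hz=CLOCK_HZ):
    """Items processed per clock cycle across the whole chip."""
    return rate_per_s / clock_hz


def per_core_per_cycle(rate_per_s, clock_hz=CLOCK_HZ, num_cores=NUM_CORES):
    """Items per clock cycle per core, ASSUMING the rate is shared evenly by all cores."""
    return per_cycle(rate_per_s, clock_hz) / num_cores


if __name__ == "__main__":
    print("vertices/cycle (chip):", per_cycle(PEAK_VERTICES_PER_S))
    print("pixels/cycle (chip):  ", per_cycle(PEAK_PIXELS_PER_S))
    print("vertices/cycle/core:  ", per_core_per_cycle(PEAK_VERTICES_PER_S))
    print("pixels/cycle/core:    ", per_core_per_cycle(PEAK_PIXELS_PER_S))
```

Under these assumptions the peak figures correspond to 4 vertices and 8 pixels per clock cycle chip-wide, that is, one vertex every two cycles and one pixel every cycle per core.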

Abstract
1 Introduction
1.1 The Graphic Pipeline
1.2 Multi-Core Graphics Processing Units
1.3 The Trend of Mobile Graphics Processing Units
1.4 Motivation
1.5 Thesis Organization
2 Previous Work
2.1 The latest work on desktop GPU
2.2 Previous work on mobile GPUs
3 Universal Scheduler
3.1 Overview of shader
3.1.1 Vertex Shader
3.1.2 Pixel Shader
3.2 Conventional instruction and data accessing
3.3 Proposed Resource Scheduler
3.3.1 Resource Scheduler Overview
3.3.2 Instruction Scheduler
3.3.3 Constant Scheduler
3.4 Proposed Task Scheduler
3.4.1 Shader Configuration
3.4.2 Task Selection
3.4.3 Task Scheduling and Assignment
3.5 Architecture
3.5.1 Proposed Scheduler Architecture Overview
3.5.2 Task Scheduler
3.5.3 Resource Scheduler
4 Experimental Results
4.1 Simulation Environment and Setup
4.2 Experiments about task scheduler
4.2.1 Scalability
4.2.2 Task scheduling methods
4.3 Experiments about resource scheduler
4.3.1 Simple Scenes with the Same Buffer Size
4.3.2 Simple Scenes with Larger Buffer Size on Cache Architecture
4.3.3 Different Buffer Size
4.3.4 Rendering Resolution
4.3.5 Multi-Core Configuration
4.3.6 Notation
5 The Implementation of Multi-Core Graphics Processing Units for Mobile Multimedia Applications
5.1 Architecture Overview
5.2 Design Flow
5.3 Synthesis Result
5.4 Chip Layout and Specification
5.5 Comparisons
6 Conclusion
Reference

