National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

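Author: 盧志德 (Chih-Te Lu)
Title: 多視角視訊編碼器中快速搜尋之NVIDIA CUDA平行實現
Title (English): Multiview Encoder Parallelized Fast Search Realization on NVIDIA CUDA
Advisors: 楊士萱, 杭學鳴
Oral defense committee: 簡韶逸, 梁文耀
Oral defense date: 2010-07-09
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: 資訊工程系研究所 (Graduate Institute of Computer Science and Information Engineering)
Discipline: Engineering
Field: Electrical and computer engineering
Thesis type: Academic thesis
Year of publication: 2010
Language: English
Keywords (translated from Chinese): multiview video, H.264/AVC, motion vector estimation, disparity vector estimation, parallel, CUDA, GPU, Multi-core, fast search algorithm
Keywords (English): multiview video coding (MVC), H.264/AVC, motion estimation, disparity estimation, parallel, CUDA, GPU, Multi-core, fast search algorithm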


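Abstract (translated from Chinese)

With the rapid development of graphics chips, their use for non-graphics computation has gradually matured. Using the GPU to assist the CPU with general computation is known as general-purpose computing on graphics processing units (GPGPU). In 2007, NVIDIA introduced a new GPGPU architecture, Compute Unified Device Architecture (CUDA). With CUDA, NVIDIA's hardware-multithreaded GPUs can be programmed to process large amounts of data in parallel. Our system uses an NVIDIA GTX-280, which has 240 computing cores, as the experimental platform for implementing parallel algorithms.
Multiview video coding (MVC) is an extension of H.264/AVC that is currently being standardized. The most time-consuming parts of its encoder are motion estimation (ME) and disparity estimation (DE). We propose a parallelizable fast algorithm, multithreaded one-dimensional search (MODS), which can be used for both ME and DE. We implement MODS for the encoder's integer-pixel ME and DE on the NVIDIA GTX-280 platform, which runs about 89 times faster than the CPU version. Compared with the fast algorithm in the reference software, the CUDA-accelerated MODS is also up to 21 times faster on video coded with both ME and DE.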

Abstract

As the processing capability of graphics processing units (GPUs) has grown rapidly, using them for non-graphics computation has become increasingly popular. In 2007, NVIDIA announced a powerful GPU architecture called Compute Unified Device Architecture (CUDA), which provides massive data parallelism under the SIMD architecture constraint. We use an NVIDIA GTX-280 GPU system, which has 240 computing cores, as the platform for implementing a very complicated video coding scheme.
The Multiview Video Coding (MVC) scheme, an extension of H.264/MPEG-4 Part 10 (AVC), is being developed jointly by the ITU-T Video Coding Experts Group and the ISO/IEC JTC 1 Moving Picture Experts Group (MPEG). It is an efficient video compression scheme, but its computational complexity is very high. Two of its most time-consuming components are motion estimation (ME) and disparity estimation (DE). In this thesis, we propose a fast search algorithm called multithreaded one-dimensional search (MODS), which can be used for both ME and DE. We implement the integer-pel ME and DE processes with MODS on the GTX-280 platform. The implementation runs up to 89 times faster than the CPU-only configuration. Even when the fast search algorithm of the original JMVC is turned on, the MODS version on CUDA is still up to 21 times faster.
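
For readers unfamiliar with the ME and DE steps named in the abstract, the following is a minimal sketch of integer-pel full-search block matching with a sum-of-absolute-differences (SAD) cost. It is not the thesis's MODS algorithm, whose details are not given in this record. The function names, the block size, the search range, and the toy pictures are all assumptions made for illustration. The same matching loop could serve ME or DE depending on whether `ref` is an earlier picture of the same view or a picture from a neighbouring view.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """SAD between the n x n block of cur at (bx, by) and of ref at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Try every integer displacement within +/- search_range.

    Returns ((dx, dy), cost) for the lowest-SAD candidate (hypothetical helper).
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates whose block would fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost


# Toy data (assumed): the current picture is the reference shifted right by 2 pixels.
ref = [[(7 * x + 13 * y) % 256 for x in range(16)] for y in range(16)]
cur = [[ref[y][max(x - 2, 0)] for x in range(16)] for y in range(16)]
vec, cost = full_search(cur, ref, bx=8, by=4, n=4, search_range=3)
print(vec, cost)  # (-2, 0) 0
```

Each candidate displacement is evaluated independently of the others, which is one reason block matching of this kind is commonly mapped onto many GPU threads.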

Table of Contents

Abstract (Chinese)
Abstract
Acknowledgement
Table of Contents
List of Tables
List of Figures
Chapter 1 INTRODUCTION
1.1 Introduction
1.2 Motivation
1.3 Overview of the Thesis
Chapter 2 MULTIVIEW VIDEO CODING STANDARD
2.1 Overview of MVC Encoder
2.2 Motion and Disparity Compensation
2.3 Prediction Structures
2.3.1 Hierarchical B Pictures
2.3.2 Prediction Structures for Multi-view Video Coding
2.4 Reference Software
Chapter 3 COMPUTE UNIFIED DEVICE ARCHITECTURE (CUDA)
3.1 General-Purpose Computation on GPUs
3.1.1 GPU versus CPU
3.1.2 Overview of CUDA
3.2 Hardware Architecture of GT200
3.3 SIMT Programming Model
3.4 Mechanism of Scheduler
3.5 Performance Tuning
Chapter 4 MVC ENCODER ACCELERATION BY CUDA
4.1 Full-Search Motion Estimation
4.2 Multithreaded One-Dimensional Search
4.3 MODS Parallelized Implementation on NVIDIA CUDA
4.4 Maximize Parallel Execution
4.5 Data Dependency Problem
4.6 Encoder Implementation
4.7 Simulation Result
Chapter 5 CONCLUSIONS
5.1 Conclusions
5.2 Future Work
REFERENCES

