Author (English): Ming-Shuan Li
Thesis title (English): Design and Realization Low Complexity High Quality H.264 Video Encoder on Programmable Processors
Advisor (English): Jiun-In Guo
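H.264 video compression offers low bit rates and high picture quality, so it plays an important role in recent mobile multimedia products. However, its high computational complexity means it also requires correspondingly more computation. A growing number of RISC cores now offer lower power consumption and new multimedia instruction extensions, so RISC cores are widely used in mobile multimedia applications. Designing a standard-compliant H.264 video encoder that serves embedded applications on a RISC processor with real-time performance therefore has high market value. This thesis describes the process of porting an H.264 video encoder to the UniCore platform and proposes algorithm-level methods for improving the encoder's performance. It further proposes platform-dependent methods for improving performance, based on the features of the UniCore platform. Finally, with these methods applied, we improve the performance of the H.264 video encoder by a factor of 21 compared with the H.264 JM software.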
In mobile multimedia products, H.264 video compression plays an important role because of its low bit rate and high quality. However, an H.264 video encoder consumes more power because of its high computational complexity. RISC cores are increasingly used in mobile multimedia applications because of their low power consumption and multimedia extensions. Designing a fully standard-compliant H.264 video encoder with real-time performance on a RISC processor for embedded applications entails optimizing to the maximum extent possible. This thesis describes the process of porting an H.264 video encoder to the UniCore platform. We propose algorithm-level optimization methods to improve the performance of the H.264 video encoder. Further, this thesis proposes platform-dependent optimization methods that exploit the features of the UniCore platform. Finally, by adopting these methods, we achieve a 21-fold performance improvement over the H.264 JM reference software for the H.264 video encoder on the UniCore platform operating at 200 MHz.
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Thesis Organization
Chapter 2 Related Works
2.1 Previous Works
2.2 UniCore Platform
2.2.1 Hardware Architecture
2.2.2 Memory Architecture
2.2.3 Software Development Environment
2.3 JM Reference Code
Chapter 3 Reference Code Porting on UniCore Platform
3.1 Reference Code Modification
3.1.1 I/O Rewrite Method
3.1.2 Memory Management
3.1.3 New Additional Functions
3.2 Verification Method
3.3 Performance
3.3.1 Test Environment
3.3.2 Encoder Performance
Chapter 4 Software Algorithm-Based Optimization Methods
4.1 Inter Prediction Optimization Method
4.1.1 Two Stage Fast Search (TSFS)
4.1.2 Block Size Trend Prediction (BSTP)
4.1.3 Quadrant Prediction Fast Algorithm (QPFA)
4.1.4 Performance Analysis in Motion Estimation Search Algorithm
4.2 Intra Prediction Optimization Method
4.2.1 A Condition-based Intra Prediction (ACIP)
4.2.2 Predictive Mode Searching Policy (PMSP)
PMSP-based Mode-referred Search
PMSP-based Early-terminated
4.2.3 Performance Analysis in Intra Prediction Algorithm
4.3 Transform Coding and Interpolation Optimization Method
4.3.1 The Optimization of Interpolation
Improve Interpolation by Dividing Loop Range
4.3.2 Improved Early Detection Algorithm for All-Zero Block
4.4 Coding Style Promotion Method
4.4.1 Performance Analysis in Transform Coding and Interpolation
4.5 Performance
4.5.1 Test Environment
4.5.2 Encoder Performance
Chapter 5 Processor Resource-Based Optimization Methods
5.1 CFU Firmware
5.1.1 SAD Firmware
5.1.2 DCT/IDCT Firmware
5.1.3 Firmware Performance
5.2 Parallel Mechanism
5.2.1 Firmware Interface
5.2.2 Module Scheduling
5.2.3 Module Scheduling Performance
Test Environment
Performance
5.2.4 Performance
Test Environment
Performance
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Reference
