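Graduate student: Shih-Way Huang (黃世緯)
Thesis title: Key Technology Design of Audio Codecs for MPEG-2/4 AAC and HE AAC
Advisor: Liang-Gee Chen (陳良基)
Degree: Ph.D.
Institution: National Taiwan University
Department: Graduate Institute of Electrical Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: English
Number of pages: 134
Keywords: audio codecs, key technology, MPEG, AAC, HE AAC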
Digital audio coding technology plays an important role in everyday entertainment and communication. This dissertation proposes key technology designs for the state-of-the-art audio coding standards MPEG-2/4 Advanced Audio Coding (AAC) and its extension, MPEG-4 High Efficiency Advanced Audio Coding (HE AAC). To enable low-complexity applications such as portable devices with audio playback and recording, it studies how to reduce the complexity of AAC encoders and HE AAC decoders. The dissertation is divided into two parts.

The first part presents a low-computation, low-memory psychoacoustic model (PAM) for MPEG AAC encoding. The PAM is the key technology in the MPEG AAC encoder and uses various complicated functions to model the human auditory system. The challenge is therefore to reduce computation and memory while maintaining sound quality. The main idea is to convert the complicated functions into optimized look-up tables and common functions, and to replace unnecessary computation. In addition, the detection and decision method is modified to improve sound quality. The complexity of the proposed PAM is reduced to 12.2% of the original (a reduction of 87.8%), which enables a real-time MPEG-2/4 AAC Low Complexity profile stereo encoder at 128 kb/s to run below 20 MOPS while maintaining CD quality.
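The idea of trading a complicated function for a look-up table can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the table size `TABLE_BITS`, the function name `log2_lut`, and the choice of the base-2 logarithm as the function being tabulated are all assumptions made for illustration only.

```python
import math

# Hypothetical table size (an assumption): 2**6 = 64 entries.
TABLE_BITS = 6
TABLE_SIZE = 1 << TABLE_BITS

# Pre-computed once: log2(1 + i/64) for i = 0..63.
LOG2_TABLE = [math.log2(1.0 + i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def log2_lut(x: float) -> float:
    """Approximate log2(x) for x > 0 with one table look-up and one addition."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # x = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(x)
    # Rescale so that 1 <= m < 2 and log2(x) = (exponent - 1) + log2(m).
    m = mantissa * 2.0
    # Truncate the fractional part of m to the table resolution.
    index = int((m - 1.0) * TABLE_SIZE)
    return (exponent - 1) + LOG2_TABLE[index]
```

With values kept in the logarithmic domain, a product such as `a * b` becomes the sum `log2_lut(a) + log2_lut(b)`. In this sketch the truncation error per look-up stays below `log2(65/64)`, about 0.022, and a larger table would shrink it further at the cost of memory.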

The second part derives a fast Quadrature Mirror Filterbank (QMF) for the Low Power Spectral Band Replication (SBR) tool of the MPEG HE AAC decoder. The QMF is the key technology of the HE AAC decoder. The main idea is to transform the computation-intensive matrix operations in the QMF into conventional fast Discrete Cosine Transforms (DCTs). As a result, the computational complexity can be reduced to as little as 2.7% of the original multiplications and 7.8% of the original additions.
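Why mapping a cosine matrix operation onto a fast DCT saves work can be shown with a small Python sketch. It does not reproduce the dissertation's QMF decomposition. It only compares a DCT-II computed as a direct matrix-vector product with the same DCT-II computed through a radix-2 FFT of the reordered input, a well-known fast DCT technique swapped in here for illustration. The function names are hypothetical.

```python
import cmath
import math


def dct2_direct(x):
    """DCT-II as a plain matrix-vector product (N*N multiplications)."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n))
        for k in range(n)
    ]


def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dct2_fast(x):
    """Same DCT-II via one N-point FFT (on the order of N*log2(N) operations)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    # Reorder: even-indexed samples first, then odd-indexed samples reversed.
    v = list(x[0::2]) + list(x[1::2])[::-1]
    spectrum = fft([complex(t) for t in v])
    # Rotate each FFT bin by exp(-i*pi*k/(2N)) and keep the real part.
    return [
        (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]
```

Both functions return the same coefficients up to rounding error, but the direct form needs `N*N` multiplications while the FFT-based form grows roughly as `N*log2(N)`.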

We are convinced that these audio coding standards will find many applications in the near future, and that this study can benefit them.
Contents


Abstract
1. Introduction
1.1. Application
1.1.1. Digital Audio Coding
1.1.2. MPEG AAC and HE AAC
1.2. Motivation
1.3. Problem Definition
1.4. Challenges
1.5. Contributions
1.5.1. Low Computation, Low Memory PAM for MPEG AAC Encoding
1.5.2. Fast Filterbank for HE AAC Decoding
1.6. Dissertation Organization
2. Background
2.1. Digital Audio Coding: Why, What, and How
2.1.1. Why
2.1.2. What
2.1.3. How (Principle of Digital Audio Coding)
2.2. History of MPEG Audio Coding Standards
2.2.1. MPEG-1
2.2.2. MPEG-2
2.2.3. MPEG-4 AAC
2.2.4. MPEG-4 HE AAC (Bandwidth Extension)
2.2.5. Comparison between the Audio Codecs
3. Low Computation, Low Memory PAM for the MPEG AAC Encoder
3.1. Introduction
3.2. Analyses of AAC Algorithms
3.3. PAM Algorithms
3.4. Challenges
3.4.1. Computation
3.4.2. Memory
3.4.3. Quality
3.5. Previous Works
3.5.1. MDCT-based PAM
3.5.2. 32-b logarithmic data format
3.6. Proposed Design
3.6.1. Method 1 - Pre-Computed Masking Spreading
3.6.2. Method 2 - Modified MDCT-based PAM
3.6.3. Method 3 - Reduced Table of Spreading Function
3.6.4. Method 4 - Logarithm-based PAM
3.7. Experiments and Results
3.7.1. Assessment of Sound Quality
3.7.2. Profiling the Reduction Rate (Method 1+2)
3.7.3. Word Length of the Reduced Spreading-Function Table (Method 3)
3.7.4. Quality Degradation by 16-bit Logarithmic Format (Method 4)
3.7.5. Encoding Time (Method 1-4)
3.7.6. Encoding Quality (Method 1-4)
3.8. Summary
4. Fast Decomposition of Filterbanks for the MPEG-4 HE AAC Decoder
4.1. Introduction
4.2. Algorithm Review
4.3. Profiling
4.4. Problem Definition
4.5. Previous Works on Fast Filterbanks
4.6. Review on Conventional DCT Types
4.7. Development Methods
4.7.1. AQMF
4.7.2. SQMF
4.7.3. Downsampled SQMF
4.8. Performance
4.9. Summary
5. Conclusion
5.1. Principal contributions
5.1.1. Low Computation, Low Memory PAM for MPEG AAC Encoding
5.1.2. Fast Filterbank for HE AAC Decoding
5.2. Future directions
5.2.1. Toward applications of lower bit rate audio coding
5.2.2. Toward applications of scalable audio coding
5.2.3. Toward applications of high-definition audio coding
Bibliography
Publication



List of Figures

Figure 1-1. The interests of this dissertation.
Figure 1-2. The complexity profiling of the MPEG AAC encoder. PAM is the key technology in the encoder.
Figure 1-3. The complexity profiling of the MPEG HE AAC decoder. The filterbanks in SBR are the key technology in the decoder.
Figure 2-1. Block diagram of the perceptual audio codec.
Figure 2-2. Parametric coding in combination with perceptual coding (core).
Figure 2-3. The compression ratio of the significant MPEG audio coding standards.
Figure 2-4. Results of the AAC stereo verification tests [20]. The horizontal axis shows the audio encoder (with profile, if any) and bitrate. The vertical axis shows the sound quality. A value of 0.0 means the tested signal is indistinguishable from the reference. The smaller the diffscore (difference), the better the quality.
Figure 2-5. AAC quality comparison [20][31]. The horizontal axis shows the bitrate. The vertical axis shows the sound quality. A value of 0.0 means the tested signal is indistinguishable from the reference. The smaller the value, the better the quality.
Figure 2-6. HE AAC verification tests [20]. The horizontal axis shows AAC at 48 kb/s, AAC at 60 kb/s, the 3.5 kHz low-pass-filtered Hidden Reference, the 7 kHz low-pass-filtered Hidden Reference, HE AAC with High Quality SBR at 32 kb/s, HE AAC with Low Power SBR at 32 kb/s, HE AAC with High Quality SBR at 48 kb/s, and HE AAC with Low Power SBR at 48 kb/s. The vertical axis shows the MUSHRA [30] scores. A score of 100 corresponds to the quality of the reference. The higher the score, the better the quality.
Figure 2-7. Sound quality comparison from the European Broadcasting Union testing at 48 kb/s stereo between MP3, AAC, HE AAC (alias aacPlus), and other encoders [35]. The vertical axis shows the MUSHRA [30] scores. A score of 100 corresponds to the quality of the reference. The higher the score, the better the quality.
Figure 3-1. Block diagram of an AAC encoder.
Figure 3-2. Block diagram of the PAM in [4].
Figure 3-3. Detailed block diagram of the PAM from the 13 steps in [4].
Figure 3-4. Pre-echoes resulting from processing blocks of 2048 samples [56]. The top figure shows the original signal, the middle shows the re-quantized signal with the pre-echo, and the bottom shows the difference signal between the original and the re-quantized signal.
Figure 3-5. Original PAM, including two sets of FFT and Threshold Generation (TG).
Figure 3-6. MDCT-based PAM, replacing FFT spectra with MDCT spectra.
Figure 3-7. Signal path in the AAC encoder.
Figure 3-8. Spreading Function.
Figure 3-9. Pseudo code of the spreading function.
Figure 3-10. Concept of the proposed modified MDCT-based PAM.
Figure 3-11. Block diagram of the proposed modified MDCT-based PAM.
Figure 3-12. Block diagram of MDCT 1 and Threshold Generation 1.
Figure 3-13. Waveform view of a series of castanets. The vertical axis is the amplitude, and the horizontal axis is time. Each surge (attack) is a castanet.
Figure 3-14. PE versus frame number for the sound of Figure 3-13. The vertical axis is the magnitude of PE, and the horizontal axis is the frame number along time. If PE is larger than the threshold, the frame is considered to contain an attack.
Figure 3-15. Waveform view of pop music. The vertical axis is the amplitude, and the horizontal axis is time.
Figure 3-16. PE versus frame number for the sound of Figure 3-15. The vertical axis is the magnitude of PE, and the horizontal axis is the frame number along time. Attacks (transients) are falsely detected because those PE values are larger than the threshold.
Figure 3-17. PE versus frame number for the sound of Figure 3-13. The vertical axis is the magnitude of PE, and the horizontal axis is the frame number along time. Each big surge of PE corresponds to a surge (attack) in Figure 3-13.
Figure 3-18. The distribution of zero values and non-zero values.
Figure 3-19. The proposed Method 3 by storage in two arrays.
Figure 3-20. Complex functions in PAM by Method 4 (Logarithm-based PAM). (a) Before Method 4. (b) After Method 4.
Figure 3-21. Energy and threshold stored in logarithmic format accordingly and naturally.
Figure 3-22. Signal path in the AAC encoder with proposed PAM.
Figure 3-23. The scales of Objective Difference Grade (ODG).
Figure 3-24. Stereo waveform snapshot of the sound preech01 (at time = 5.82 s). (a) Uncompressed sound. (b) Compressed by the original FFT-based PAM. (c) Compressed without block switching (LONG block only). (d) Compressed by the proposed PAM. Note that because it lacks block decision, (c) is worse than (b) and (d). Both the original (b) and the proposed (d) detect the attacks (transients) correctly.
Figure 3-25. (a), (b) are the left-channel waveform and spectral views encoded by the original FFT-based PAM, and (c), (d) are the left-channel waveform and spectral views encoded by the proposed PAM. They are almost the same.
Figure 4-1. Block diagram of the HE AAC decoder with the Low power SBR decoder.
Figure 4-2. Profiling of the HE AAC decoder with the Low power SBR.
Figure 4-3. Profiling of the HE AAC decoder with the Low power SBR using downsampled SQMF.
Figure 4-4. The decomposition of the matrix operation in AQMF.
Figure 4-5. The decomposition of the matrix operation in SQMF.
Figure 4-6. The decomposition of the matrix operation in downsampled SQMF.
Figure 4-7. Signal flow graph of the proposed decomposition in AQMF.
Figure 4-8. Signal flow graph of the proposed decomposition in AQMF.
Figure 4-9. Signal flow graph of the proposed decomposition in downsampled SQMF.



List of Tables

Table 1-1. Applications of MPEG AAC and HE AAC. (‘*’ stands for combination with Parametric Stereo (PS) [14]).
Table 1-2. Processing power requirement of MP3 and AAC codecs [15].
Table 1-3. Complexity ratio of HE AAC and plain AAC.
Table 2-1. Various classes of audio.
Table 2-2. Important factors for a given digital audio coder.
Table 3-1. Computational complexity analyses of the AAC LC stereo encoder.
Table 3-2. The functional descriptions of the 13 steps in PAM.
Table 3-3. Comparisons between FFT and MDCT.
Table 3-4. Computational complexity analyses of PAM.
Table 3-5. Comparison of different MDCT-based PAM.
Table 3-6. Reduction rate of computational complexity.
Table 3-7. The number of zero and non-zero values in the table of spreading function (sampling rate 44100 Hz).
Table 3-8. The number of values for storage (sampling rate 44100 Hz).
Table 3-9. The reduction of table’s size at different sampling rates.
Table 3-10. The original PAM vs. the proposed PAM (Method 4).
Table 3-11. Comparison of computational complexity and required look-up tables.
Table 3-12. Comparison of data memory storage and bandwidth in Threshold Generation.
Table 3-13. The reduction of the proposed PAM.
Table 3-14. Computational complexity of the AAC encoder after optimization (128 kb/s, LC Stereo).
Table 3-15. Comparison between the proposed PAM and the previous low power work [42].
Table 3-16. Tested audio files and their characteristics.
Table 3-17. Simulated reduction rates by the proposed PAM.
Table 3-18. The word length reduction vs. sound quality degradation (sampling rate 44100 Hz).
Table 3-19. Quality degradation by 16-bit logarithmic format of energy and masking threshold.
Table 3-20. Encoding time of the encoder with the proposed PAM.
Table 3-21. Encoding quality in ODG with or without block switching.
Table 3-22. Encoding quality in ODG with or without block switching.
Table 3-23. Audio excerpts from common audio CD.
Table 3-24. Comparison of encoding quality in ODG.
Table 3-25. Comparison of encoding quality in NMR.
Table 4-1. Reduction of computational complexity in the proposed QMF.
Table 4-2. Reduction of computational complexity in the proposed QMF with downsampled SQMF.
Bibliography
[1] MPEG. Coding of moving pictures and associated audio for digital storage media at up to 1.5 Mbit/s, part 3: Audio, International Standard IS 11172-3, ISO/IEC JTC1/SC29 WG11, 1992.
[2] MPEG. Information Technology – generic coding of moving pictures and associated audio, part 3: Audio, International Standard IS 13818-3, ISO/IEC JTC1/SC29 WG11, 1994.
[3] MPEG. MPEG-2 Advanced Audio Coding, AAC, International Standard IS 13818-7, ISO/IEC JTC1/SC29 WG11, 1997.
[4] MPEG. Information technology – Coding of audio-visual objects – Part 3: Audio, International Standard IS 14496-3, ISO/IEC JTC1/SC29 WG11, 1999.
[5] MPEG. Information technology – Coding of audio-visual objects – Part 3: Audio, Amendment 1: Bandwidth extension. ISO/IEC 14496-3:2001/Amd. 1:2003, Nov. 2003.
[6] ARIB. Available: http://www.arib.or.jp/english/
[7] ISMA. Available: http://www.isma.tv
[8] DVD Audio. Available: http://www.dvdforum.org/
[9] XM Radio. Available: http://www.xmradio.com/corporate_info/fast_facts_sound.html
[10] Digital Radio Mondiale (DRM). Available: http://www.drm.org/
[11] 3GPP forum. Available: http://www.3gpp.org/
[12] DVB forum. Available: http://www.dvb.org/
[13] Apple iTunes and iPod. Available: http://www.apple.com/
[14] MPEG. Information technology – Coding of audio-visual objects – Part 3: Audio, Amendment 2: Parametric coding for high-quality audio. ISO/IEC 14496-3:2001/Amd 2:2004.
[15] Fraunhofer IIS. Table 1: Minimum requirement specification for MPEG Layer-3 and MPEG-4 AAC codecs on 16 and 32 bit processors. [Online]. Available: http://www.iis.fraunhofer.de/amm/techinf/audio_dsp/cdk/platfoms.html
[16] ARM. [Online]. Available: http://www.arm.com/products/CPUs/ARM926EJ-S.html
[17] ISO/IEC JTC 1/SC 29. [2004, Mar.] N5897. [Online] http://www.itscj.ipsj.or.jp/sc29/open/29view/29n5897c.htm
[18] J. D. Johnston, “Transform coding of audio signals using perceptual noise criteria,” IEEE Journal on Selected Areas in Communications, vol. 6, no. 2, pp. 314-323, Feb. 1988.
[19] M. Bosi, K. Brandenburg, S. Quackenbush, L. Fielder, K. Akagiri, H. Fuchs, M. Dietz, J. Herre, G. Davidson, Y. Oikawa, “ISO/IEC MPEG-2 Advanced Audio Coding,” Journal of the Audio Engineering Society, vol. 45, no. 10, pp. 789-814, Oct. 1997.
[20] ISO/IEC JTC 1/SC 29/WG 11, “MPEG audio codecs (history & tools),” ISO/IEC JTC 1/SC 29/WG 11 N7154, April 2005.
[21] J. Herre, B. Grill, G. Zoia, “MPEG-4 audio: Basics and extensions,” slides in WEMP4 2002 tutorial.
[22] ISO/IEC JTC1/SC29/WG11. [2000, July] “Call for evidence justifying the testing of audio coding technology,” ISO/IEC JTC1/SC29/WG11 N3483. [Online]. Available: http://www.tnt.uni-hannover.de/project/mpeg/audio/public/w3483.pdf
[23] ISO/IEC JTC1/SC29/WG11. [2001, Jan] “Call for proposals for new tools for audio coding,” ISO/IEC JTC1/SC29/WG11 N3794. [Online]. Available: http://www.tnt.uni-hannover.de/project/mpeg/audio/public/w3794.pdf
[24] Coding Technologies. Available: http://www.codingtechnologies.com/
[25] M. Dietz, L. Liljeryd, K. Kjorling, O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” Convention Paper 5553, presented at the 112th Audio Engineering Society (AES) Convention, May 2002.
[26] P. Ekstrand, “Bandwidth extension of audio signals by spectral band replication,” Proc. 1st IEEE Benelux Workshop on Model based Processing and Coding of Audio (MPCA-2002), Leuven, Belgium, November 15, 2002, pp. 53-58.
[27] M. Wolters, K. Kjorling, D. Homm, H. Purnhagen, “A closer look into MPEG-4 High Efficiency AAC,” Convention Paper 5871, presented at the 115th Audio Engineering Society (AES) Convention, Sep. 2003.
[28] ITU-R Recommendation BS.1116-1, “Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems,” 1997.
[29] ITU-R Recommendation BS.562-3, “Subjective assessment of sound quality,” 1990.
[30] ITU-R, Multi stimulus test with hidden reference and anchor (MUSHRA) - EBU method for subjective listening tests of intermediate audio quality [10-11Q/62], 2000. Updated in ITU-R, Recommendation BS. 1534-1, “Method for the subjective assessment of intermediate quality levels of coding systems,” 2003.
[31] G. A. Soulodre et al., “Subjective evaluation of state-of-the-art two-channel audio codecs,” Journal of the Audio Engineering Society, vol. 46, no. 3, pp. 164-177, March 1998.
[32] T. Painter, A. Spanias, “Perceptual coding of digital audio,” Proceedings of the IEEE, vol. 88, no. 4, pp. 451-513, Apr. 2000.
[33] EBU subjective listening test at 48 kbps stereo. [Online]. Available: http://www.codingtechnologies.com/products/aacPlus.htm
[34] K. Brandenburg, “MP3 and AAC explained,” AES 17th International Conference on High Quality Audio Coding, Italy, Sep. 2-5, 1999.
[35] MPEG-4 Version 1 Reference Software (ISO/IEC 14496-5:2000). [Online]. Available: http://www.tnt.uni-hannover.de/project/mpeg/audio/ftp/
[36] Y. Takamizawa, T. Nomura, M. Ikekawa, “High-quality and processor-efficient implementation of an MPEG-2 AAC encoder,” in Proc. of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 985 –988.
[37] D. H. Kim, D. H. Kim, J. H. Chung, “Optimization of MPEG-4 GA AAC on general PC,” in Proc. of the 44th IEEE 2001 Midwest Symposium on Circuits and Systems, vol. 2, pp. 923-925.
[38] I. Dimković, D. Milovanović, Z. Bojković, “Fast software implementation of MPEG advanced audio encoder,” in Proc. of the 2002 14th International Conference on Digital Signal Processing, vol. 2, pp. 839-843.
[39] D. Huang, X. Gong, D. Zhou, T. Miki, S. Hotani, “Implementation of the MPEG-4 Advanced Audio Coding encoder on ADSP-21060 SHARC,” in Proc. of the 1999 IEEE International Symposium on Circuits and Systems, vol. 3, pp. 544 –547.
[40] Y. Takamizawa, T. Okumura, T. Nomura, M. Ikekawa, and I. Kuroda, “20mW MPEG-2/4 AAC LC stereo encoder on a 16-bit DSP,” presented at the Workshop and Exhibition on MPEG-4, San Jose, California, June 25-27 2002.
[41] C. Liu, W. Lee, C. Yang, K. Peng, T. Chiou, T. Chang, Y. Hsiao, H. Hue and C. Chien, “Design of MPEG-4 AAC encoder,” Convention paper 6201, presented at the Audio Engineering Society (AES) 117th Convention, Oct. 28-31, 2004.
[42] M. Gayer, M. Hartl, J. Hilpert, M. Lutzky, “Embedded audio codecs,” presented at the International Symposium on Consumer Electronics (ISCE) 2002.
[43] M. Gayer, M. Lohwasser, M. Lutzky, “Implementing MPEG Advanced Audio Coding and Layer-3 encoders on 32-bit and 16-bit fixed-point processors,” presented at the AES 115th Convention, New York, Oct. 10-13, 2003.
[44] T. Tsai, S. Huang, L. Chen, “Design of a low power psychoacoustic model co-processor for MPEG-2/4 AAC LC stereo encoder,” in Proc. of the 2003 IEEE International Symposium on Circuits and Systems, vol. 2, May 25-28, 2003, pp. 552 –555.
[45] C. Liu, C. Chen, W. Lee, S. Lee, “A fast bit allocation method for MPEG layer III,” in Proc. of the 1999 IEEE International Conference on Consumer Electronics, pp. 22-23.
[46] C. Liu, W. Lee, R. Hong, “A new criterion and associated bit allocation method for current audio coding standards,” in Proc. of the 5th Int. Conference on Digital Audio Effects (DAFX-02), Hamburg, Germany, Sep. 26-28, 2002, pp. 233-237.
[47] C. Liu, W. Lee, C. Chien, “Bit allocation for advanced audio coding using bandwidth-proportional noise-shaping criterion,” in Proc. of the 6th Int. Conference on Digital Audio Effects (DAFX-03), London, UK, Sep. 8-11, 2003.
[48] C. Yang, S. Chen, “New static and dynamic search algorithms for fast MP3 bit allocations,” in Proc. of the 2003 IEEE International Conference on Multimedia and Expo, vol. 1, pp. 77-80.
[49] H. Oh, J. Kim, C. Song, Y. Park, D. Youn, “Low power MPEG/audio encoders using simplified psychoacoustic model and fast bit allocation,” IEEE Transactions on Consumer Electronics, vol. 47, no. 3, pp. 613 –621, Aug. 2001.
[50] M. Kahrs, K. Brandenburg, Applications of digital signal processing to audio and acoustics. Kluwer Academic Publishers, 1998, p.59.
[51] P. Duhamel, Y. Mahieux, J. P. Petit, “A fast algorithm for the implementation of filter banks based on time domain aliasing cancellation,” in Proceedings of the 1991 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 2209-2212.
[52] M. Bosi, Richard E. Goldberg, Introduction to digital audio coding and standards. Kluwer Academic Publisher Press, 2003, pp. 295-296.
[53] J. D. Johnston, “Estimation of perceptual entropy using noise masking criteria,” in Proc. of the 1988 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-88), vol.5, pp. 2524 – 2527, April 11-14, 1988.
[54] “Perceptual audio coders: What to listen for,” Audio Engineering Society, Inc. 2002.
[55] S. Huang, T. Tsai, L. Chen, “Memory reduction technique of spreading function in MPEG AAC encoder,” in Proc. of the 7th Int. Conference on Digital Audio Effects (DAFX-04), Naples, Italy, October 5-8, 2004.
[56] J. D. Johnston, “Transform coding of audio signals using perceptual noise criteria,” IEEE Journal on Selected Areas in Communications, vol. 6, no. 2, pp. 314-323, Feb. 1988.
[57] SQAM - Sound Quality Assessment Material. Available: http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/
[58] ITU-R Recommendation BS. 1387-1: “Method for objective measurements of perceived audio quality,” 2001.
[59] EAQUAL. [Online]. Available: http://www.mp3-tech.org/programmer/sources/eaqual.tgz
[60] J. M. Rabaey and M. Pedram, Low power design methodologies. Kluwer Academic Publishers, 1996, pp. 12-15, p. 345.
[61] MPEG-2 Audio Technical Report Software (ISO/IEC IS 13818-5). [Online]. http://www.tnt.uni-hannover.de/project/mpeg/audio/ftp/
[62] LAME v3.93.1, “LAME ain’t an MP3 encoder.” Available: http://www.mp3dev.org, Dec. 1, 2002.
[63] O. Shimada, et al. “A low power SBR algorithm for the MPEG-4 audio standard and its DSP implementation,” Convention Paper 6048 in AES 116th Convention, May 2004.
[64] K. R. Rao and P. C. Yip, The transform and data compression handbook, Boca Raton, CRC Press LLC, 2001.
[65] C. W. Kok, “Fast algorithm for computing discrete cosine transform,” IEEE Trans. Signal Processing, vol. 45, no. 3, pp. 757-760, Mar. 1997.
[66] K. R. Rao, P. Yip, Discrete Cosine Transform – Algorithms, advantages, applications, Academic Press, Inc., 1990, pp. 61-62.
[67] V. Britanak, “On the discrete cosine transform computation,” Signal Processing 40, 1994, pp. 183-194.
[68] K. Konstantinides, “Fast subband filtering in MPEG audio coding,” IEEE Signal Processing Letters, vol. 1, no.2, pp. 26-28, Feb. 1994.
[69] V. Britanak and K. R. Rao, “An efficient implementation of the forward and inverse MDCT in MPEG audio coding,” IEEE Signal Processing Letters, vol. 8, no. 2, pp. 48-51, Feb. 2001.
[70] C. M. Liu and W. C. Lee, “A unified fast algorithm for cosine modulated filter banks in current audio coding standards,” Journal of the Audio Engineering Society, vol. 47, no. 12, pp. 1061-1075, Dec. 1999.
[71] Menno. (2003, July). “sbr_qmf.c” v 1.5 in FAAD2 2.0. [Online]. Available: http://cvs.sourceforge.net/viewcvs.py/faac/faad2/libfaad/
[72] H. Purnhagen, “Low complexity parametric stereo coding in MPEG-4,” Proc. of the 7th Int. Conference on Digital Audio Effects (DAFx ’04), Naples, Italy, October 5-8, 2004.
[73] ISO/IEC JTC 1/SC 29/WG 11, “Study on ISO/IEC 14496-3:2001/FPDAM5 (Scalable Lossless Coding),” ISO/IEC JTC 1/SC 29/WG 11 N7135, April 2005, Busan, Korea.
[74] ISO/IEC JTC 1/SC 29/WG 11, “Text of Working Draft for Spatial Audio Coding (SAC),” ISO/IEC JTC 1/SC 29/WG 11 N7136, April 2005, Busan, Korea.


Publication
International Journals
Shih-Way Huang, Tsung-Han Tsai and Liang-Gee Chen, “Fast Decomposition of Filterbanks for the State-of-the-art Audio Coding,” IEEE Signal Processing Letters. (Accepted in May 2005)
Shih-Way Huang, Tsung-Han Tsai and Liang-Gee Chen, “A Low Complexity Design of Psycho-Acoustic Model for MPEG-2/4 Advanced Audio Coding,” IEEE Transactions on Consumer Electronics, vol. 50, no. 4, pp.1209-1217, Nov. 2004.

International Conferences
Shih-Way Huang, Tsung-Han Tsai and Liang-Gee Chen, “Fast Filterbanks for the Low Power MPEG High Efficiency Advanced Audio Coding Decoder,” Convention paper 6336 in the Audio Engineering Society (AES) 118th Convention, May 28-31, 2005.
Shih-Way Huang, Liang-Gee Chen and Tsung-Han Tsai, “Memory and Computationally Efficient Psychoacoustic Model for MPEG AAC on 16-bit Fixed-point Processors,” in Proc. of the 2005 International Symposium on Circuits and Systems (ISCAS ’05), May 23-26, 2005, pp. 3155-3158.
Shih-Way Huang, Liang-Gee Chen and Tsung-Han Tsai, “Memory reduction technique of spreading function in MPEG AAC encoder,” in Proc. of the 7th Int. Conference on Digital Audio Effects (DAFx’04), Naples, Italy, October 5-8, 2004, pp. 331-334.
Tsung-Han Tsai, Yi-Wen Wang and Shih-Way Huang, “An MDCT-based psychoacoustic model co-processor design for MPEG-2/4 AAC audio encoder,” in Proc. of the 7th Int. Conference on Digital Audio Effects (DAFx’04), Naples, Italy, October 5-8, 2004, pp. 335-338.
Tsung-Han Tsai, Shih-Way Huang and Yi-Wen Wang, “Architecture design of MDCT-based psychoacoustic model co-processor in MPEG Advanced Audio Coding,” in Proc. of the 2004 International Symposium on Circuits and Systems (ISCAS ’04), vol. 2, May 23-26, 2004, pp. 761-764.
Tsung-Han Tsai, Shih-Way Huang and Liang-Gee Chen, “Design of a low power psycho-acoustic model co-processor for MPEG-2/4 AAC LC stereo encoder,” in Proc. of the 2003 International Symposium on Circuits and Systems (ISCAS ’03), vol. 2, May 25-28, 2003, pp. 552-555.