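Author: 陳亭諭
Author (English): Ting-Yu Chen
Thesis title: 視訊資料庫中視訊和音訊的知識結構 (An Audio-Visual Knowledge Structure for Video Databases)
Thesis title (English): 4D C-String: A New Audio-visual Knowledge Structure and Similarity Retrieval for Video Database Systems
Advisor: 李瑞庭
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Information Management (資訊管理學研究所)
Discipline: Computer science
Field: General computer science
Thesis type: Academic thesis
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar, 2003-2004)
Language: English
Number of pages: 43
Keywords (Chinese): 3D C-string, 4D C-string, 視訊資料庫 (video database), 知識結構 (knowledge structure)
Keywords (English): Knowledge structure, Video database, 3D C-string, 4D C-string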
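Dr. 李瑞庭 previously proposed a knowledge structure called the 3D C-string, which records the spatial and temporal relations between the objects in a video as strings so that video objects in a video database can be searched, managed, and visualized. However, a video contains rich audio data in addition to its visual data, and adding a knowledge structure for the audio would improve retrieval accuracy. In this thesis we therefore build on the 3D C-string concept and propose a new audio-visual knowledge structure for retrieval in video databases, which we name the 4D C-string. For the visual part of the 4D C-string, we use the projections of video objects to represent the spatial and temporal relations between the objects in a video, and we record each object's motion trajectory and changes in size. For the audio part, we extract three audio features to form audio strings: silence/non-silence, music/speech, and singer/speaker identification. We also define similarity measures and propose an algorithm that combines audio and visual similarity matching. Finally, we present experimental results to verify the efficiency of the proposed algorithm.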
This paper presents a new audio-visual knowledge structure and similarity retrieval method for video database systems, called the 4D C-string. It is based on the 3D C-string, a knowledge structure that expresses the visual characteristics of the objects in a video but does not consider the audio part of the video. We therefore add an audio dimension to make the retrieval results more precise. For the visual part, we generate strings that represent the spatial and temporal relations between the objects in a video, together with their motions and size changes. For the audio part, we generate three audio strings. We then propose a similarity retrieval algorithm that uses both the visual and the audio information to retrieve videos from the database that are similar to a given query video. The proposed approach gives users an easy and efficient way to retrieve, visualize, and manipulate video and audio objects in video database systems.
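The abstract describes audio strings built from three features (silence/non-silence, music/speech, singer/speaker identification), and the table of contents lists a KMP string matching algorithm in Section 3.2. As a rough illustration only, the Python sketch below encodes per-segment audio labels as a symbol string and locates a query audio string inside a video's audio string with the standard Knuth-Morris-Pratt algorithm. The label names, the one-letter symbols, and the function names are assumptions made for this example. This record does not give the thesis's actual string encoding or its similarity measure.

```python
from typing import Dict, List, Sequence

# Hypothetical one-letter symbols for per-segment audio labels.
# The actual alphabet used in the thesis is not given in this record.
SYMBOLS: Dict[str, str] = {"silence": "Q", "music": "M", "speech": "S"}


def to_audio_string(labels: Sequence[str]) -> str:
    """Encode a sequence of segment labels as a string of symbols."""
    return "".join(SYMBOLS[label] for label in labels)


def build_failure(pattern: str) -> List[int]:
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_find_all(text: str, pattern: str) -> List[int]:
    """Return the start index of every occurrence of pattern in text
    (overlapping occurrences included). An empty pattern yields []."""
    if not pattern:
        return []
    failure = build_failure(pattern)
    matches: List[int] = []
    k = 0  # number of pattern symbols currently matched
    for i, symbol in enumerate(text):
        while k > 0 and symbol != pattern[k]:
            k = failure[k - 1]
        if symbol == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find later occurrences
    return matches


# Made-up example data: a video's audio track and a shorter query clip.
video_labels = ["silence", "silence", "music", "music", "speech",
                "speech", "music", "music", "silence"]
query_labels = ["music", "music", "speech"]

video_audio = to_audio_string(video_labels)   # "QQMMSSMMQ"
query_audio = to_audio_string(query_labels)   # "MMS"
print(kmp_find_all(video_audio, query_audio))  # [2]
```

Exact matching of this kind only reports where the query's audio pattern occurs verbatim. A similarity retrieval method such as the one the abstract describes would additionally need a scoring rule for partial matches and a way to combine audio and visual scores, which this sketch does not attempt.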
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Literature Survey
2.1 2D and 3D C-string
2.2 MPEG audio
2.3 Discussion
Chapter 3 4D C-String
3.1 The audio string algorithm
3.2 KMP string matching algorithm
3.3 Similarity retrieval algorithm
Chapter 4 Performance Analysis
4.1 Video and audio index generation and query types
4.2 Synthesized videos
4.3 Real video and audio
Chapter 5 Concluding Remarks and Future Work
References
Chinese references
English references