Graduate student (English name): Chen Jian Ming
Thesis title (English): A Classification Approach for Video Text Detection
Advisor (English name): Wang Yuan Kai
Keywords (English): Wavelet transform; spatial and temporal domain; Bayes classifier
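With the rapid growth of multimedia data, efficient management of such large volumes of data has become necessary, and text detection plays an important role in multimedia access applications. Although there are many ways to annotate video to support advanced search and access, text is the object that appears most often in video frames. As a preprocessing step for video annotation, we propose a method for detecting text that appears in videos. We use information from multiple frames to reduce the errors that occur when only a single frame is used. A three-dimensional (3D) wavelet transform is first applied to the whole image, which also accomplishes edge detection. Certain sub-bands are then combined to strengthen the edges in the image (edge enhancement), forming a salience map. After suitable features are extracted from the salience map (feature extraction), the features are fed to a Bayes classifier for classification, and after several post-processing steps the system outputs images in which the text regions are marked. In the experiments, the videos were downloaded from the Internet or recorded from television and include different types such as commercials, news, and dramas; we analyze them and examine the effect of different parameters and wavelet basis functions.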
The enormous growth of multimedia data has made efficient data management a pressing need, and text detection plays a critical role in multimedia access and retrieval. Although video can be annotated in many ways, text is the object that appears most often in video frames. As a preprocessing step for annotation, we propose an approach for detecting text in videos. Using information from several frames reduces the false alarms that arise when only a single frame is used. A 3D wavelet transform is first applied to the input images, which also serves as edge detection. Edges are then enhanced by combining sub-bands decomposed from the input images, which yields salience maps. Features extracted from the salience maps form a feature vector that is fed to a Bayes classifier, and the system outputs images with the detected text regions marked. In the experiments, video clips of diverse types, including TV commercials, news, and dramas, were downloaded from Internet websites or recorded from TV. We also analyze the impact of different wavelet bases and parameters.
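The pipeline described in the abstract (3D wavelet transform, sub-band combination into a salience map, feature extraction, Bayes classification) can be illustrated with a minimal sketch. This is not the thesis implementation; everything below is an assumption chosen for illustration: the Haar basis, a single decomposition level, the choice of temporally low-pass and spatially high-pass sub-bands, the block-mean feature, the Gaussian class-conditional model, and all function names. NumPy is assumed to be available, and the clip dimensions must be even.

```python
import numpy as np

def haar_split(x, axis):
    """One-level Haar analysis along `axis`; returns (low, high) halves."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar3d(clip):
    """Separable one-level 3D Haar transform of a (time, rows, cols) clip.
    Keys such as 'LLH' list the filter used along (time, rows, cols)."""
    bands = {"": clip.astype(float)}
    for axis in range(3):
        split = {}
        for key, arr in bands.items():
            split[key + "L"], split[key + "H"] = haar_split(arr, axis)
        bands = split
    return bands

def salience_map(bands):
    """Assumed combination: spatial detail that persists over time."""
    detail = sum(np.abs(bands[k]) for k in ("LLH", "LHL", "LHH"))
    return detail.mean(axis=0)  # collapse the time axis

def block_means(sal, size=2):
    """Mean salience of each non-overlapping size x size block."""
    h, w = sal.shape
    h, w = h - h % size, w - w % size
    blocks = sal[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.mean(axis=(1, 3))

class GaussianBayes:
    """Bayes classifier with independent Gaussian class-conditional densities."""
    def fit(self, x, y, var_floor=1e-2):
        x, y = np.asarray(x, float), np.asarray(y)
        self.classes = np.unique(y)
        self.mu = np.array([x[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([x[y == c].var(axis=0) + var_floor for c in self.classes])
        self.log_prior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, x):
        x = np.asarray(x, float)[:, None, :]  # shape (n, 1, d)
        log_lik = -0.5 * (np.log(2 * np.pi * self.var)
                          + (x - self.mu) ** 2 / self.var).sum(axis=2)
        return self.classes[np.argmax(log_lik + self.log_prior, axis=1)]

def make_clip(top, left, seed, frames=4, size=32):
    """Synthetic clip: noisy flat background plus a static striped 'caption'."""
    rng = np.random.default_rng(seed)
    clip = rng.normal(100.0, 1.0, (frames, size, size))
    clip[:, top:top + 8, left:left + 16] += 80.0 * (np.arange(16) % 2)
    return clip

def features(clip):
    """One feature per block: mean salience, as an (n_blocks, 1) array."""
    return block_means(salience_map(haar3d(clip))).reshape(-1, 1)

def detect_demo():
    """Train on one synthetic clip, then label the blocks of another."""
    train_labels = np.zeros((8, 8), dtype=int)
    train_labels[1:3, 1:5] = 1  # blocks covered by the training caption
    model = GaussianBayes().fit(features(make_clip(4, 4, seed=0)),
                                train_labels.ravel())
    return model.predict(features(make_clip(16, 8, seed=1))).reshape(8, 8)
```

Calling `detect_demo()` returns an 8 x 8 grid of block labels (1 for text, 0 for background) for a synthetic clip whose caption sits at a different position from the one used for training. A real system would replace the synthetic clips with decoded video frames and would likely add further features, such as the GLCM statistics named in the table of contents, as well as the post-processing steps the abstract mentions.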
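Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Related Work
Chapter 2 System Architecture
2.1 Algorithm Overview
2.2 Edge Detection and Image Enhancement
2.3 Feature Extraction
2.4 Sample Selection and Classification
Chapter 3 Edge Detection and Image Enhancement
3.1 Wavelet Transform
3.2 3D Wavelet Transform
3.3 Image Enhancement
Chapter 4 Feature Extraction and Sample Selection
4.1 Feature Extraction
4.1.1 Mean
4.1.2 GLCM
4.2 Sample Selection
Chapter 5 Recognition
5.1 Bayes Classifier
5.2 Implementation Details
5.3 Image Pyramid
Chapter 6 Experimental Results
6.1 Comparison of Wavelet Bases
6.2 Probability Values of the Text and Background Classes
6.3 Detection Results for Various Types of Text
6.4 Discussion
Chapter 7 Conclusion
References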