
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Author: Chun-Kang Chen (陳俊綱)
Title: A Feature Selection Technique for Semantic Video Indexing System (高階語意影片檢索系統之特徵選擇方法)
Advisor: Wen-Chin Chen (陳文進)
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2008
Academic year of graduation: 96 (2007-2008)
Language: English
Pages: 41
Chinese keywords: feature selection, video search, TRECVID, semantic search
English keywords: feature selection, video search, TRECVID, semantic concept
Usage statistics:
  • Cited: 0
  • Views: 210
  • Rating: (none)
  • Downloads: 0
  • Saved to personal bibliography lists: 0
Abstract (translated from the Chinese): As videos of every kind become easier and easier to obtain, people have realized that finding a specific video in such a huge collection is increasingly difficult. To address this need, video search based on high-level semantics has become the mainstream of video retrieval. High-level semantics refers to terms from daily life, which are far more familiar to users than low-level features. To advance high-level semantic search, TRECVID provides hundreds of hours of video and a fair evaluation method every year, and has therefore become the benchmark for video search. Many TRECVID participants build their video search systems on top of low-level features. With the rapid progress of computer vision, more and more discriminative low-level features have been devised; however, training high-level semantic classifiers on a large number of low-level features is extremely time-consuming, so using these features efficiently is becoming ever more important. The feature selection method proposed in this thesis greatly reduces training time at the cost of only a small drop in accuracy: even when we use only half of the low-level features, it retains 98.88% of the accuracy while cutting training time by 36.07%.
Abstract (English): To cope with the growing volume of easily accessible video, users desire an automatic video search system driven by semantic queries, such as objects, scenes, and events from daily life. To this end, TRECVID annually supplies sufficient video data and a fair evaluation method to advance video search techniques. Many participants build their classifiers by fusing the results of modeling low-level features (LLFs) such as color and edge. With the development of computer vision, more and more useful LLFs have been designed. However, modeling all available LLFs requires a tremendous amount of time, so using these LLFs efficiently has become an important issue. In this thesis, we propose an evaluation technique for LLFs, from which the most appropriate concept-dependent LLF combinations can be chosen to reduce modeling time while keeping reasonable video search precision. In our experiments, modeling only 5 chosen LLFs out of 16 reduces modeling time by 3.51% with only a 6.78% performance drop; if half of the LLFs are used, we can even retain 98.88% of the precision with a 36.07% time saving.
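The selection pipeline described in the abstract (score the components of each low-level feature, rank the LLFs, and keep only the best ones for modeling) can be sketched with the F-score criterion that Chapter 3 lists as one of the evaluation options. This is a minimal illustrative sketch, not the thesis's implementation: the function names `f_score` and `rank_llfs`, and the choice to aggregate per-component scores by their mean, are assumptions made here for clarity.

```python
import numpy as np

def f_score(X, y):
    """Per-dimension F-score for binary labels y in {0, 1}:
    between-class separation divided by within-class variance."""
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    between = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    within = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return between / np.maximum(within, 1e-12)  # guard against zero variance

def rank_llfs(feature_blocks, y):
    """Score each LLF (a named block of feature columns) by the mean
    F-score of its components; return LLF names sorted best-first."""
    scores = {name: f_score(block, y).mean() for name, block in feature_blocks.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Given such a ranking per semantic concept, one would keep only the top-k LLFs (k set by a threshold, as in Section 3.5) and train classifiers on those alone, trading a small precision drop for the reported savings in modeling time.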
Acknowledgments i
Abstract iv
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Simple Pipeline Framework . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Sampling Negative Data . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Multi-representation . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 3 Feature Selection Technique 8
3.1 Feature Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Evaluate Feature Components . . . . . . . . . . . . . . . . . . . . . 10
3.2.1 Random Forests . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 F-score . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2.3 Mutual Information . . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Dependency Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.4 Sort LLFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.5 Set a Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Chapter 4 Framework 18
4.1 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2.1 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . 19
4.2.2 Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.1 Grid Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.2 Bagging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.4 Post-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4.1 Late Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Chapter 5 Experimental Results 25
5.1 Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.1.1 Shots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.1.2 Evaluation Tool . . . . . . . . . . . . . . . . . . . . . . . . 27
5.2 Feature Extractors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.3 Selected LLFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.4 Compare with 16 LLFs Baseline . . . . . . . . . . . . . . . . . . . . 32
5.5 Compare with Control Groups . . . . . . . . . . . . . . . . . . . . . 34
5.6 Compare with Other Time Saving Methods . . . . . . . . . . . . . . 35
5.7 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Chapter 6 Conclusions and Future Work 38
6.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Bibliography 40
Appendix A High Level Features in TRECVID 2008 43