National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

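Author: 陳維凱
Author (English): Wei-Kai Chen
Thesis title: 基於故事單元與人臉偵測之影片摘要
Thesis title (English): Video Summarization Based on Story-Unit and Face Detection
Advisor: 蔡鴻旭
Advisor (English): Hung-Hsu Tsai
Degree: Master's
Institution: National Formosa University (國立虎尾科技大學)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: Chinese
Number of pages: 106
Keywords (translated from Chinese): shot change detection; key-frame extraction; shot grouping; story unit; face detection; video summarization
Keywords (English): Video Summarization; Shot change detection; Key-frame extraction; Shot group; Face detection; Story-unit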
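This thesis extracts video summaries based on story units and face detection. The process consists of five steps: shot change detection, key-frame extraction, shot grouping, face detection, and story-unit identification. First, HSV color-space features are used for shot change detection, key-frame extraction, and shot grouping. Next, Haar-like features are used for face detection, and an SVM classifier identifies non-face images. The time-adaptive grouping method then clusters similar shots into the same group. Finally, the clustering results are combined with temporal information and a shortest-path algorithm to produce the story units of the video. The method can provide either the main story units at a lower summary ratio or a complete video summary at a higher summary ratio, so the resulting summaries preserve both the people who appear in the video and its complete story structure. The experiments use videos collected by the Open Video Project, together with videos of other types, as test videos. The summaries produced by this thesis are compared with the results of other methods, and performance is evaluated through a questionnaire.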
This thesis presents a video summarization technique based on story units, face detection, and clustering. The technique consists of five components: shot change detection, key-frame extraction, shot grouping, face detection, and story-unit identification. The first three components operate in the HSV color space. The shot-grouping component employs the backward-shot-coherence technique to reduce redundant key frames. Face detection, using Haar-like features and an SVM, extracts additional information from the video. A clustering algorithm, time-adaptive grouping, then finds the story units. The set of story units can be further simplified using Dijkstra's algorithm, which yields a video summary with a low summary ratio. Experimental results show that the proposed technique outperforms two existing methods, those of the Open Video Project and the Delaunay-clustering-based method.
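As an illustration of the first component, the following is a minimal sketch of histogram-based shot change detection in the HSV color space. It is not the thesis's implementation: the abstract does not give the exact features, distance measure, or threshold, so the bin counts, the half-L1 distance, the threshold of 0.5, and all function names here are assumptions chosen for the example. Frames are represented as plain lists of (r, g, b) tuples so that the sketch needs only the Python standard library.

```python
import colorsys


def hsv_histogram(frame, h_bins=8, s_bins=4, v_bins=4):
    """Normalized HSV histogram of a frame given as a list of (r, g, b) tuples in 0..255.

    The bin counts are illustrative assumptions, not values from the thesis.
    """
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in frame:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = float(len(frame))
    return [count / total for count in hist]


def histogram_distance(a, b):
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def detect_shot_changes(frames, threshold=0.5):
    """Return the indices of frames that start a new shot.

    A boundary is reported when the histogram distance between consecutive
    frames exceeds the threshold (an assumed value for this sketch).
    """
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = hsv_histogram(frame)
        if previous is not None and histogram_distance(previous, current) > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


if __name__ == "__main__":
    red = [(255, 0, 0)] * 16   # three frames of a uniformly red "shot"
    blue = [(0, 0, 255)] * 16  # followed by two uniformly blue frames
    print(detect_shot_changes([red, red, red, blue, blue]))  # prints [3]
```

A fixed global threshold of this kind detects abrupt cuts but generally handles gradual transitions such as fades and dissolves poorly; how the thesis treats those cases is not stated in the abstract.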
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Related Work
1.4 Thesis Organization
2. Review of Related Techniques
2.1 Video Structure
2.2 Shot Change Detection
2.3 Key-Frame Extraction
2.4 Shot Grouping
2.5 HSV Color Space
2.6 Discrete Wavelet Transform
2.7 Face Detection and Recognition
2.8 Clustering Methods
2.8.1 K-means
2.8.2 X-means
2.8.3 Delaunay clustering
2.9 Support Vector Machines
2.9.1 Linearly separable SVM
2.9.2 Linearly non-separable SVM
2.9.3 Non-linear SVM
3. Proposed Video Summarization Method
3.1 Research Framework
3.2 Shot Detection
3.3 Key-Frame Extraction
3.4 Shot Grouping
3.5 Story-Unit Detection
3.6 Face Detection
4. Experimental Results and Analysis
4.1 Experimental Environment
4.2 Test Videos and Evaluation Method
4.3 Experimental Results
5. Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References