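Graduate student: HSU, MIN (徐敏)
Thesis title: Deep Learning Models for Video Highlight Detection by Integrating Visual and Auditory Features (以整合視聽覺特徵之深度學習模型於偵測影片精彩片段之研究)
Advisor: LIN, MIN-SHENG (林敏勝)
Oral defense committee: LEE, HSUAN-SHIH (李選士); HORNG, MAW-SHENG (洪茂盛); CHANG, MING-SANG (張明桑); LIN, MIN-SHENG (林敏勝)
Oral defense date: 2020-07-08
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Department of Electrical Engineering (電機工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2020
Graduation academic year: 108 (ROC calendar)
Language: Chinese
Pages: 38
Keywords (Chinese): 深度學習; 圖形識別; 音訊處理; 影片摘要; 精華剪輯
Keywords (English): Deep Learning; Pattern Recognition; Audio Signal Processing; Video Summarization; Highlight Detection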
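Abstract (translated from the Chinese):
More than 500 hours of video are now uploaded to YouTube every minute. As the volume of multimedia data grows rapidly, enabling machines to automatically detect the most interesting highlight segments of a video is becoming an increasingly important problem.
This thesis uses deep learning and related techniques to detect highlight segments in video game streaming videos on YouTube. The proposed models are a 2D convolutional neural network (2D_CNN), a 3D convolutional neural network (3D_CNN), and a 2D convolutional neural network combined with a bidirectional LSTM (2D_CNN_LSTM).
The experimental results show that 3D_CNN and 2D_CNN_LSTM perform better in terms of accuracy and F1 score. The influence of visual and auditory features is also evaluated: models that consider both kinds of features almost always outperform models that consider only visual (image) features and models that consider only auditory (audio) features.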

Nowadays, more than 500 hours of video are uploaded to YouTube every minute. As the volume of multimedia data grows dramatically, it becomes increasingly important to detect video highlights automatically so that the most interesting clips can be extracted.
This thesis applies deep learning and related techniques to detect highlights in video game streaming videos on YouTube. The proposed models are (1) a 2D convolutional neural network (2D_CNN), (2) a 3D convolutional neural network (3D_CNN), and (3) a 2D convolutional neural network followed by a bidirectional LSTM (2D_CNN_LSTM).
The experimental results show that 3D_CNN and 2D_CNN_LSTM perform better in terms of accuracy and F1 score. The experiments also measure the contributions of visual and auditory features and show that models integrating both almost always outperform models using only visual features and models using only auditory features.
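
The accuracy and F1 score used to compare the models can both be derived from a binary confusion matrix, which the thesis lists as a section of its methodology chapter. The following is a minimal Python sketch of that computation, not the thesis's own code: the function name `binary_metrics`, the label convention (1 = highlight, 0 = non-highlight), and the sample labels are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute confusion-matrix counts, accuracy, precision, recall and F1.

    Labels are assumed to be 1 (highlight) or 0 (non-highlight).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix cells for the positive class (highlight = 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard every ratio against division by zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical per-clip labels, for illustration only.
    truth = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    print(binary_metrics(truth, predicted))
```

Because highlight clips are typically rarer than ordinary footage, F1 (the harmonic mean of precision and recall) is a useful complement to accuracy, which can look high even when most highlights are missed.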

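Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Deep Learning
2.2 Image Processing
2.3 Audio Processing
2.4 Related Work
Chapter 3 Methodology
3.1 Data Collection
3.2 Data Preprocessing
3.3 2D Convolutional Neural Network Architecture
3.4 Bidirectional LSTM
3.5 3D Convolutional Neural Network
3.6 Early Stopping and Model Checkpoints
3.7 Confusion Matrix
Chapter 4 Experimental Results and Analysis
4.1 Experimental Setup
4.2 Model Training Experiments
4.2.1 2D Convolutional Neural Network
4.2.2 3D Convolutional Neural Network
4.2.3 2D Convolutional Neural Network with Bidirectional LSTM
4.3 Results and Comparison
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References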

