National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

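Graduate student: 詹志祥
Graduate student (English name): Zhi-Xiang Zhan
Thesis title (Chinese): 基於深度學習之多概念影片摘要以及陰影著色網路設計
Thesis title (English): Deep learning-based multi-concept video summarization and shadow shading network design
Advisor: 葉家宏
Advisor (English name): Yeh, Chia-Hung
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Electrical Engineering (graduate program)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2023
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 57
Keywords (Chinese): 深度學習、卷積神經網路、影片摘要、陰影去除、影像著色
Keywords (English): Deep Learning; Convolutional Neural Network; Video Summarization; Shadow Removal; Image Colorization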
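Abstract (translated from Chinese): As technology advances, the number of videos uploaded to the internet grows every day, and finding the video one needs among so many has become an important problem. Video summarization methods that incorporate deep learning can quickly and automatically pick out the most important segments of a video, letting users browse and locate the video they need in a very short time (about 10% of the original length). Existing methods, however, either produce unsatisfactory summaries or are limited in where they can be applied. This thesis therefore aims to design a video summarization convolutional network that can be applied to most videos. Its main novelty is to obtain additional information about a video through three modules, namely salient point detection, video richness assessment, and video classification, and to use that information to help the network assign a suitable importance score to each segment. In addition, this thesis proposes a shadow removal method based on convolutional neural networks, which can be used in the pre-processing stage to remove shadows from images and thereby improve results. Unlike existing methods, which rely on relighting or image generation, the proposed method removes shadow regions by recoloring them, and it performs better than previous methods when a shadow spans two regions of different chromaticity. Experimental results show that the proposed video summarization and shadow removal methods outperform international benchmarks.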
With the advancement of technology, the number of videos uploaded online every day keeps
growing, and finding a desired video in such a vast amount of content has become a
crucial problem. Video summarization methods that incorporate deep learning can
efficiently and automatically identify the most important segments of a video, allowing
users to browse and search for specific content in a fraction of the time (approximately
10% of the original video duration). However, existing methods either generate
unsatisfactory summaries or are limited in their practical applications. The objective
of this thesis is therefore to design a video summarization convolutional network
applicable to a wide range of videos. The main innovation lies in three modules, namely
salient point detection, video richness prediction, and scene classification, which
obtain additional information about each video. This additional information helps the
network assign appropriate importance scores to individual video segments. Furthermore,
this thesis proposes a shadow removal method based on convolutional neural networks,
which can be used in the pre-processing stage to remove shadows from images and thereby
improve the final results. Unlike existing methods that rely on brightness adjustment
or image generation, the proposed method removes shadows with a colorization-based
approach. It outperforms previous approaches particularly when a shadow spans two
regions of different chromaticity. Experimental results show that the proposed video
summarization and shadow removal methods outperform the international benchmarks.
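As an illustration only, the idea of turning per-segment importance scores into a summary of roughly 10% of the original length can be sketched as a 0/1 knapsack selection. This is a common formulation in the video summarization literature and is an assumption here, not a description of the thesis's actual key shot selection. The function name `select_key_shots`, the parameter `budget_ratio`, and the sample scores and lengths are all hypothetical.

```python
from typing import List, Sequence


def select_key_shots(
    scores: Sequence[float],
    lengths: Sequence[int],
    budget_ratio: float = 0.10,
) -> List[int]:
    """Return indices of shots that maximize total importance while the
    total selected length stays within budget_ratio of the video length.

    Hypothetical helper: solves a 0/1 knapsack by dynamic programming.
    """
    if len(scores) != len(lengths):
        raise ValueError("scores and lengths must have the same length")

    n = len(scores)
    capacity = int(sum(lengths) * budget_ratio)

    # best[i][c] = best total score using the first i shots within c frames
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = lengths[i - 1], scores[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip shot i-1
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v  # option 2: take it

    # Walk back through the table to recover which shots were taken.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= lengths[i - 1]
    return sorted(chosen)


if __name__ == "__main__":
    # Made-up importance scores and shot lengths (in frames).
    demo_scores = [0.9, 0.2, 0.8, 0.1, 0.5]
    demo_lengths = [10, 40, 10, 30, 10]
    print(select_key_shots(demo_scores, demo_lengths, budget_ratio=0.10))
```

In this sketch the budget is expressed in frames, so shots with high scores but long durations may lose out to several short, moderately important shots.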
Contents
Thesis Approval Form
Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
Chapter 1
1.1 Overview
1.2 Motivation
1.3 Contribution
1.4 Organization
Chapter 2
2.1 Video Summarization
2.2 Shadow Removal
Chapter 3
3.1 Proposed Framework
3.2 Proposed Feature Extraction Network
3.3 Proposed Summarization Network
3.4 Loss Function
3.5 Key Shot Selection
3.6 Experimental Results
(1) Dataset and Benchmark
(2) Implementation Details
(3) Comparison with State-of-the-Art Methods
Chapter 4
4.1 Network Architecture
4.2 Proposed Shadow Removal Network
4.3 Proposed Colorization Network
4.4 Loss Function
4.5 Experimental Results
(1) Dataset and Benchmark
(2) Implementation Details
(3) Comparison with State-of-the-Art Methods
(4) Ablation Study
Chapter 5
5.1 Conclusions
5.2 Future Work
References
Electronic full text (public release date on the internet: 2028-08-24)