National Digital Library of Theses and Dissertations in Taiwan
Detailed Record
Author: 陳澔緯
Author (English): Hao-Wei Chen
Title: 利用卷積類神經網路以色彩資訊及光流進行影片物體分割
Title (English): Video Object Segmentation Using Appearance and Optical Flow with Convolutional Neural Network
Advisor: 莊永裕
Oral defense date: 2017-07-31
Degree: Master's
Institution: 國立臺灣大學 (National Taiwan University)
Department: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Discipline: Computing
Subfield: Networking
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (2016–2017)
Language: English
Pages: 22
Keywords (Chinese): 物體分割、卷積類神經網路、條件隨機域
Keywords (English): object segmentation; convolutional neural networks; conditional random field
Usage statistics:
  • Cited by: 0
  • Views: 333
  • Rating: (none)
  • Downloads: 0
  • Bookmarked: 0
Abstract (Chinese, translated): This thesis studies semi-supervised video object segmentation: given the segmentation mask of an object in the first frame, the goal is to segment that object in every remaining frame. Unlike prior work, we combine the video's appearance (color) information and optical flow as input for training a convolutional neural network, and propose two approaches: a merged architecture and separately trained streams. We adopt a two-stage training strategy: the model is first trained offline on the training set, then fine-tuned at test time on the first frame of each video. Finally, a conditional random field post-processes the resulting segmentation. We also run experiments comparing results under different training conditions and post-processing methods. Our best method achieves 81.2% accuracy on the DAVIS video object segmentation dataset, surpassing the previous state of the art at 79.8%.
Abstract (English): This thesis addresses the task of semi-supervised video object segmentation: segmenting an object throughout a video, given its mask in the first frame. We combine appearance and optical flow as the input to a convolutional neural network and propose two methods to solve this problem. We use an offline/online training strategy, fine-tuning the model on the first-frame annotation at test time, and finally refine the result with a CRF. We also conduct ablation studies comparing results under different conditions. Our best algorithm improves the state of the art from 79.8% to 81.2%.
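The record contains no code, but the fusion of appearance and motion cues described in the abstract can be illustrated with a minimal sketch. The function name, the [0, 1] RGB scaling, and the max-magnitude flow normalization below are illustrative assumptions, not the thesis's actual preprocessing; the sketch only shows the early-fusion idea of stacking an RGB frame with its optical flow field into one multi-channel network input.

```python
import numpy as np

def make_two_stream_input(frame, flow):
    # Scale RGB to [0, 1] and the flow field by its maximum magnitude,
    # then stack them into one (H, W, 5) array that a segmentation CNN
    # could consume as a single fused input.
    assert frame.shape[:2] == flow.shape[:2]
    rgb = frame.astype(np.float32) / 255.0
    mag = np.linalg.norm(flow, axis=-1).max()
    fl = flow.astype(np.float32) / (mag + 1e-8)
    return np.concatenate([rgb, fl], axis=-1)

frame = np.zeros((4, 4, 3), dtype=np.uint8)  # dummy RGB frame
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = 2.0                           # uniform rightward motion
x = make_two_stream_input(frame, flow)
print(x.shape)  # (4, 4, 5)
```

The alternative described in the abstract, separately trained streams, would instead run two networks (one on `rgb`, one on `fl`) and merge their predictions.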
口試委員審定書 i
誌謝 ii
中文摘要 iii
Abstract iv
Contents v
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Unsupervised Video Object Segmentation 3
2.2 Semi-supervised Video Object Segmentation 3
2.3 Supervised Video Object Segmentation 4
Chapter 3 Methodology 5
3.1 Review of OSVOS 5
3.2 Review of MaskTrack 7
3.3 2-Stream Architecture 9
3.4 Training Details 11
Chapter 4 Experiments and Results 14
Chapter 5 Conclusion 19
Bibliography 20