National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 蔡侑儒
Author (English): Yu-Ju Tsai
Title: 使用深度學習預測光場圖像深度分布
Title (English): Estimate Disparity of Light Field Images by Deep Neural Network
Advisor: 歐陽明
Committee members: 莊永裕, 葉正聖
Oral defense date: 2019-06-11
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2019
Graduating academic year: 107 (2018–2019)
Language: English
Number of pages: 39
Keywords (Chinese): 光場 (light field), 深度學習 (deep learning), 深度估計 (depth estimation)
DOI: 10.6342/NTU201900934
Cited by: 0 / Views: 123 / Downloads: 0
Abstract (Chinese, translated):
In this thesis, we use deep learning to estimate the depth of light field images. A light field camera captures the properties of light in a scene, both spatial and angular, in a single shot, and from this information the scene depth can be estimated. However, the narrow baseline between sub-aperture images, a consequence of the light field camera's structure, makes depth estimation difficult. Many methods attempt to work around this hardware limitation, but they must still strike a balance between running speed and estimation accuracy.
This thesis therefore exploits the structural regularity of light field data and the redundancy among its images, and builds these properties into the design of our deep neural network. We then propose attention-based sub-aperture view selection, which lets the network learn by itself which images contribute more to depth estimation. Finally, we compare our method against other state-of-the-art methods on a benchmark to demonstrate our improvement on this problem.
Abstract:
In this paper, we introduce a light field depth estimation method based on a convolutional neural network. A light field camera can capture the spatial and angular properties of light in a scene. Using these properties, we can compute depth information from light field images. However, the narrow baseline of light field cameras makes depth estimation from light fields difficult. Many approaches try to overcome this limitation, but they trade off speed against accuracy.
We consider the repetitive structure of the light field and the redundancy among sub-aperture views in light field images. First, to exploit the repetitive structure of the light field, we integrate this property into our network design. Second, by applying attention-based sub-aperture view selection, we let the network learn by itself which views are more useful. Finally, we compare our experimental results with other state-of-the-art methods to show our improvement in light field depth estimation.
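The record contains no code, but the two ideas named in the abstract, attention-based sub-aperture view selection and disparity regression over a cost volume (the differentiable soft argmin of Kendall et al. [15], which the table of contents pairs with "3D CNN and Disparity Regression"), can be sketched concretely. The PyTorch sketch below is a hypothetical illustration only: the module name AttentionViewSelection, the function soft_argmin, the tensor layout (B, V, C, H, W), and all layer widths are assumptions made for exposition, not the thesis's actual architecture.

```python
# Hypothetical sketch only: shapes, widths, and module names are illustrative
# assumptions, not the architecture described in the thesis.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionViewSelection(nn.Module):
    """Learn one attention weight per sub-aperture view and rescale its
    features, so more informative views contribute more to the cost volume."""
    def __init__(self, feat_ch: int):
        super().__init__()
        # Global average pooling followed by a small MLP yields one scalar
        # score per view.
        self.score = nn.Sequential(
            nn.Linear(feat_ch, feat_ch // 2),
            nn.ReLU(inplace=True),
            nn.Linear(feat_ch // 2, 1),
        )

    def forward(self, view_feats: torch.Tensor):
        # view_feats: (B, V, C, H, W) feature maps of V sub-aperture views.
        b, v, c, h, w = view_feats.shape
        pooled = view_feats.mean(dim=(3, 4))       # (B, V, C)
        logits = self.score(pooled).squeeze(-1)    # (B, V)
        weights = torch.softmax(logits, dim=1)     # normalized over views
        return view_feats * weights.view(b, v, 1, 1, 1), weights

def soft_argmin(cost_volume: torch.Tensor, max_disp: int) -> torch.Tensor:
    """Differentiable disparity regression over a cost volume, as in [15]."""
    # cost_volume: (B, D, H, W) matching cost per disparity hypothesis.
    prob = F.softmax(-cost_volume, dim=1)          # low cost -> high probability
    disps = torch.arange(max_disp, dtype=prob.dtype,
                         device=prob.device).view(1, -1, 1, 1)
    return (prob * disps).sum(dim=1)               # (B, H, W) expected disparity
```

Weighting the per-view features before cost aggregation leaves the rest of a stereo-style pipeline unchanged, which fits the abstract's claim that useful views are selected by the network itself rather than chosen by hand.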
Table of Contents
Acknowledgements
Abstract (in Chinese)
Abstract
1 Introduction
2 Related Work
2.1 Traditional Methods
2.2 Deep Learning Methods
3 Method
3.1 Network Design
3.2 Feature Extraction and SPP Module
3.3 Cost Volume Construction
3.4 Attention-based View Selection
3.5 3D CNN and Disparity Regression
4 Experiment
4.1 Training Dataset
4.1.1 4D Light Field Dataset
4.2 Implementation Detail
4.3 Quantitative Evaluation
4.4 Comparison to State-of-the-Art Methods
5 Conclusion
Bibliography
Bibliography
[1] E. H. Adelson and J. Y. A. Wang. Single lens stereo with a plenoptic camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):99–106, Feb 1992.
[2] A. Alperovich, O. Johannsen, M. Strecke, and B. Goldluecke. Light field intrinsics with a deep encoder-decoder network. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9145–9154, June 2018.
[3] R. C. Bolles, H. H. Baker, and D. H. Marimont. Epipolar-plane image analysis: An approach to determining structure from motion. International Journal of Computer Vision, 1(1):7–55, 1987.
[4] J.-R. Chang and Y.-S. Chen. Pyramid stereo matching network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5410–5418, 2018.
[5] C. Chen, H. Lin, Z. Yu, S. B. Kang, and J. Yu. Light field stereo matching using bilateral statistics of surface cameras. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 1518–1525, June 2014.
[6] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’96, pages 43–54, New York, NY, USA, 1996. ACM.
[7] K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9):1904–1916, Sep. 2015.
[8] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, June 2016.
[9] S. Heber and T. Pock. Convolutional networks for shape from light field. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3746–3754, June 2016.
[10] S. Heber, W. Yu, and T. Pock. Neural EPI-volume networks for shape from light field. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2271–2279, Oct 2017.
[11] K. Honauer, O. Johannsen, D. Kondermann, and B. Goldluecke. A dataset and evaluation methodology for depth estimation on 4D light fields. In S.-H. Lai, V. Lepetit, K. Nishino, and Y. Sato, editors, Computer Vision – ACCV 2016, pages 19–34, Cham, 2017. Springer International Publishing.
[12] J. Yu, L. McMillan, and S. Gortler. Scam light field rendering. In 10th Pacific Conference on Computer Graphics and Applications, 2002. Proceedings., pages 137–144, Oct 2002.
[13] O. Johannsen, A. Sulc, and B. Goldluecke. What sparse light field coding reveals about scene structure. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3262–3270, June 2016.
[14] N. K. Kalantari, T.-C. Wang, and R. Ramamoorthi. Learning-based view synthesis for light field cameras. ACM Trans. Graph., 35(6):193:1–193:10, Nov. 2016.
[15] A. Kendall, H. Martirosyan, S. Dasgupta, and P. Henry. End-to-end learning of geometry and context for deep stereo regression. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 66–75, Oct 2017.
[16] J. Y. Lee and R. H. Park. Depth estimation from light field by accumulating binary maps based on foreground-background separation. IEEE Journal of Selected Topics in Signal Processing, 11(7):955–964, Oct 2017.
[17] M. Levoy and P. Hanrahan. Light field rendering. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’96, pages 31–42, New York, NY, USA, 1996. ACM.
[18] C. Shin, H. Jeon, Y. Yoon, I. S. Kweon, and S. J. Kim. EPINET: A fully-convolutional neural network using epipolar geometry for depth from light field images. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4748–4757, June 2018.
[19] P. P. Srinivasan, T. Wang, A. Sreelal, R. Ramamoorthi, and R. Ng. Learning to synthesize a 4D RGBD light field from a single image. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2262–2270, Oct 2017.
[20] M. W. Tao, S. Hadap, J. Malik, and R. Ramamoorthi. Depth from combining defocus and correspondence using light-field cameras. In 2013 IEEE International Conference on Computer Vision, pages 673–680, Dec 2013.
[21] T.-C. Wang, J.-Y. Zhu, E. Hiroaki, M. Chandraker, A. A. Efros, and R. Ramamoorthi. A 4D light-field dataset and CNN architectures for material recognition. In B. Leibe, J. Matas, N. Sebe, and M. Welling, editors, Computer Vision – ECCV 2016, pages 121–138, Cham, 2016. Springer International Publishing.
[22] S. Wanner and B. Goldluecke. Globally consistent depth labeling of 4D light fields. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 41–48, June 2012.
[23] S. Wanner and B. Goldluecke. Variational light field analysis for disparity estimation and super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(3):606–619, March 2014.
[24] Wikipedia contributors. Gabriel Lippmann — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Gabriel_Lippmann&oldid=879639602, 2019. [Online; accessed 15-May-2019].
[25] Y. Yoon, H. Jeon, D. Yoo, J. Lee, and I. S. Kweon. Learning a deep convolutional network for light-field image super-resolution. In 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), pages 57–65, Dec 2015.
[26] Y. Yoon, H. Jeon, D. Yoo, J. Lee, and I. S. Kweon. Light-field image super-resolution using convolutional neural network. IEEE Signal Processing Letters, 24(6):848–852, June 2017.
[27] Z. Yu, X. Guo, H. Ling, A. Lumsdaine, and J. Yu. Line assisted light field triangulation and stereo matching. In 2013 IEEE International Conference on Computer Vision, pages 2792–2799, Dec 2013.
[28] J. Zbontar and Y. LeCun. Stereo matching by training a convolutional neural network to compare image patches. Journal of Machine Learning Research, 17:1–32, 2016.
[29] S. Zhang, H. Sheng, C. Li, J. Zhang, and Z. Xiong. Robust depth estimation for light field via spinning parallelogram operator. Comput. Vis. Image Underst., 145(C):148–159, Apr. 2016.
[30] Y. Zhang, H. Lv, Y. Liu, H. Wang, X. Wang, Q. Huang, X. Xiang, and Q. Dai. Light-field depth estimation via epipolar plane image analysis and locally linear embedding. IEEE Transactions on Circuits and Systems for Video Technology, 27(4):739–747, April 2017.
[31] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6230–6239, July 2017.
[32] T. Zhong, X. Jin, L. Li, and Q. Dai. Light field image compression using depth-based CNN in intra prediction. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8564–8567, May 2019.