
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

Author: 盧一銘 (Lu, Yi-Ming)
Thesis Title: Consistency-Enhanced Monocular Depth Estimation DNN with Temporal-Spatio Interdependencies
Thesis Title (Chinese): 利用時空關聯性提升單視角深度估測一致性之深度學習網路
Advisor: 章定遠 (Chan, Din-Yuen)
Degree: Master's
Institution: National Chiayi University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2021
Graduation Academic Year: 110 (ROC era, i.e., AY 2021-2022)
Language: Chinese
Pages: 41
Keywords (Chinese): 深度估測、預先訓練、密集連接卷積網路、非局部模塊、時空關聯性
Keywords (English): depth estimation, pre-trained, DenseNet, NL block, temporal-spatio correlation
Statistics:
  • Cited by: 0
  • Views: 117
  • Downloads: 9
  • Bookmarked: 0
With the recent maturation of self-driving technology and the growing demand for collaborative robots, depth estimation has become a popular research topic. By generating depth images, it provides information about how far away objects are. To achieve high accuracy, most current deep-learning neural networks (DNNs) adopt increasingly complicated structures, which raises both resource consumption and hardware requirements. The goal of this thesis is to lighten existing monocular depth estimation DNNs. The approach uses temporal-spatio correlation to improve the consistency of monocular depth estimation: a densely connected convolutional network (DenseNet) with pre-trained weights serves as the backbone, embedded non-local blocks (NL blocks) capture the temporal-spatio correlation features of the input images, and the model structure is then simplified step by step to reduce its complexity. According to the experimental results, on datasets covering both ordinary roads and complex indoor scenes, the proposed network maintains depth consistency for the main objects while reducing the resources the model requires, producing depth maps of practical quality.
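As a concrete illustration of the mechanism the abstract describes, below is a minimal PyTorch sketch of an embedded-Gaussian non-local block in the style of Wang et al. [27], followed by loading a pre-trained DenseNet-121 backbone from torchvision. The class name, the channel-reduction factor, and the torchvision usage are illustrative assumptions, not the thesis's actual configuration.

import torch
import torch.nn as nn
import torchvision.models as models

class NonLocalBlock2D(nn.Module):
    """Relates every spatial position in a feature map to every other
    position, so depth cues can be shared across the whole image
    (illustrative sketch; not the thesis's implementation)."""

    def __init__(self, in_channels, reduction=2):
        super().__init__()
        inter = in_channels // reduction  # reduced embedding dim (assumed factor)
        self.theta = nn.Conv2d(in_channels, inter, kernel_size=1)  # query projection
        self.phi = nn.Conv2d(in_channels, inter, kernel_size=1)    # key projection
        self.g = nn.Conv2d(in_channels, inter, kernel_size=1)      # value projection
        self.out = nn.Conv2d(inter, in_channels, kernel_size=1)    # restore channel count

    def forward(self, x):
        b, _, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (b, h*w, inter)
        k = self.phi(x).flatten(2)                    # (b, inter, h*w)
        v = self.g(x).flatten(2).transpose(1, 2)      # (b, h*w, inter)
        # Pairwise affinity between all positions: the "non-local" step.
        attn = torch.softmax(q @ k, dim=-1)           # (b, h*w, h*w)
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        # Residual connection lets the block be embedded anywhere,
        # e.g. inside a Dense block as Figure 3.6 suggests.
        return x + self.out(y)

# Pre-trained DenseNet-121 feature extractor as the encoder backbone
# (decoder and NL-block insertion points omitted).
encoder = models.densenet121(pretrained=True).features

For the temporal side, the same affinity computation can in principle span positions from consecutive frames by fusing their feature maps before the block, consistent with the two-encoder design indicated by Figure 3.2 and Tables 3.3-3.4, though the exact fusion used in the thesis is not described in this record.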
Table of Contents
Abstract (in Chinese) i
Abstract ii
Acknowledgements iii
Table of Contents iv
List of Figures vi
List of Tables viii
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 1
1.3 Thesis Organization 2
Chapter 2 Related Work 3
2.1 Category I Monocular Depth Estimation DNNs 3
2.2 Category II Monocular Depth Estimation DNNs 4
2.3 Category III Monocular Depth Estimation DNNs 5
2.4 Non-local Block 6
2.5 Datasets 8
Chapter 3 Methodology 11
3.1 System Overview 11
3.2 Introduction to the Non-local Block 16
3.3 Densely Connected Convolutional Network Combined with Non-local Blocks 17
3.4 Extraction of Temporal-Spatio Correlation Features 20
3.5 Loss Function 22
3.6 Training Data 23
Chapter 4 Experimental Results and Analysis 26
4.1 Experimental Hardware and Software 26
4.2 Experimental Results and Comparison 27
Chapter 5 Conclusion 37
5.1 Conclusion 37
5.2 Future Work 37
References 39

List of Figures
Figure 2.1: Encoder and Decoder [8] 4
Figure 2.2: Illustration of non-local means 7
Figure 2.3: Capturing features across preceding and following frames in a video [27] 7
Figure 2.4: Structure of the non-local block [27] 8
Figure 2.5: KITTI dataset [30]; top: the original image; bottom: the sparse, low-resolution depth, with detected objects shown in red 9
Figure 2.6: NYU Depth V2 dataset [31] 10
Figure 3.1: Overall system architecture for spatial correlation features 12
Figure 3.2: Encoder architecture for temporal-spatio correlation features 12
Figure 3.3: Non-local block 17
Figure 3.4: Variants of densely connected convolutional networks [32] 18
Figure 3.5: Dense block 18
Figure 3.6: NL block embedded in a Dense block 19
Figure 3.7: Results of spatial correlation features, shown in grayscale; red boxes mark the main objects (NL block = non-local block) 20
Figure 3.8: Results of temporal-spatio correlation features, shown in grayscale; red boxes mark the main objects (NL block = non-local block) 21
Figure 3.9: NYU Depth V2 dataset [31] 24
Figure 3.10: KITTI dataset [30]; top: the original image; middle: the sparse, low-resolution depth with detected objects shown in red; bottom: the ground-truth depth 25
Figure 4.1: Colorized results of spatial-domain features; red boxes mark the main objects (NL block = non-local block) 28
Figure 4.2: Colorized results with training data augmentation; red boxes mark the main objects (NL block = non-local block) 29
Figure 4.3: Colorized maps of spatial correlation features on DenseNet-121 variants; red boxes mark the main objects (NL block = non-local block) 32
Figure 4.4: Extracting spatial correlation features in low-light environments 33
Figure 4.5: Colorized depth estimation results using temporal-spatio correlation features (NL block = non-local block) 35

List of Tables
Table 3.1: Detailed architecture of the Encoder 13
Table 3.2: Detailed architecture of the Decoder 13
Table 3.3: Detailed architecture of Encoder #1 14
Table 3.4: Detailed architecture of Encoder #2 14
Table 3.5: Detailed architecture of the Decoder 15
Table 4.1: Experimental hardware and software 26
Table 4.2: Experimental parameter settings 27
Table 4.3: Backbone networks 28
Table 4.4: Experimental results in the spatial domain under the threshold metric 30
Table 4.5: Experimental results in the spatial domain under the rel, rms, and log10 metrics 31
Table 4.6: Metrics of the lightweight DenseNet-121 31
Table 4.7: Comparison of threshold and other quantitative metrics in the spatial domain 34
Table 4.8: Experimental results of threshold and other quantitative metrics in the temporal domain 35
Table 4.9: Comparison of threshold and other quantitative metrics in the temporal domain 36
References

[1] D. Gerónimo, A. M. López, A. D. Sappa, and T. Graf, "Survey of Pedestrian Detection for Advanced Driver Assistance Systems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 7, pp. 1239-1258, 2010.

[2] I. P. Howard and B. J. Rogers, Binocular Vision and Stereopsis, Oxford University Press, USA, 1995.

[3] D. Eigen, C. Puhrsch, and R. Fergus, "Depth Map Prediction from a Single Image Using a Multi-Scale Deep Network," Advances in Neural Information Processing Systems 27 (NIPS), Dec. 2014.

[4] M. Song and W. Kim, "Depth Estimation from a Single Image Using Guided Deep Network," IEEE Access, vol. 7, pp. 142595-142606, 2019.

[5] Y. Cao, Z. Wu, and C. Shen, "Estimating Depth from Monocular Images as Classification Using Deep Fully Convolutional Residual Networks," IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 11, pp. 3174-3182, Nov. 2018.

[6] Y. Kim, H. Jung, D. Min, and K. Sohn, "Deep Monocular Depth Estimation via Integration of Global and Local Predictions," IEEE Transactions on Image Processing, vol. 27, no. 8, pp. 4131-4144, Aug. 2018.

[7] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, "Deep Ordinal Regression Network for Monocular Depth Estimation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2002-2011, 2018.

[8] I. Alhashim and P. Wonka, "High Quality Monocular Depth Estimation via Transfer Learning," arXiv preprint arXiv:1812.11941, 2018.

[9] J. Hu, M. Ozay, Y. Zhang, and T. Okatani, "Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps with Accurate Object Boundaries," IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, pp. 1043-1051, March 2019.

[10] S. Miangoleh, S. Dille, L. Mai, S. Paris, and Y. Aksoy, "Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[11] F. Liu, C. Shen, G. Lin, and I. D. Reid, "Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 2024-2039, 2016.

[12] C. Godard, O. M. Aodha, and G. Brostow, "Unsupervised Monocular Depth Estimation with Left-Right Consistency," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[13] A. Pilzer, S. Lathuilière, N. Sebe, and E. Ricci, "Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation," CVPR, pp. 9768-9777, June 2019.

[14] A. Wong and S. Soatto, "Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction," CVPR, pp. 5644-5653, June 2019.

[15] X. Ye, X. Fan, M. Zhang, R. Xu, and W. Zhong, "Unsupervised Monocular Depth Estimation via Recursive Stereo Distillation," IEEE Transactions on Image Processing, vol. 30, pp. 4492-4504, 2021.

[16] M. Yucel, V. Dimaridou, A. Drosou, and A. Saa-Garriga, "Real-time Monocular Depth Estimation with Sparse Supervision on Mobile," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021.

[17] Y. Kuznietsov, J. Stuckler, and B. Leibe, "Semi-supervised Deep Learning for Monocular Depth Map Prediction," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

[18] J. Liu et al., "Collaborative Deconvolutional Neural Networks for Joint Depth Estimation and Semantic Segmentation," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 11, pp. 5655-5666, Nov. 2018.

[19] J. Jiao, Y. Cao, Y. Song, and R. Lau, "Look Deeper into Depth: Monocular Depth Estimation with Semantic Booster and Attention-Driven Loss," ECCV, pp. 53-69, Sept. 2018.

[20] P. Z. Ramirez, M. Poggi, and F. Tosi, "Geometry Meets Semantics for Semi-supervised Monocular Depth Estimation," 14th Asian Conference on Computer Vision (ACCV), pp. 298-313, Dec. 2018.

[21] H. Tian and F. Li, "Semi-Supervised Depth Estimation from a Single Image Based on Confidence Learning," ICASSP, United Kingdom, pp. 8573-8577, May 2019.

[22] P. Y. Chen, A. H. Liu, Y. C. Liu, and Y.-C. F. Wang, "Towards Scene Understanding: Unsupervised Monocular Depth Estimation with Semantic-aware Representation," CVPR, pp. 2624-2632, June 2019.

[23] C. Godard, O. M. Aodha, M. Firman, and G. Brostow, "Digging into Self-Supervised Monocular Depth Estimation," ICCV, pp. 3828-3838, Oct. 2019.

[24] Z. Zhang, Z. Cui, C. Xu, Y. Yan, N. Sebe, and J. Yang, "Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation," CVPR, 2019.

[25] J. Choi, D. Jung, D. Lee, and C. Kim, "SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction," CVGIP, 2020, arXiv:2010.02893v3.

[26] J. Hu, X. Guo, J. Chen, G. Liang, F. Deng, and T. Lam, "A Two-Stage Unsupervised Approach for Low Light Image Enhancement," IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 8363-8370, 2021, doi: 10.1109/LRA.2020.3048667.

[27] X. Wang, R. Girshick, A. Gupta, and K. He, "Non-local Neural Networks," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[28] A. Buades, B. Coll, and J. Morel, "A Non-Local Algorithm for Image Denoising," 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.

[29] G. Li, X. He, W. Zhang, H. Chang, L. Dong, and L. Lin, "Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining," Proceedings of the 26th ACM International Conference on Multimedia, 2018.

[30] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision Meets Robotics: The KITTI Dataset," The International Journal of Robotics Research, vol. 32, no. 11, pp. 1231-1237, 2013.

[31] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, "Indoor Segmentation and Support Inference from RGBD Images," Computer Vision – ECCV 2012, pp. 746-760, 2012.

[32] G. Huang, Z. Liu, L. van der Maaten, and K. Weinberger, "Densely Connected Convolutional Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[33] L. Huynh, P. Nguyen-Ha, J. Matas, E. Rahtu, and J. Heikkilä, "Guiding Monocular Depth Estimation Using Depth-Attention Volume," Computer Vision – ECCV 2020, pp. 581-597, 2020.