Researcher: 蔡博文
Researcher (English): BO-WEN TASI
Thesis Title: 基於深度學習重建YouTube車禍影片三維場景之研究
Thesis Title (English): 3D Scene Reconstruction from YouTube Car Accident Videos Using Deep Neural Networks
Advisor: 陳郁堂
Advisor (English): Yie-Tarng Chen
Committee Members: 陳郁堂、方文賢、陳省隆、林銘波、呂政修
Committee Members (English): Yie-Tarng Chen, Wen-Hsien Fang, Hsing-Lung Chen, Ming-Bo Lin, Jenq-Shiou Leu
Oral Defense Date: 2019-07-24
Degree: Master's
Institution: National Taiwan University of Science and Technology (國立臺灣科技大學)
Department: Department of Electronic Engineering (電子工程系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2019
Graduation Academic Year: 107 (2018–2019)
Language: English
Number of Pages: 36
Chinese Keywords: 距離估測、物件偵測、物件追蹤、車道分割
English Keywords: Distance Estimation, Object Detection, Object Tracking, Lane Segmentation
Usage statistics:
  • Cited: 0
  • Views: 53
  • Downloads: 0
  • Bookmarked: 0
With advances in automotive technology, car accident prevention has received increasing attention. In recent years, more and more groups have begun to collect and analyze car accident data. Among all driving data, the most readily available are the dashcam videos uploaded in large numbers to YouTube. However, unlike an on-board computer, YouTube videos do not record information such as driving speed or distances to other vehicles, and even the parameters of the recording camera (focal length, focal point, camera intrinsics) are unavailable. Therefore, in this thesis we leverage state-of-the-art methods for object tracking, lane detection, and perspective transformation to reconstruct the 3D scene around a moving vehicle. Specifically, car accident videos from YouTube are mapped onto a bird's-eye-view coordinate map that carries real distance information. To this end, we develop a new depth estimation method that recovers depth information using a lane-line heuristic. If a video cannot provide accurate lane-line information, the proposed method does not perform well.
With the development of self-driving cars, car accident detection and prevention based on deep learning technologies has received attention in recent years. To this end, collecting and analyzing car accident datasets becomes a critical issue, and car accident videos from YouTube provide a valuable resource. Hence, the objective of this research is to develop a tool for automatically reconstructing 3D scenes from massive numbers of car accident videos. Specifically, we attempt to estimate the depth information of moving vehicles from front-view images. However, these YouTube videos do not provide intrinsic and extrinsic camera parameters, such as the focal length and focal point, which are critical for 3D scene reconstruction. Consequently, existing depth estimation approaches based on structure from motion and deep learning cannot be used in our case. To fill this gap, in this thesis we leverage state-of-the-art approaches in object tracking, lane detection, and inverse perspective transformation to develop a novel depth estimation approach that restores depth information using a lane-line heuristic. First, we use an object detector and an object tracker to extract moving objects, and then apply Mask R-CNN to detect lane markings in the front-view images. Subsequently, the inverse perspective transform is used to generate a bird's-eye-view image, where a lane-line heuristic is used to estimate the depth information for each vehicle. Furthermore, to accelerate 3D scene reconstruction for massive numbers of videos, we also investigate a new approach to automatically select the region of interest for the inverse perspective transform. The proposed approach depends on precise lane-line information.
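To make the two central steps of the abstract concrete, the following is a minimal sketch, in Python with OpenCV and NumPy, of an inverse perspective transform to a bird's-eye view and a lane-line distance heuristic. The source/destination corner points, the assumed 4 m dashed-marking length, the file name, and all function names are illustrative assumptions for this sketch, not values or code taken from the thesis.

```python
# Sketch: front-view frame -> bird's-eye view, then distance from a
# lane-marking scale. All coordinates and constants below are assumed.
import cv2
import numpy as np

def birds_eye_view(frame, src_pts, dst_pts, out_size=(400, 600)):
    """Warp a front-view frame into a bird's-eye view.
    src_pts: four road-ROI corners in the front-view image.
    dst_pts: where those corners should land in the bird's-eye image."""
    H = cv2.getPerspectiveTransform(np.float32(src_pts), np.float32(dst_pts))
    return cv2.warpPerspective(frame, H, out_size), H

def metres_per_pixel(lane_dash_px, dash_length_m=4.0):
    """Lane-line heuristic: dashed lane markings have a roughly standard
    physical length (assumed 4 m here), so their pixel length in the
    bird's-eye view fixes the vertical (longitudinal) scale."""
    return dash_length_m / lane_dash_px

def vehicle_distance(ego_y_px, vehicle_y_px, scale_m_per_px):
    """Longitudinal distance between the ego vehicle (bottom of the
    bird's-eye map) and a tracked vehicle's bounding-box base."""
    return abs(ego_y_px - vehicle_y_px) * scale_m_per_px

if __name__ == "__main__":
    frame = cv2.imread("front_view.jpg")                      # hypothetical input frame
    src = [(550, 450), (730, 450), (1180, 700), (100, 700)]   # assumed road ROI corners
    dst = [(100, 0), (300, 0), (300, 600), (100, 600)]        # target rectangle in bird's-eye map
    bev, H = birds_eye_view(frame, src, dst)
    scale = metres_per_pixel(lane_dash_px=80)                 # dash length measured in the warped image
    print(vehicle_distance(ego_y_px=600, vehicle_y_px=350, scale_m_per_px=scale))
```

In the thesis pipeline the ROI corners are selected automatically and the lane markings come from a Mask R-CNN segmentation; here they are hard-coded only to keep the sketch self-contained.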
Chinese Abstract .iii
Abstract .iv
Acknowledgment .v
Table of contents .vi
List of Figures .viii
1 Introduction .1
2 Related Work .3
2.1 Object Detection .3
2.2 Multiple Object Tracking .3
2.3 Object Segmentation .4
2.4 Unsupervised Depth Estimation .4
3 Proposed 3D Scene Reconstruction Method .5
3.1 Lane Lines Segmentation from Front View Images .6
3.1.1 Lane Segmentation : Mask R-CNN .6
3.1.2 Color Filter .7
3.1.3 Lane Segmentation Refinement .7
3.1.4 Lane Line Extrapolation .9
3.1.5 Outlier Removal .9
3.2 Inverse Perspective Transformation .10
3.3 Region-of-Interest Selection .14
3.4 Distance Estimation in a Bird View Map .16
4 Experimental Result and Analysis .17
4.1 Datasets .17
4.1.1 Berkeley Deep Drive - Drivable Area .17
4.1.2 Industrial Technology Research Institute (ITRI) Dataset .18
4.1.3 YouTube Car Accident Dataset .18
4.2 Distance Estimation in a Bird View Map .20
4.3 Finding Lane Lines from Front View Images .22
4.4 Outlier Removal .24
4.5 Region-of-Interest Selection .27
5 Conclusion .34
References .35