
National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

Author: 蔡慶溢
Author (English): Ching-Yi Tsai
Title (Chinese): 單鏡頭深度預測與手術煙霧去除於內視鏡手術
Title (English): Monocular Depth Estimation and Surgical Smoke Removal for Endoscopic Surgery
Advisor: 施吉昇
Advisor (English): JI-SHENG SHI
Committee: 楊宗霖, 劉宗德, 楊家驤, 陳炳宇
Committee (English): ZONG-LIN YANG, ZONG-DE LIU, JIA-XIANG YANG, BING-YU CHEN
Oral Defense Date: 2020-08-11
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Document Type: Academic thesis
Publication Year: 2020
Graduation Academic Year: 108
Language: English
Pages: 37
Keywords (Chinese): 內視鏡影像, 單鏡頭深度預測, 煙霧去除
Keywords (English): endoscopic images, monocular depth estimation, smoke removal
DOI: 10.6342/NTU202004195
Depth estimation is essential to surgical auxiliary systems: it provides three-dimensional information that supports the other algorithms used in such systems. Machine learning methods perform well on monocular depth estimation, but most of them require depth ground truth for supervised training. Applying machine learning to depth estimation on endoscopic images thus faces two major challenges: accurate depth ground truth is difficult to acquire, and smoke produced during surgery interferes with the images. In this thesis, we propose a self-supervised neural network that exploits the structural properties of stereo vision. Existing desmoking approaches fall into two categories, prior-based and learning-based; both treat surgical smoke removal as a fog/haze removal problem, without considering that surgical smoke behaves very differently from fog or haze. To overcome this, we propose a CycleGAN-based network with a submodule, the Maximum Random Crop (MRC) unit, whose purpose is to estimate the distribution of the surgical smoke. To evaluate the proposed desmoking model, we use three quantitative metrics: the mean absolute error of depth, a boundary-awareness metric, and a feature-point-awareness metric. The experimental results show that the proposed method not only removes the smoke from the images but also preserves the image features and colors, so that the desmoked images do not lead to erroneous depth estimation.
Depth information is essential to intelligent surgical auxiliary systems: it offers 3D information to support the algorithms used in those systems. Learning-based methods have excellent performance on depth estimation from a single image. However, using learning-based methods for depth estimation poses two significant challenges. One is that most of them need depth ground truth for supervised training, which is difficult to acquire for endoscopic surgery; the other is that the smoke produced by surgical tools degrades the depth estimation results. In this work, we achieve depth estimation by designing a self-supervised network that exploits the structural relationships between pairs of stereo images.
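The self-supervision signal used by stereo-based methods of this kind can be illustrated with a minimal numpy sketch. This is not the thesis's actual network or loss; the function names, the nearest-neighbor sampling, and the sign convention for disparity are all assumptions made for illustration. Given a predicted disparity map, the right image is warped toward the left view, and the photometric error between the real and reconstructed left images acts as the training loss, so no depth ground truth is needed.

```python
import numpy as np

def warp_right_to_left(right: np.ndarray, disparity: np.ndarray) -> np.ndarray:
    """Reconstruct the left view by sampling the right image at columns
    shifted by the per-pixel disparity (nearest-neighbor sampling here;
    real implementations use differentiable bilinear sampling)."""
    h, w = right.shape
    cols = np.arange(w)[None, :] - disparity        # assumed sign convention
    cols = np.clip(np.round(cols).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None]
    return right[rows, cols]

def photometric_loss(left: np.ndarray, reconstructed: np.ndarray) -> float:
    """L1 photometric error: the self-supervision signal that replaces
    depth ground truth."""
    return float(np.mean(np.abs(left - reconstructed)))
```

When the disparity is correct, the reconstruction matches the left image and the loss vanishes, which is why a network can be trained on stereo pairs alone and then run monocularly at inference time.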
The smoke produced by surgical tools reduces visibility and raises errors in the computer vision algorithms used in intelligent surgical auxiliary systems. Existing methods for this problem mainly adopt either prior-based or learning-based approaches. Both treat the surgical smoke removal problem as a fog/haze removal problem, which ignores the significantly different properties of fog/haze and surgical smoke. To overcome this difference, we propose an end-to-end network based on CycleGAN and introduce a submodule called the Maximum Random Crop (MRC) unit, which aims to estimate the distribution of the surgical smoke.
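CycleGAN-style training is unpaired: one generator G maps smoky frames to clean ones, a second generator F maps clean frames back to smoky ones, and a cycle-consistency loss keeps the round trip close to the input. A toy numpy sketch of that loss follows; the MRC unit itself is the thesis's own contribution and is not reproduced here, and G and F below are stand-in callables rather than real networks.

```python
import numpy as np

def cycle_consistency_loss(x: np.ndarray, G, F) -> float:
    """L1 cycle loss ||F(G(x)) - x||_1: pushes the two generators to be
    approximate inverses, so the desmoking generator G cannot simply
    discard scene content while removing smoke."""
    return float(np.mean(np.abs(F(G(x)) - x)))
```

In full CycleGAN training this term is applied in both directions (smoke → clean → smoke and clean → smoke → clean), alongside the two adversarial losses.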
To evaluate the proposed method’s performance, we use four evaluation metrics: SNR, MAE of depth, boundary awareness, and feature point awareness. The experimental results show that the proposed method removes the surgical smoke while preserving the features and intensity of the original smoked images.
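Of the four metrics, the MAE of depth is the most direct to state: the depth estimated from a desmoked image is compared against the depth from the corresponding smoke-free image. A minimal numpy sketch follows; the optional validity mask for excluding unreliable depth pixels is an assumption for illustration, not a detail taken from the thesis.

```python
import numpy as np

def depth_mae(pred: np.ndarray, ref: np.ndarray, valid=None) -> float:
    """Mean absolute error between a depth map estimated from a desmoked
    image and the reference depth from the smoke-free image, averaged
    over valid pixels only."""
    if valid is None:
        valid = np.ones_like(pred, dtype=bool)
    return float(np.mean(np.abs(pred[valid] - ref[valid])))
```

A low depth MAE after desmoking indicates that smoke removal has not distorted the image in ways that mislead the downstream depth network.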
Acknowledgments
Chinese Abstract
Abstract
1 Introduction
1.1 Motivation
1.2 Introduction
1.3 Contribution
1.4 Thesis Organization
2 Background and Related Works
2.1 Background
2.1.1 Stereo Matching
2.1.2 Generative Adversarial Network (GAN)
2.1.3 CycleGAN
2.2 Related Works
2.2.1 Depth Estimation and Monocular Depth Estimation
2.2.2 Fog/Haze Removal
3 System Architecture and Problem Definition
3.1 System Architecture
3.2 Problem Definition
3.2.1 Smoke Removal
3.2.2 Monocular Depth Estimation
4 Design and Implementation
4.1 System Design
4.2 Basic Network
4.3 Depth Estimation Network
4.3.1 Network Architecture
4.3.2 Loss Function
4.3.3 Implementation Detail
4.3.4 Post Processing
4.4 Smoke Removal Network
4.4.1 Network Architecture
4.4.2 Loss Function
4.4.3 Implementation Detail
5 Performance Evaluation
5.1 Evaluation Results
6 Conclusion
Bibliography