Taiwan National Digital Library of Theses and Dissertations

Detailed Record

Graduate Student: 黃芊婷
Graduate Student (English): HUANG, CIAN-TING
Thesis Title: 使用U-net深度學習網路在早期胃癌病徵區域偵測之研究
Thesis Title (English): Using U-net Deep Learning Network for Early-stage Gastric Cancer Detection in NBI Images
Advisor: 張軒庭
Advisor (English): CHANG, HSUAN-TING
Committee Members: 張軒庭, 郭鐘榮, 石勝文, 李宗錞, 許志仲
Committee Members (English): CHANG, HSUAN-TING; GUO, ZHONG-RONG; SHIH, SHENG-WEN; LEE, TSUNG-CHUN; HSU, CHIH-CHUNG
Defense Date: 2020-06-19
Degree: Master's
University: National Yunlin University of Science and Technology
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis Type: Academic thesis
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: Chinese
Pages: 82
Keywords (Chinese, translated): narrow-band imaging; gastroscopic images; gastric cancer; deep learning; semantic segmentation; visual attention mechanism; pyramid spatial pooling
Keywords (English): Gastric cancer; Deep learning; U-net; Mask RCNN; Narrow-band imaging; Semantic segmentation; Neural network; Pyramid spatial pooling; Visual attention
Statistics:
  • Cited by: 3
  • Views: 449
  • Downloads: 0
  • Bookmarked: 0
Abstract (Chinese, translated): This thesis proposes a deep-learning-based method for detecting early-stage gastric cancer lesion regions in narrow-band imaging (NBI) gastroscopic images. Two network architectures are used: Mask RCNN and U-net. NBI gastroscopic images with annotated lesion regions are used to train the networks, yielding a classifier that automatically identifies and marks the lesion regions in test images. We also investigate how transfer learning and data augmentation can raise the detection accuracy when only a small set of medical images is available. In addition, we improve the U-net model by adding a visual attention mechanism, which suppresses the coarse features fed through the skip connections, and a pyramid spatial pooling (PSP) module, which collects features at different levels to improve the training results. In the experiments we use 66 color training images and 60 color test images; compared with the plain U-net model, the U-net model with the PSP module raises the highest precision rate by up to 21%. The proposed PSP U-net achieves a highest average precision rate of 90%, a highest average recall rate of 92%, and a highest intersection-over-union (IoU) of 60%.
Abstract (English): This thesis proposes a method based on deep learning (DL) networks to identify early-stage gastric cancer lesions in narrow-band imaging (NBI) gastroscopic images. We mainly use two DL networks, Mask RCNN and U-net, in our work. NBI gastroscopic images with marked lesion areas are fed to the DL networks for training, after which the resulting classifier can automatically identify and mark the lesion areas in test images. We also discuss how to improve the detection accuracy through transfer learning or data augmentation when only a small number of training images is available. We further improve the U-net with two schemes. First, we add a visual attention mechanism that suppresses the coarse features entering the skip connections. Second, we use a pyramid spatial pooling (PSP) module to collect features at different levels, which improves the training results. In our experiments, we used 66 color training images and 60 color test images. Adding the PSP module raises the highest precision rate of the U-net model by 21%. The proposed PSP U-net achieves a highest average precision rate of 86%, a highest average recall rate of 92%, and a highest average intersection-over-union (IoU) of 58%.
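For illustration, the two U-net add-ons named in the abstracts, the attention gate on the skip connections (following Oktay et al. [55]) and the pyramid spatial pooling module (following Zhao et al. [47]), together with the reported mask metrics, can be sketched as below. This is a minimal sketch only: the thesis does not state its software framework, so PyTorch is assumed here, and the channel widths and pooling bin sizes are illustrative defaults taken from the cited papers rather than the thesis's own settings.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    # Re-weights an encoder skip feature map x using a coarser decoder
    # gating signal g, suppressing coarse background responses before the
    # skip features are concatenated in the decoder (cf. [55]).
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, x, g):
        # Bring the gating signal up to the skip map's spatial size.
        g = F.interpolate(g, size=x.shape[2:], mode="bilinear", align_corners=False)
        alpha = torch.sigmoid(self.psi(F.relu(self.w_x(x) + self.w_g(g))))
        return x * alpha  # per-pixel attention coefficients in [0, 1]

class PSPModule(nn.Module):
    # Pools the input at several grid sizes, projects each pooled map with a
    # 1x1 convolution, upsamples it back, and concatenates everything with
    # the input so later layers see context at multiple scales (cf. [47]).
    def __init__(self, in_ch, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(in_ch, in_ch // len(bins), kernel_size=1))
            for b in bins])

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = [F.interpolate(stage(x), size=(h, w), mode="bilinear",
                                align_corners=False) for stage in self.stages]
        # Output has 2 * in_ch channels when in_ch is divisible by len(bins).
        return torch.cat([x] + pooled, dim=1)

def mask_metrics(pred, target, eps=1e-7):
    # Precision, recall, and intersection-over-union for binary lesion
    # masks, matching the three scores reported in the abstracts.
    pred, target = pred.bool(), target.bool()
    tp = (pred & target).sum().float()
    precision = tp / (pred.sum().float() + eps)
    recall = tp / (target.sum().float() + eps)
    iou = tp / ((pred | target).sum().float() + eps)
    return precision, recall, iou

In an Attention U-net, each decoder stage would apply AttentionGate to its incoming skip map before concatenation, and a PSP U-net would typically insert PSPModule at the bottleneck; both placements follow the cited papers and are not confirmed details of the thesis.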
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Tables vi
List of Figures vii
Chapter 1. Introduction 1
1.1 Research Motivation 1
1.2 Research Objectives 1
1.3 Research Methods 1
1.4 Thesis Organization 2
Chapter 2. Related Techniques and Research 3
2.1 Narrow-Band Imaging (NBI) 3
2.2 Lesion Region Assessment in Gastroscopic Images 4
2.3 Convolutional Neural Networks 7
2.3.1 Convolutional Layers 9
2.3.2 Pooling Layers 10
2.3.3 Deconvolution Layers 11
2.3.4 Fully Connected Layers 12
2.4 Semantic Segmentation 13
Chapter 3. Research Methods 15
3.1 System Architecture and Flowchart 15
3.2 Image Sources and Image Labeling 16
3.3 Image Preprocessing 17
3.4 Data Augmentation 21
3.5 Mask RCNN Neural Network 22
3.5.1 Instance Segmentation 24
3.5.2 ROI Align 25
3.5.3 Loss Function 26
3.5.4 Transfer Learning 26
3.6 U-net Neural Network 26
3.6.1 U-net Network Architecture 26
3.6.2 Visual Attention Mechanism 29
3.6.3 Pyramid Spatial Pooling Module 30
3.6.4 Loss Functions 32
Chapter 4. Experimental Results and Discussion 35
4.1 Experimental Environment 35
4.2 Experiment Overview 36
4.3 Experimental Parameter Settings 37
4.4 Mask RCNN Lesion Region Detection Results 37
4.5 U-net Lesion Region Detection Results 40
4.6 Attention U-net Lesion Region Detection Results 43
4.7 PSP U-net Lesion Region Detection Results 45
4.8 Lesion Region Detection Results with Different Loss Functions 53
4.9 Discussion of Experimental Results 59
Chapter 5. Conclusions 63
References 64


[1]World Health Organization, "Cancer," accessed September 12, 2018.
[2]D. Abdelhafiz, S. Nabavi, R. Ammar, C. Yang, and J. Bi, “Convolutional Neural Network for Automated Mass Segmentation in Mammography,” 2018 IEEE 8th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), Las Vegas, NV, 2018, pp. 1.
[3]K. Sirinukunwattana, S. E. A. Raza, Y. W. Tsang, D. R. Snead, I. A. Cree, and N. M. Rajpoot, "Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1196-1206, May 2016.
[4]G. N. van Muijen, D. J. Ruiter, W. W. Franke, T. Achtstätter, W. H. Haasnoot, M. Ponec, and S. O. Warnaar, "Cell type heterogeneity of cytokeratin expression in complex epithelia and carcinomas as demonstrated by monoclonal antibodies specific for cytokeratins nos. 4 and 13," Experimental Cell Research, vol. 162, no. 1, pp. 97-113, 1986.
[5]J. Yang, K. Yu, Y. Gong, and T. Huang, "Linear spatial pyramid matching using sparse coding for image classification," 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, 2009, pp. 1794-1801.
[6]Z. Gu et al., “CE-Net: Context Encoder Network for 2D Medical Image Segmentation,” IEEE Transactions on Medical Imaging, vol. 38, no. 10, pp. 2281-2292, Oct. 2019.
[7]Y. Yuan, “Automatic skin lesion segmentation with fully convolutional-deconvolutional networks,” arXiv preprint arXiv:1703.05165, 2017.
[8]R. Azad, M. Asadi-Aghbolaghi, M. Fathy, and S. Escalera, “Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions,” 2019 IEEE International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea (South), 2019, pp. 406-415.
[9]C. Mohamed, B. Nsiri, S. Abdelmajid, E. M. Abdelghani, and B. Brahim, "Deep Convolutional Networks for Image Segmentation: Application to Optic Disc detection," 2020 International Conference on Electrical and Information Technologies (ICEIT), Rabat, Morocco, 2020, pp. 1-3.
[10]J. Dolz, K. Gopinath, J. Yuan, H. Lombaert, C. Desrosiers, and I. Ben Ayed, "HyperDense-Net: A Hyper-Densely Connected CNN for Multi-Modal Image Segmentation," IEEE Transactions on Medical Imaging, vol. 38, no. 5, pp. 1116-1126, May 2019.
[11]T. Kanesaka, N. Uedo, K. Yao, Y. Ezoe, H. Doyama, I. Oda, et al., “A significant feature of microvessels in magnifying narrow-band imaging for diagnosis of early gastric cancer,” Endoscopy international open, vol. 3, pp. 590-596, 2015.
[12]T. Ojala, M. Pietikainen, and D. Harwood, “Performance evaluation of texture measures with classification based on Kullback discrimination of distributions,” Proceedings of 12th International Conference on Pattern Recognition, Jerusalem, 1994, pp. 582-585.
[13]N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, 2005, pp. 886-893.
[14]D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[15]G. Preethi and V. Sornagopal, “MRI image classification using GLCM texture features,” 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), Coimbatore, 2014, pp. 1-6.
[16]J. Ren, X. Jiang, and J. Yuan, “Relaxed local ternary pattern for face recognition,” 2013 IEEE International Conference on Image Processing, Melbourne, VIC, 2013, pp. 3680-3684.
[17]A. Barkun, S. Sabbah, R. Enns, D. Armstrong, J. Gregor, et al., "The Canadian Registry on Nonvariceal Upper Gastrointestinal Bleeding and Endoscopy (RUGBE): Endoscopic hemostasis and proton pump inhibition are associated with improved outcomes in a real-life setting," American Journal of Gastroenterology, vol. 99, no. 7, pp. 1238-1246, 2004.
[18]A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, 2012, pp. 1097-1105.
[19]A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, “CNN features off-the-shelf: an astounding baseline for recognition,” Proceedings of the IEEE conference on computer vision and pattern recognition workshops, Columbus, 2014, pp. 806-813.
[20]K. Hwang and W. Sung, “Fixed-point feedforward deep neural network design using weights +1, 0, and −1,” 2014 IEEE Workshop on Signal Processing Systems (SiPS), Belfast, 2014, pp. 1-6.
[21]Y. Gao, O. Beijbom, N. Zhang, and T. Darrell, “Compact Bilinear Pooling,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 317-326.
[22]V. Dumoulin and F. Visin, "A guide to convolution arithmetic for deep learning," arXiv preprint arXiv:1603.07285, 2016.
[23]V. Turchenko, E. Chalmers, and A. Luczak, "A deep convolutional auto-encoder with pooling-unpooling layers in caffe," arXiv preprint arXiv:1701.04949, 2017.
[24]J. Shotton, M. Johnson, and R. Cipolla, “Semantic texton forests for image categorization and segmentation,” 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, 2008, pp. 1-8.
[25]A. Liaw and M. Wiener, "Classification and regression by randomForest," R News, vol. 2, no. 3, pp. 18-22, 2002.
[26]J. Shotton, M. Johnson, and R. Cipolla, “Semantic texton forests for image categorization and segmentation,” 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, 2008, pp. 1-8.
[27]S. Hong, J. Oh, H. Lee, and B. Han, “Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 2016, pp. 3204-3212.
[28]T. Liu and T. Stathaki, “Enhanced pedestrian detection using deep learning based semantic image segmentation,” 2017 22nd International Conference on Digital Signal Processing (DSP), London, 2017, pp. 1-5.
[29]C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a deep convolutional network for image super-resolution," European Conference on Computer Vision, Springer, 2014, pp. 184-199.
[30]A. Arnab et al., “Conditional Random Fields Meet Deep Neural Networks for Semantic Segmentation: Combining Probabilistic Graphical Models with Deep Learning for Structured Prediction,” IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 37-52, Jan. 2018.
[31]J. R. R. Uijlings et al., "Selective search for object recognition," International Journal of Computer Vision, vol. 104, no. 2, pp. 154-171, 2013.
[32]R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 580-587.
[33]R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 580-587.
[34]S. Ren, K. He, R. Girshick, and J. Sun, "Faster RCNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, June 2017.
[35]K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask RCNN,” 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2980-2988.
[36]J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 3431-3440.
[37]O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234-241.
[38]H. Hu, Y. Zheng, Q. Zhou, J. Xiao, S. Chen, and Q. Guan, “MC-Unet: Multi-scale Convolution Unet for Bladder Cancer Cell Segmentation in Phase-Contrast Microscopy Images,” 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, USA, 2019, pp. 1197-1199.
[39]F. Dubost et al., "GP-Unet: Lesion detection from weak labels with a 3D regression network," International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2017, pp. 214-221.
[40]V. Zyuzin and T. Chumarnaya, “Comparison of Unet architectures for segmentation of the left ventricle endocardial border on two-dimensional ultrasound images,” 2019 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT), Yekaterinburg, Russia, 2019, pp. 110-113.
[41]F. Xu, H. Ma, J. Sun, R. Wu, X. Liu, and Y. Kong, “LSTM Multi-modal UNet for Brain Tumor Segmentation,” 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC), Xiamen, China, 2019, pp. 236-240.
[42]V. Mnih, N. Heess, and A. Graves, “Recurrent models of visual attention,” Advances in neural information processing systems (NIPS), Montréal, Canada, 2014, pp. 2204-2212.
[43]D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
[44]A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, et al. “Attention is all you need,” Advances in neural information processing systems (NIPS), Long Beach, 2017, pp. 5998-6008.
[45]F. Wang et al., “Residual Attention Network for Image Classification,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017, pp. 6450-6458.
[46]H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, “Self-attention generative adversarial networks,” arXiv preprint arXiv:1805.08318, 2018.
[47]H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid Scene Parsing Network," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017, pp. 6230-6239.
[48]R. Keys, “Cubic convolution interpolation for digital image processing,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, no. 6, pp. 1153-1160, December 1981.
[49]C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[50]J. Dai, K. He, and J. Sun, "Instance-aware semantic segmentation via multi-task network cascades," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 3150-3158.
[51]K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask RCNN," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2980-2988.
[52]P. Yang, Q. Tan, and Y. Ding, "Bayesian Task-Level Transfer Learning for Non-linear Regression," 2008 International Conference on Computer Science and Software Engineering, Hubei, 2008, pp. 62-65.
[53]X. Chen et al., "Microsoft COCO captions: Data collection and evaluation server," arXiv preprint arXiv:1504.00325, 2015.
[54]L. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834-848, April 2018.
[55]O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, B. Glocker, et al., "Attention U-net: Learning where to look for the pancreas," arXiv preprint arXiv:1804.03999, 2018.
[56]Y. Sai, R. Jinxia and L. Zhongxia, “Learning of Neural Networks Based on Weighted Mean Squares Error Function,” 2009 Second International Symposium on Computational Intelligence and Design, Changsha, 2009, pp. 241-244.
[57]K. K. Darrow, “Entropy,” The Bell System Technical Journal, vol. 21, no. 1, pp. 51-74, June 1942.
[58]F. Milletari, N. Navab, and S. Ahmadi, “V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation,” 2016 Fourth International Conference on 3D Vision (3DV), Stanford, 2016, pp. 565-571.
[59]D. Zhou et al., “IoU Loss for 2D/3D Object Detection,” 2019 International Conference on 3D Vision (3DV), Québec City, QC, Canada, 2019, pp. 85-94.
[60]V. J. Mathews and Z. Xie, "A stochastic gradient adaptive filter with gradient adaptive step size," IEEE Transactions on Signal Processing, vol. 41, no. 6, pp. 2075-2087, June 1993.
[61]D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.

Electronic Full Text (internet public release date: 2025-07-08)