
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: 林昀宣
Author (English): Yun-Hsuan Lin
Title: 基於深度學習方法之單張文件影像陰影去除
Title (English): Deep Learning-based Approach for Single Document Image Shadow Removal
Advisor: 陳文進
Advisor (English): Wen-Chin Chen
Committee: 陳駿丞、林彥宇、王鈺強
Committee (English): Jun-Cheng Chen, Yen-Yu Lin, Yu-Chiang Wang
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Networking and Multimedia
Discipline: Computing
Field: Networking
Thesis type: Academic thesis
Year of publication: 2019
Academic year of graduation: 108
Language: English
Pages: 41
Keywords (Chinese): 陰影去除、文件影像處理、深度學習、條件生成對抗式網路
Keywords (English): Shadow Removal, Document Image Processing, Deep Learning, Conditional Generative Adversarial Network
DOI:10.6342/NTU201902033
Record statistics: Cited by: 0 | Views: 153 | Downloads: 0 | Bookmarks: 0
Abstract (Chinese, translated): In this thesis, we propose a deep learning model, BEDSR-Net, designed specifically for shadow removal from general document images. Documents usually share a single global background color, so we use a deep learning approach to teach the model to predict the global background color of the whole document. During training, the model also learns where the shadowed and non-shadowed regions of a document image lie; by visualizing an intermediate feature map as a heatmap, this heatmap can be interpreted as a shadow mask expressing the shadow distribution in the image. With the help of the global background color and the shadow location information, our proposed architecture, BEDSR-Net, removes shadows from the input image effectively and, in most evaluations, outperforms previous methods across the reported metrics. Moreover, although BEDSR-Net is trained only on a synthetic dataset, it still performs strongly on real benchmark datasets, which shows that our architecture contributes clearly to the model's stability. For the document-image shadow removal task, we also collected two datasets: a synthetic dataset, SDSRD, and a real dataset, DSRD. The former provides sufficient training data for deep learning in this field and achieves good diversity in both document types and lighting complexity; the latter covers a large number of complex documents and can serve as a more general benchmark for comparing model performance.
In this paper, we propose a novel deep neural network architecture, named BEDSR-Net, designed to remove shadows from document images. Based on our observation that documents usually have a single global background color, we use a deep learning technique to estimate that color from a document image. During training, our model also learns the shadow distribution in an image, including its intensity and location. We further visualize this knowledge in the form of a heatmap, which precisely denotes the shadow locations. With the assistance of the global background color and the heatmap, our model, BEDSR-Net, achieves state-of-the-art results in most evaluations against previous work on document image shadow removal. Moreover, although trained only on a synthetic dataset, our model still outperforms others on real benchmark datasets, which demonstrates the stability and robustness of the proposed architecture. In addition, we collect two datasets for this task: a synthetic dataset (SDSRD) and a real dataset (DSRD). The former enables the training of deep learning approaches for this task, while the latter can serve as a much more general benchmark. Both SDSRD and DSRD aim to capture more diverse scenarios.
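The key observation above — a document has one global background color, and shadows attenuate it locally — can be illustrated with a minimal NumPy sketch. This is not the BEDSR-Net architecture itself (the network and its training are described in Chapter 3); the function `remove_shadow` and the idea of dividing out a per-pixel attenuation map are illustrative assumptions, closer in spirit to the classical approach of Bako et al. [1].

```python
import numpy as np

def remove_shadow(img, local_bg, global_bg):
    """Illustrative shadow correction: divide out the estimated attenuation.

    img       -- shadowed image, floats in [0, 1], shape (H, W, 3)
    local_bg  -- estimated per-pixel background color, shape (H, W, 3)
    global_bg -- estimated global background color, shape (3,)
    """
    # The shadow (attenuation) map measures how much each pixel's
    # background is darkened relative to the global background color.
    shadow_map = local_bg / np.maximum(global_bg, 1e-6)
    # Undo the attenuation and clamp back to a valid range.
    return np.clip(img / np.maximum(shadow_map, 1e-6), 0.0, 1.0)

# Toy example: a uniform gray page whose background is half as bright
# under shadow; the correction roughly doubles the shadowed intensities.
img = np.full((2, 2, 3), 0.3)
local_bg = np.full((2, 2, 3), 0.45)
global_bg = np.array([0.9, 0.9, 0.9])
restored = remove_shadow(img, local_bg, global_bg)
```

In BEDSR-Net, by contrast, the global background color is predicted by GBCE-Net, the spatial shadow information comes from the attention heatmap, and the correction itself is learned by SR-Net rather than computed in closed form.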
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
1 Introduction 1
2 Related Work 3
2.1 Generic images shadow removal . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Document images shadow removal . . . . . . . . . . . . . . . . . . . . . 4
3 The Proposed Approach 5
3.1 Preliminary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.1 Mixture model of ST-CGAN and Bako . . . . . . . . . . . . . . 7
3.1.2 ST-CGAN Plus Background Estimation (ST-CGAN-BG) . . . . . 9
3.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3 Background Estimation Document Shadow Removal Network (BEDSR-Net) 12
3.3.1 Global Background Color Estimator Network (GBCE-Net) . . . . 12
3.3.2 Shadow Removal Network (SR-Net) . . . . . . . . . . . . . . . . 15
3.4 Network Architecture and Implementation Details . . . . . . . . . . . . . 17
3.4.1 Global Background Color Estimator Network (GBCE-Net) . . . . 17
3.4.2 Shadow Remover Network (SR-Net) . . . . . . . . . . . . . . . 17
3.4.3 Training Detail . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 Dataset 18
4.1 Document Shadow Removal Dataset (DSRD) . . . . . . . . . . . . . . . 18
4.1.1 Preparation Pipeline . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.2 Properties of DSRD . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Synthetic Document Shadow Removal Dataset (SDSRD) . . . . . . . . . 23
4.3 Summary of Document Shadow Removal Datasets . . . . . . . . . . . . 25
5 Experiment 27
5.1 Evaluation Methods and Metrics . . . . . . . . . . . . . . . . . . . . . . 27
5.2 Pixel-wise comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.3 Visual quality comparison . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.4 Why global background color? . . . . . . . . . . . . . . . . . . . . . . . 31
5.5 Shadow mask vs. heatmap . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.6 Group Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.6.1 Quantitative experiments on group comparison . . . . . . . . . . 34
5.6.2 Qualitative experiments on group comparison . . . . . . . . . . . 36
6 Conclusion 39
Bibliography 40
[1] S. Bako, S. Darabi, E. Shechtman, J. Wang, K. Sunkavalli, and P. Sen. Removing shadows from images of documents. In Asian Conference on Computer Vision, pages 173–183. Springer, 2016.
[2] C. Clausner, A. Antonacopoulos, and S. Pletschacher. ICDAR2017 competition on recognition of documents with complex layouts - RDCL2017. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), volume 1, pages 1404–1410. IEEE, 2017.
[3] X. Huang, G. Hua, J. Tumblin, and L. Williams. What characterizes a shadow boundary under the sun and sky? In 2011 International Conference on Computer Vision, pages 898–905. IEEE, 2011.
[4] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1125–1134, 2017.
[5] S. Jung, M. A. Hasan, and C. Kim. Water-filling: An efficient algorithm for digitized document shadow removal. In Asian Conference on Computer Vision, pages 398–414. Springer, 2018.
[6] N. Kligler, S. Katz, and A. Tal. Document enhancement using visibility detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2374–2382, 2018.
[7] M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
[8] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
[9] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
[10] R. W. Smith. Hybrid page layout analysis via tab-stop detection. In 2009 10th International Conference on Document Analysis and Recognition, pages 241–245. IEEE, 2009.
[11] T. F. Y. Vicente, L. Hou, C.-P. Yu, M. Hoai, and D. Samaras. Large-scale training of shadow detectors with noisily-annotated shadow examples. In European Conference on Computer Vision, pages 816–832. Springer, 2016.
[12] T. F. Y. Vicente, M. Hoai, and D. Samaras. Leave-one-out kernel optimization for shadow detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 3388–3396, 2015.
[13] J. Wang, X. Li, and J. Yang. Stacked conditional generative adversarial networks for jointly learning shadow detection and shadow removal. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1788–1797, 2018.