National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: 鄭丞甫
Author (romanized): CHENG, CHEN-PU
Thesis title (Chinese): 以資料擴增方法提升AI模型多視角泛用性之研究與分析
Thesis title (English): Research and analysis on improving the multi-perspective versatility of AI models with data augmentation methods
Advisor: 陳彥霖
Advisor (romanized): CHEN, YEN-LIN
Oral defense committee: 陳彥霖; 范育成; 蔣欣翰; 黃志勝
Oral defense committee (romanized): CHEN, YEN-LIN; FAN, YU-CHENG; CHIANG, HSIN-HAN; HUANG, CHIH-SHENG
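Oral defense date: 2022-07-26
Degree: Master's
Institution: National Taipei University of Technology (國立臺北科技大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2022
Graduation academic year: 110 (ROC calendar)
Language: Chinese
Number of pages: 66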
Keywords (Chinese): 機器學習; 視野合成; 資料擴增
Keywords (English): Machine learning; View synthesis; Data augmentation
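Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 2.5D View Synthesis Methods
2.2 Image Encoding Methods and Generative Models
2.2.1 AutoEncoder
2.2.2 StyleFlow
2.3 Traffic Object Recognition Methods
2.3.1 YOLOv4
2.3.2 YOLACT
Chapter 3 Methodology
3.1 2.5D View Synthesis Method
3.1.1 Extracting Image Latent Codes
3.1.2 Incorporating Angle Information
3.2 Object Relabeling
3.2.1 Automatic Object Labeling Method
3.2.2 Post-processing of Labeling Results
3.3 Data Augmentation System Pipeline
Chapter 4 Experimental Results and Analysis
4.1 Experimental Environment
4.2 Datasets
4.3 Evaluation of the 2.5D View Synthesis Method
4.4 Evaluation of System Experiments
4.4.1 Evaluation on the Self-Collected Dataset
4.4.2 Evaluation on Public Datasets
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References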
[1] 臺北市政府警察局文山第二分局 (Wenshan Second Precinct, Taipei City Police Department), "Tour bus strikes bicycle in fatal crash; police urge caution around the blind spots of large vehicles" (in Chinese), Nov. 13, 2019. Accessed on: July 20, 2022. [Online]. Available: https://police.gov.taipei/News_Content.aspx?n=471D7CA98EADC7B6&sms=72544237BBE4C5F6&s=D9C49A543F472AB8&ccms_cs=1
[2] 游鎧丞, "Footage lets you speak with confidence! Points to watch for when buying a dashcam" (in Chinese), ETtoday新聞雲, Feb. 14, 2021. Accessed on: July 20, 2022. [Online]. Available: https://speed.ettoday.net/news/1894678
[3] 黃瀞瑩, 鍾尹倫, "New rule from June 1! Fines of up to 24,000 for vehicles without a 'driving vision assistance' system" (in Chinese), yahoo!新聞, June 1, 2021. Accessed on: July 20, 2022. [Online]. Available: https://tw.news.yahoo.com/6-1%E6%96%B0%E8%A6%8F-%E6%9C%AA%E8%A3%9D-%E8%A1%8C%E8%BB%8A%E8%A6%96%E9%87%8E%E8%BC%94%E5%8A%A9-%E6%9C%80%E9%AB%98%E7%BD%B024000-042903745.html
[4] M. -L. Shih, S. -Y. Su, J. Kopf and J. -B. Huang, "3D Photography Using Context-Aware Layered Depth Inpainting," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8025-8035, doi: 10.1109/CVPR42600.2020.00805.
[5] R. Tucker and N. Snavely, "Single-View View Synthesis With Multiplane Images," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 548-557, doi: 10.1109/CVPR42600.2020.00063.
[6] R. Rombach, P. Esser and B. Ommer, "Geometry-Free View Synthesis: Transformers and no 3D Priors," 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 14336-14346, doi: 10.1109/ICCV48922.2021.01409.
[7] K. Nazeri, E. Ng, T. Joseph, F. Z. Qureshi and M. Ebrahimi, "EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning", 2019, arXiv:1901.00212 [cs.CV]
[8] G. Liu, F. A. Reda, K. J. Shih, T.-C. Wang, A. Tao and B. Catanzaro, "Image Inpainting for Irregular Holes Using Partial Convolutions", 2018, arXiv:1804.07723 [cs.CV]
[9] R. G. de Albuquerque Azevedo and G. F. Lima, "A graphics composition architecture for multimedia applications based on layered-depth-image," 2016 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2016, pp. 1-4, doi: 10.1109/3DTV.2016.7548882.
[10] W. E. Lorensen and H. E. Cline, "Marching cubes: A high resolution 3D surface construction algorithm," ACM siggraph computer graphics, vol. 21, no. 4, pp. 163–169, 1987.
[11] Wikipedia, "Polygon mesh", July 8, 2022. Accessed on: July 20, 2022. [Online]. Available: https://en.wikipedia.org/wiki/Polygon_mesh
[12] Richard Wright, "Lesson 21 - Orthographic Projections", 2017. Accessed on: July 20, 2022. [Online]. Available: https://www.geofx.com/graphics/nehe-three-js/lessons17-24/lesson21/lesson21.html
[13] M. Mehralian and B. Karasfi, "RDCGAN: Unsupervised Representation Learning With Regularized Deep Convolutional Generative Adversarial Networks," 2018 9th Conference on Artificial Intelligence and Robotics and 2nd Asia-Pacific International Symposium, 2018, pp. 31-38, doi: 10.1109/AIAR.2018.8769811.
[14] D. P. Kingma and M. Welling, “Auto-Encoding Variational Bayes”, 2013, arXiv:1312.6114 [stat.ML]
[15] R. Abdal, P. Zhu, N. Mitra and P. Wonka, "StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows", 2020, arXiv:2008.02401 [cs.CV]
[16] W. Grathwohl, R. T. Q. Chen, J. Bettencourt, I. Sutskever and D. Duvenaud, “FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models”, 2018, arXiv:1810.01367 [cs.LG]
[17] E. Richardson et al., "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation," 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 2287-2296, doi: 10.1109/CVPR46437.2021.00232.
[18] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen and T. Aila, "Analyzing and Improving the Image Quality of StyleGAN," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8107-8116, doi: 10.1109/CVPR42600.2020.00813.
[19] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580-587, doi: 10.1109/CVPR.2014.81.
[20] S. Ren, K. He, R. Girshick and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, 2015, arXiv:1506.01497 [cs.CV]
[21] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91.
[22] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu and A. C. Berg, "SSD: Single Shot Multibox Detector," in European conference on computer vision, 2016, pp. 21-37: Springer.
[23] A. Bochkovskiy, C.-Y. Wang and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection”, 2020, arXiv:2004.10934 [cs.CV]
[24] D. Bolya, C. Zhou, F. Xiao and Y. J. Lee, "YOLACT: Real-Time Instance Segmentation," 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 9156-9165, doi: 10.1109/ICCV.2019.00925.
[25] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical image computing and computer-assisted intervention, 2015, pp. 234-241: Springer.
[26] R. Zhang, P. Isola, A. A. Efros, E. Shechtman and O. Wang, "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 586-595, doi: 10.1109/CVPR.2018.00068.
[27] M. Cordts et al., "The Cityscapes Dataset for Semantic Urban Scene Understanding," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 3213-3223, doi: 10.1109/CVPR.2016.350.
[28] C.-Y. Wang, A. Bochkovskiy and H.-Y. M. Liao, “YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors”, 2022, arXiv:2207.02696 [cs.CV]
Electronic full text (Internet public release date: 2024-08-31)