National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record

Author: 黃子魁
Author (English): Tzu-Kuei Huang
Thesis title: 多視角影像之計算攝影學應用
Thesis title (English): Computational Photography Applications on Multiview Images
Advisor: 莊永裕 (Yung-Yu Chuang)
Oral defense date: 2017-07-28
Degree: Doctoral
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Academic field: Electrical and Computer Engineering
Document type: Academic thesis
Year of publication: 2017
Graduation academic year: 105 (2016-2017)
Language: English
Pages: 66
Keywords (Chinese): 計算攝影學、多視角影像、虛擬視角影像生成、影像形變、三維重建、影像切割、光一致性
Keywords (English): computational photography, multi-view images, view synthesis, mesh warping, 3D reconstruction, segmentation, photo-consistency
Statistics:
  • Cited by: 1
  • Views: 363
  • Downloads: 0
  • Bookmarks: 0
Abstract (translated from the Chinese)

Multi-view images provide more scene information than a single image, and this additional information benefits a variety of computational photography applications. This thesis presents three applications of multi-view images. The first part describes how to synthesize novel-view images and videos from binocular images and videos. Autostereoscopic displays need multiple views simultaneously to present stereoscopic content effectively, yet ordinary stereoscopic capture equipment usually records only two views. Traditionally, the two images are used to estimate scene depth, and depth-image-based rendering then synthesizes the remaining views. Such methods depend heavily on the quality of the estimated depth, but computing depth from only two images is difficult and error-prone. We therefore propose a warping-based method for synthesizing virtual views that requires neither depth estimation nor user intervention. We first compute dense feature correspondences between the two images, then use these correspondences to guide the warping, exploiting image features to ensure that the scene's characteristics are preserved after deformation. Compared with traditional methods, our approach efficiently produces high-quality multi-view images while avoiding tedious parameter tuning. With it, binocular images and videos captured by a stereo camera can be converted directly into multi-view stereoscopic content playable on autostereoscopic displays.

3D printing has become an important and widespread tool, so acquiring the 3D models users need has become an important problem. Reconstructing a good 3D model from multiple images is a common way to obtain one. However, the models built by existing popular methods are unreliable: for various reasons, the reconstructions are often fragmented and noisy. We therefore propose a robust system that reconstructs 3D models from object silhouettes. The second part of this thesis describes how to compute the color, spatial, appearance, and 3D relationships among the views, and how to use these relationships in an optimization that effectively separates the object from the background. Once segmentations are available for all views, the visual hull method reconstructs the 3D model. The main flaw of visual hull reconstruction, however, is that concave regions of the object cannot be recovered. To remedy this, the third part of the thesis proposes a method that exploits photo-consistency across the images to reconstruct object details: regions that truly belong to the object surface are identified and used to refine the visual hull result. The thesis concludes by showing 3D models reconstructed with our methods.
Multi-view images provide more useful information than a single image when correspondences can be found between the views. This additional information can improve many computational photography applications. This thesis presents three applications of multi-view images. The first is a warping-based novel view synthesis framework for both binocular stereoscopic images and videos. Large autostereoscopic displays require multiple views, while most stereoscopic cameras and video recorders can capture only two. Obtaining accurate depth maps from two-view images or video remains difficult and time-consuming, yet popular novel view synthesis methods, such as depth-image-based rendering (DIBR), often rely heavily on them. The proposed framework requires neither depth maps nor user intervention. Dense, reliable features are extracted to establish correspondences between the two views. The images are then warped based on these correspondences to synthesize novel views while simultaneously maintaining stereoscopic properties, preserving image structures, and keeping temporal coherence in video. Our method produces higher-quality multi-view images and videos more efficiently and without tedious parameter tuning. It can convert stereoscopic images and videos taken by binocular cameras into multi-view images and videos ready to be displayed on autostereoscopic displays.
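The view-synthesis idea above can be sketched numerically: given matched feature points in the left and right views, feature positions for intermediate virtual cameras can be obtained by linearly interpolating the matched coordinates, and the image is then warped toward those targets. A minimal NumPy sketch with hypothetical correspondences (the actual system uses semi-dense correspondences and content-preserving mesh warps, not this toy interpolation):

```python
import numpy as np

def interpolate_views(pts_left, pts_right, n_views):
    """For each virtual view, linearly interpolate matched feature
    positions between the left (alpha = 0) and right (alpha = 1)
    images. Returns an array of shape (n_views, n_points, 2)."""
    alphas = np.linspace(0.0, 1.0, n_views)
    return np.stack([(1 - a) * pts_left + a * pts_right for a in alphas])

# Hypothetical correspondences: two matched points with 8 px disparity.
pts_l = np.array([[100.0, 50.0], [200.0, 80.0]])
pts_r = np.array([[108.0, 50.0], [208.0, 80.0]])

views = interpolate_views(pts_l, pts_r, 5)
# The middle view's features lie halfway between the two inputs.
print(views[2])  # [[104.  50.] [204.  80.]]
```

Each row of `views` would drive one mesh warp, so a two-view capture yields the full set of views an autostereoscopic display expects.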

3D printing has become an important and prevalent tool. Image-based modeling is a popular way to acquire 3D models for further editing and printing. However, existing tools are often not robust enough for users to obtain the 3D models they want: the reconstructed models are often incomplete, disjoint, and noisy. We therefore propose a shape-from-silhouette system that reconstructs 3D models more robustly.
The second part of this thesis introduces a robust automatic method for segmenting an object from the background using a set of multi-view images. Segmentation is performed by minimizing an energy function that incorporates color statistics, spatial coherency, appearance proximity, epipolar constraints, and back-projection consistency of 3D feature points. This energy can be efficiently minimized using the min-cut algorithm.
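The shape of such an energy can be illustrated at toy scale. The sketch below builds a standard unary-plus-pairwise energy over a tiny 1-D "image" and finds the minimum-energy foreground/background labeling by exhaustive search; this brute force stands in for the min-cut/max-flow solver used in practice, and the cost values are invented for illustration (in the real system the unary costs come from color statistics and the other cues listed above):

```python
import itertools

# Invented unary costs: unary[i][l] = penalty of giving pixel i
# label l (0 = background, 1 = foreground).
unary = [(0.1, 2.0), (0.2, 1.5), (1.8, 0.3), (2.0, 0.2), (0.3, 1.2)]
LAMBDA = 0.8  # smoothness weight: penalty when neighbors disagree

def energy(labels):
    e = sum(unary[i][l] for i, l in enumerate(labels))           # data term
    e += LAMBDA * sum(labels[i] != labels[i + 1]                 # smoothness
                      for i in range(len(labels) - 1))
    return e

# Exhaustive search over all 2^5 labelings; min-cut solves this
# class of (submodular) energies exactly in polynomial time.
best = min(itertools.product((0, 1), repeat=len(unary)), key=energy)
print(best)  # (0, 0, 1, 1, 0): a coherent foreground run in the middle
```

The smoothness term is what keeps the labeling spatially coherent: without it, each pixel would simply take its cheaper unary label.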
Given the segmentations, the visual hull method is applied to reconstruct the 3D model of the object. The primary weakness of this approach, however, is its inability to reproduce concave regions.
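Given per-view silhouettes and camera projections, the visual hull can be carved on a voxel grid: a voxel survives only if it projects inside the silhouette in every view. A toy NumPy sketch with two orthographic "cameras" looking along the z- and x-axes (the real system uses calibrated perspective cameras, and the silhouettes here are invented rectangles):

```python
import numpy as np

N = 16  # voxel grid resolution
# Silhouette seen by a camera looking along +z: occupied (x, y) region.
sil_xy = np.zeros((N, N), dtype=bool)
sil_xy[4:12, 4:12] = True
# Silhouette seen by a camera looking along +x: occupied (y, z) region.
sil_yz = np.zeros((N, N), dtype=bool)
sil_yz[4:12, 6:10] = True

# Carve: keep voxel (x, y, z) only if its orthographic projection
# falls inside the silhouette of every view.
x, y, z = np.indices((N, N, N))
hull = sil_xy[x, y] & sil_yz[y, z]

print(hull.sum())  # 8 * 8 * 4 = 256 voxels survive
```

This also makes the weakness visible: a bowl-shaped indentation never changes any silhouette, so carving alone can never recover it.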
To fix this problem, the third part of this thesis exploits the photo-consistency principle across the multiple views. Voxels that belong to the object surface are identified and used to refine the model, recovering more detail. Experiments show that the proposed method generates better models than some popular systems.
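Photo-consistency reduces to a color-agreement check: a voxel on the true surface projects to similar colors in the views that see it, while a voxel floating in free space generally does not. A toy variance-based check (the thesis uses an occlusion-robust formulation; the threshold and color samples below are invented for illustration):

```python
import numpy as np

def photo_consistent(colors, threshold=20.0):
    """colors: (n_views, 3) RGB samples of one voxel's projections.
    Deem the voxel on-surface if the per-channel standard deviation
    of the samples stays below the threshold."""
    colors = np.asarray(colors, dtype=float)
    return bool(np.all(colors.std(axis=0) < threshold))

# A surface voxel: every view sees nearly the same red material.
on_surface = [[200, 30, 30], [195, 35, 28], [205, 28, 33]]
# A free-space voxel: the views see unrelated background colors.
off_surface = [[200, 30, 30], [20, 180, 60], [90, 90, 200]]

print(photo_consistent(on_surface))   # True
print(photo_consistent(off_surface))  # False
```

Applying such a test inside the visual hull is what lets concave regions, invisible to silhouettes, be carved away.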
Table of Contents

Acknowledgements iii
Abstract (Chinese) v
Abstract vii
1 Introduction 1
2 Novel View Synthesis from Binocular to Multi-view using a Warping-Based Method 3
  2.1 Overview 3
  2.2 Related Work 5
  2.3 Multi-View Image Synthesis 6
    2.3.1 Semi-Dense Stereo Correspondence 8
    2.3.2 Virtual View Generation 9
    2.3.3 Modified Content-Preserving Warps 10
  2.4 Multi-View Video Synthesis 15
    2.4.1 Stereo and Temporal Correspondence 15
    2.4.2 Virtual View Generation 18
    2.4.3 Triangular Mesh Warps 19
  2.5 Results 19
    2.5.1 Novel View Image Synthesis Results 20
    2.5.2 Novel View Video Synthesis Results 22
  2.6 Conclusion and Future Work 23
3 A Robust Automatic Object Segmentation Method for 3D Printing 27
  3.1 Overview 27
  3.2 Related Work 28
  3.3 Automatic Object Segmentation 29
    3.3.1 Preprocessing 31
    3.3.2 Energy Terms 32
    3.3.3 Optimization 37
    3.3.4 Visual Hull 38
  3.4 Experiments 38
  3.5 Conclusions 41
4 Volumetric Graph-cut Model Reconstruction Method for 3D Printing 43
  4.1 Overview 43
  4.2 Related Work 46
  4.3 Volumetric Model Reconstruction 47
    4.3.1 Visual Hull 47
    4.3.2 Surface Likelihood Term 48
    4.3.3 Optimization 50
    4.3.4 Surface Reconstruction 50
  4.4 Experiments 51
  4.5 Conclusions and Limitations 57
5 Conclusions 59
Bibliography 61