Author: 蘇宗敏 (Tzung-Min Su)
Title (Chinese): 以二維影像與漸進式相似度外觀圖解法為基礎之穩健三維物體辨識
Title (English): Robust 3D Object Recognition using 2D Views via an Incremental Similarity-Based Aspect-Graph Approach
Advisor: 胡竹生 (Jwu-Sheng Hu)
Degree: Doctoral
Institution: National Chiao Tung University
Department: Department of Electrical and Control Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2007
Graduation academic year: 96 (ROC academic year)
Language: English
Pages: 84
Keywords (Chinese): 三維物件辨識、人形姿態辨識、場景辨識、外觀圖解法、背景濾除、高斯混合模型
Keywords (English): 3D Object Recognition; Human Posture Recognition; Scene Recognition; Aspect-Graph; Background Subtraction; Gaussian Mixture Model
Usage statistics:
  • Cited by: 0
  • Views: 386
  • Downloads: 93
  • Bookmarked: 0
Abstract (Chinese):
This dissertation presents a framework for robust 3D object recognition from 2D views. The framework consists of two main parts. The first part is a pre-processing stage that extracts foreground objects from 2D views for subsequent learning and recognition. The second part is an incremental database construction method that builds a 3D object database from 2D views of an object captured from different viewpoints, and updates the database as new views are captured.
In the pre-processing stage, we propose a background subtraction scheme with highlight and shadow removal (BSHSR), so that foreground objects can be extracted accurately despite illumination changes and dynamic backgrounds. The BSHSR consists of three models: a color-based probabilistic background model (CBM), a gradient-based background model (GBM) derived from the CBM, and a cone-shaped illumination model (CSIM). The CBM is built by modeling the statistics of each pixel's values with a Gaussian mixture model (GMM). From the CBM, a short-term color-based background model (STCBM) and a long-term color-based background model (LTCBM) are derived, and these are then used to construct the GBM. To distinguish foreground, highlight, and shadow, this work proposes the CSIM, a model with a dynamic cone-shaped boundary in the RGB color space.
For incremental database construction, we propose an incremental learning framework based on a similarity-based aspect graph (ISAG). With the similarity-based aspect graph, each 3D object in the database is represented by a set of aspects; each aspect contains a varying number of 2D views and is represented by a characteristic view. The proposed incremental construction method aims to maximize the similarity among 2D views belonging to the same aspect while minimizing the similarity among characteristic views. In addition, to imitate the way humans learn to recognize objects, 2D views of a 3D object captured from randomly sampled viewpoints are used as training images, and the object's database is updated as more 2D views are collected.
Finally, the feasibility of the proposed BSHSR is demonstrated with experiments on several video sequences recorded in a real, complex environment, and the BSHSR is then applied in the 3D object recognition framework to extract foreground objects from 2D views. To verify the effectiveness of the proposed framework, the ISAG is combined with shape and color features and applied to three different 3D object recognition problems: rigid object recognition, human posture recognition, and scene recognition. The recognition results demonstrate the feasibility of the proposed framework.
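The CBM above is built by fitting a Gaussian mixture model to the colour statistics of each pixel over time. As a rough, self-contained illustration of this general per-pixel GMM idea only (not the dissertation's exact CBM, STCBM, or LTCBM formulation; the component count, learning rate, matching threshold, and simplified responsibility below are assumed values), an online update for a single pixel might look like this:

import numpy as np

K = 3              # number of Gaussian components per pixel (assumed)
ALPHA = 0.01       # learning rate (assumed)
MATCH_SIGMA = 2.5  # a pixel matches a component within 2.5 standard deviations (assumed)

def update_pixel_gmm(x, weights, means, variances):
    """Update one pixel's GMM with a new RGB observation x of shape (3,).

    Returns True if x matched an existing component (treated as background
    in this simplified sketch).
    """
    # Distance of the observation to every component mean.
    dists = np.linalg.norm(means - x, axis=1)
    matches = dists < MATCH_SIGMA * np.sqrt(variances)

    if matches.any():
        # Update the closest matching component with standard online rules.
        k = int(np.argmin(np.where(matches, dists, np.inf)))
        weights[:] = (1 - ALPHA) * weights
        weights[k] += ALPHA
        rho = ALPHA  # simplified responsibility (assumption)
        means[k] = (1 - rho) * means[k] + rho * x
        diff = x - means[k]
        variances[k] = (1 - rho) * variances[k] + rho * float(np.dot(diff, diff))
        return True

    # No match: replace the weakest component with a new one centred on x.
    k = int(np.argmin(weights))
    weights[k], variances[k] = ALPHA, 900.0
    means[k] = np.asarray(x, dtype=float)
    weights /= weights.sum()
    return False

# Usage for one pixel (state shapes: weights (K,), means (K, 3), variances (K,)):
# weights = np.full(K, 1.0 / K)
# means = np.zeros((K, 3))
# variances = np.full(K, 900.0)
# is_bg = update_pixel_gmm(np.array([120.0, 110.0, 100.0]), weights, means, variances)

A full background model would maintain such state for every pixel and would typically rank components by weight and variance to decide which of them describe the background; the short-term and long-term variants described above would additionally keep statistics over different time windows.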
Abstract (English):
This work presents a framework for robust recognition of 3D objects from 2D views. The proposed framework comprises two stages: a pre-processing stage and an incremental database construction stage. In the pre-processing stage, foreground objects are extracted from 2D views and used for building the 3D object database and for recognition. In the incremental database construction stage, a 3D object database is built and updated using 2D views randomly sampled from a viewing sphere.
A background subtraction scheme with highlight and shadow removal (BSHSR) is proposed as the pre-processing stage of the framework. With the BSHSR, foreground regions can be precisely extracted from 2D views despite illumination variations and dynamic backgrounds. The BSHSR comprises three models: the color-based probabilistic background model (CBM), the gradient-based version of the color-based probabilistic background model (GBM), and the cone-shaped illumination model (CSIM). A Gaussian mixture model (GMM) is applied to construct the CBM from per-pixel statistics. Based on the CBM, a short-term color-based background model (STCBM) and a long-term color-based background model (LTCBM) are extracted and used to build the GBM. Furthermore, a new dynamic cone-shaped boundary in the RGB color space, the CSIM, is proposed to classify pixels as shadow, highlight, or foreground.
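The CSIM separates shadow and highlight from true foreground by testing whether an observed colour stays close to the direction of the background colour vector in RGB space. The sketch below illustrates that kind of cone test with fixed bounds; the cone half-angle and the brightness-ratio ranges are assumptions, and the dissertation's CSIM uses a dynamic cone boundary rather than these constants:

import numpy as np

MAX_ANGLE_DEG = 6.0             # half-angle of the cone around the background colour (assumed)
SHADOW_RANGE = (0.5, 0.95)      # brightness ratio treated as shadow (assumed)
HIGHLIGHT_RANGE = (1.05, 1.5)   # brightness ratio treated as highlight (assumed)

def classify_pixel(obs_rgb, bg_rgb):
    """Label an observed pixel against its modelled background colour.

    Returns one of 'background', 'shadow', 'highlight', 'foreground'.
    """
    obs = np.asarray(obs_rgb, dtype=float)
    bg = np.asarray(bg_rgb, dtype=float)

    # Angle between the observed colour vector and the background colour vector.
    cos_angle = np.dot(obs, bg) / (np.linalg.norm(obs) * np.linalg.norm(bg) + 1e-9)
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

    # Brightness ratio along the background colour direction.
    ratio = np.dot(obs, bg) / (np.dot(bg, bg) + 1e-9)

    if angle > MAX_ANGLE_DEG:
        return "foreground"      # chromaticity changed: likely a real object
    if SHADOW_RANGE[0] <= ratio <= SHADOW_RANGE[1]:
        return "shadow"          # same colour direction, darker
    if HIGHLIGHT_RANGE[0] <= ratio <= HIGHLIGHT_RANGE[1]:
        return "highlight"       # same colour direction, brighter
    if SHADOW_RANGE[1] < ratio < HIGHLIGHT_RANGE[0]:
        return "background"
    return "foreground"          # too dark or too bright to be a cast shadow or highlight

In practice such a test would only be applied to pixels that the colour and gradient background models have already flagged as non-background.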
An incremental similarity-based aspect-graph (ISAG) database construction method is proposed for building the 3D object database from 2D views. A similarity-based aspect graph, which contains a set of aspects and a characteristic view for each aspect, is employed to represent each 3D object in the database. The incremental construction method, which maximizes the similarity of views within the same aspect and minimizes the similarity between characteristic views, forms the core of the framework. To imitate human cognition, 2D views randomly sampled from a viewing sphere are used to build and update the 3D object database. The effectiveness of the BSHSR is demonstrated through experiments on several video clips collected in a complex indoor environment, and the BSHSR is applied in the proposed framework to extract foreground objects from 2D views. The proposed framework is evaluated on three 3D object recognition problems: rigid object recognition, human posture recognition, and scene recognition. Shape and color features are employed in the different applications to demonstrate the effectiveness of the proposed method.
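The ISAG groups 2D views into aspects incrementally: a newly sampled view either joins the aspect whose characteristic view it most resembles or starts a new aspect of its own. The sketch below illustrates that grouping step only; the cosine similarity, the fixed threshold, and the mean-feature characteristic view are placeholders, not the dissertation's shape/colour similarity measures or its similarity maximization/minimization criterion:

from dataclasses import dataclass, field

import numpy as np

SIM_THRESHOLD = 0.8  # minimum similarity to join an existing aspect (assumed)

def similarity(a, b):
    """Placeholder similarity between two view feature vectors (cosine)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

@dataclass
class Aspect:
    views: list = field(default_factory=list)

    @property
    def characteristic_view(self):
        # The mean feature vector stands in for the characteristic view here.
        return np.mean(self.views, axis=0)

def add_view(aspects, view):
    """Insert one newly sampled 2D view (feature vector) into an object's aspect set."""
    if aspects:
        sims = [similarity(view, a.characteristic_view) for a in aspects]
        best = int(np.argmax(sims))
        if sims[best] >= SIM_THRESHOLD:
            # Similar enough to an existing aspect: absorb the view and
            # refresh that aspect's characteristic view.
            aspects[best].views.append(view)
            return aspects
    # Otherwise the view starts a new aspect.
    aspects.append(Aspect(views=[view]))
    return aspects

# Usage: feed randomly sampled views one at a time, as in the incremental setting.
# aspects = []
# for v in sampled_view_features:
#     aspects = add_view(aspects, v)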
Chapter 1 Introduction 1
1.1 Overview of 3D Object Recognition 1
1.1.1 3D Object Recognition 1
1.1.2 Human Posture Recognition 2
1.1.3 Scene Recognition 3
1.2 Overview of Background Subtraction 4
1.3 Outline of Proposed System 6
1.3.1 Background Subtraction 6
1.3.2 3D Object Recognition 7
1.4 Contribution of this Dissertation 8
1.5 Dissertation Organization 9
Chapter 2 Background Subtraction 10
2.1 Introduction 10
2.2 System Architecture 11
2.3 Background Modeling 12
2.3.1 Color-Based Background Modeling 12
2.3.2 Model Maintenance of the LTCBM and STCBM 15
2.3.3 Gradient-Based Background Modeling 20
2.4 Background Subtraction with Shadow Removal 23
2.4.1 Shadow and Highlight Removal 23
2.4.2 Background Subtraction 27
Chapter 3 Incremental Similarity-Based Aspect-Graph 3D Object Recognition 29
3.1 Introduction 29
3.2 System Architecture 32
3.3 Object Representation 34
3.3.1 Shape Features 34
3.3.2 Color Features 36
3.3.3 Similarity Functions 38
3.3.4 Similarity Measures 38
3.4 Flexible 3D Object Recognition Framework 39
3.4.1 Generation of Aspects and Characteristic Views 40
3.4.2 Object Recognition using 2D Characteristic Views 43
3.4.3 Applications 44
Chapter 4 Experimental Results 47
4.1 BSHSR 47
4.1.1 Local Illumination Changes 48
4.1.2 Global Illumination Changes 53
4.1.3 Foreground Detection 55
4.1.4 Dynamic Background 55
4.1.5 Short-Term Color-based Background Model (STCBM) 58
4.2 3D Object Recognition 59
4.2.1 Rigid Object Recognition 63
4.2.2 Human Posture Recognition 67
4.2.3 Scene Recognition 69
Chapter 5 Conclusions and Future Research 74
5.1 Conclusions 74
5.2 Future Research 77
References 79