



Author: Kuan-Hung Lin
Thesis title: Boosting Object Retrieval by Estimating Pseudo-Objects
Advisor: Winston H. Hsu
Keywords: image retrieval; object retrieval; pseudo-object; visual word; local feature
State-of-the-art object retrieval systems are mostly based on the bag-of-visual-words representation, which encodes the local appearance information of an image in a feature vector. A search is performed by comparing the query object's feature vector with those of the database images. However, a database image vector generally carries mixed information from the entire image, which may contain multiple objects and background. Search quality is degraded by such noisy (or diluted) feature vectors. We address this issue by introducing the concept of pseudo-objects to approximate candidate objects in database images. A pseudo-object is a subset of proximate feature points in an image, with its own feature vector representing a local area. We investigate effective methods (e.g., grid, G-means, and GMM-BIC) for estimating pseudo-objects. In experiments on two consumer photo benchmarks, we demonstrate that the proposed method significantly outperforms other state-of-the-art object retrieval algorithms.
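The dilution effect described above can be illustrated with a small sketch. It uses the grid variant of pseudo-objects: feature points are grouped by grid cell, and each non-empty cell gets its own bag-of-visual-words vector. The function names, the toy data, the 2x2 grid, and in particular the scoring rule (best cosine similarity over all pseudo-objects) are assumptions made for illustration; the abstract does not state how the thesis scores images.

```python
import math
from collections import Counter

def bow_vector(words, vocab_size):
    """Dense histogram of visual-word IDs."""
    counts = Counter(words)
    return [counts.get(w, 0) for w in range(vocab_size)]

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def grid_pseudo_objects(points, width, height, rows, cols, vocab_size):
    """Group feature points (x, y, word_id) by grid cell.

    Each non-empty cell becomes one pseudo-object with its own vector.
    """
    cells = {}
    for x, y, word in points:
        r = min(int(y * rows / height), rows - 1)
        c = min(int(x * cols / width), cols - 1)
        cells.setdefault((r, c), []).append(word)
    return [bow_vector(ws, vocab_size) for ws in cells.values()]

def score_image(query_vec, points, width, height, rows, cols, vocab_size):
    """Assumed scoring rule: best match over the image's pseudo-objects."""
    pseudo = grid_pseudo_objects(points, width, height, rows, cols, vocab_size)
    return max((cosine(query_vec, p) for p in pseudo), default=0.0)

# Hypothetical 100x100 image: the queried object sits in the top-left,
# unrelated background features sit in the bottom-right.
points = [(10, 10, 0), (20, 15, 0), (15, 30, 1),
          (80, 80, 2), (90, 70, 3), (70, 90, 2), (85, 85, 3)]
query = [2, 1, 0, 0]

# Whole-image vector mixes object and background words.
whole = cosine(query, bow_vector([w for _, _, w in points], 4))
# Pseudo-object score considers each local area separately.
local = score_image(query, points, 100, 100, rows=2, cols=2, vocab_size=4)
print(whole, local)
```

In this toy case the whole-image vector is diluted by background words and scores lower than the pseudo-object that covers only the queried object. A fixed grid is the simplest choice; the G-means and GMM-BIC methods named in the abstract instead cluster the feature points, and are not shown here.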
Abstract (in Chinese)
Abstract
Chapter 1 Introduction
1.1 Vector Space Model for Image Retrieval
1.2 Object Retrieval
1.3 Bag-of-Words Representation
1.4 Spatial Pyramid Matching
1.5 Limitations of Prior Works
Chapter 2 Pseudo-Objects
2.1 The Grid Method
2.2 The G-means Method
2.3 The Gaussian Mixture Model with Bayesian Information Criterion Method
2.4 Image Scoring Based on Pseudo-Objects
Chapter 3 Evaluation
3.1 Benchmarks
3.2 Implementation
3.3 Results and Discussion
Chapter 4 Conclusions and Future Work
Bibliography