National Digital Library of Theses and Dissertations in Taiwan

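Graduate student: 趙廷軒
Graduate student (English): Ting-Hsuan Chao
Thesis title: 大規模物件偵測利用正規化稀疏編碼
Thesis title (English): Scalable Object Detection by Filter Compression with Regularized Sparse Coding
Advisor: 徐宏民
Advisor (English): Winston Hsu
Committee members: 陳文進, 李宏毅, 孫民
Committee members (English): Wen-Chin Chen, Hung-Yi Lee, Min Sun
Oral defense date: 2015-07-21
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 21
Keywords (translated from Chinese): scalable object detection, sparse coding
Keywords (English): Scalable Object Detection, Sparse Coding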
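Abstract (translated from the Chinese): In practical applications, an object detection system must be able to detect a large number of object classes to meet user needs. Many successful object detection systems use part-based models, training a separate part-based model (classifier) for each object class to achieve multiclass detection. However, the computational complexity of these methods is proportional to the number of object classes, which leads to very long running times. To address this, some studies learn a codebook so that computation can be carried out directly on the codebook and the complexity is no longer proportional to the number of classes. These studies, however, do not take into account a property of the classifiers: the classifiers are in fact support vector machine weights, yet methods suited to visual signals are applied to them, causing a large loss of accuracy when a high speedup is required. To solve this problem, we develop a new method, named Regularized Sparse Coding, designed to reconstruct the function of the classifiers. In other words, the method reconstructs the classifiers' ability to produce accurate classification scores. Our method reconstructs classifiers by minimizing score error, whereas ordinary sparse coding reconstructs them by minimizing appearance error; this difference in strategy lets our method lose very little accuracy under a high speedup requirement. On the ILSVRC2013 dataset, which has 200 object classes, we achieve a 16-fold speedup on a single CPU using only 1.25% of the memory, losing only 0.04 mean average precision (compared with the original Deformable Part Model). In addition, the method can be run in parallel on GPUs to achieve even greater speedup.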
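
The contrast drawn above between appearance error and score error can be written as a pair of objectives. This record does not give the thesis's actual formulation, so the following is only a sketch under assumed notation: w_i is the filter (classifier weight vector) of index i, D is the shared codebook, \alpha_i is the sparse code of filter i, X is a matrix whose columns are feature vectors, and \lambda is a sparsity weight. None of these symbols comes from the record itself. The first line is the standard sparse coding objective; the second is one plausible way to penalize score differences instead, and may differ from what the thesis actually optimizes.

```latex
% Standard sparse coding: the reconstruction D\alpha_i should look like w_i.
\min_{D,\,\alpha_i}\ \sum_{i} \lVert w_i - D\alpha_i \rVert_2^2 + \lambda \lVert \alpha_i \rVert_1

% Score-oriented variant (assumed form): the reconstruction should score
% the features in X the same way w_i does.
\min_{D,\,\alpha_i}\ \sum_{i} \lVert X^{\top} w_i - X^{\top} D\alpha_i \rVert_2^2 + \lambda \lVert \alpha_i \rVert_1
```

Under this reading, two reconstructions with the same appearance error can differ in score error, because errors along directions that features rarely occupy matter less to the scores.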

For practical applications, an object detection system must handle a huge number of classes to meet real-world needs. Many successful object detection systems use a part-based model, which trains several filters (classifiers) for each class to perform multiclass object detection. However, the computational complexity of these methods is linear in the number of classes, which can lead to very long computing times. To solve this problem, some works learn a codebook for the filters and conduct operations only on the codebook, making the computational complexity sublinear in the number of classes. These studies, however, failed to consider the characteristics of the filters, for example that filters are weights trained by a Support Vector Machine; instead, they applied methods such as sparse coding that were designed for visual signals. This misuse results in a huge accuracy loss when a large speedup is required. To remedy this shortcoming, we have developed a new method called Regularized Sparse Coding, which is designed to reconstruct filter functionality. That is, it reconstructs the ability of a filter to produce accurate scores for classification. Our method reconstructs filters by minimizing score map error, whereas sparse coding reconstructs filters by minimizing appearance error. This difference in optimization strategy allows our method to incur only a small accuracy loss when a large speedup is achieved. On the ILSVRC 2013 dataset, which has 200 classes, this work achieves a 16 times speedup using only 1.25% of the memory on a single CPU, with a 0.04 mAP drop compared with the original Deformable Part Model. Moreover, our method can also be run in parallel on GPUs to achieve further speedup.
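
The codebook idea described in the abstract can be illustrated with a toy sketch. It assumes each class filter is written as a sparse combination of shared codebook atoms, so the expensive feature-by-atom responses are computed once and then reused for every class. All names and numbers here (`features`, `codebook`, `coeffs`, the matrix sizes) are hypothetical and chosen for illustration; this is not the thesis's implementation, and plain dot products stand in for the filter convolutions a real detector would use.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical toy data: 3 feature windows of dimension 4.
features = [
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 3.0],
    [2.0, 2.0, 0.0, 1.0],
]

# Shared codebook with 2 atoms (shape: dimension x atoms).
codebook = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 2.0],
]

# Codes for 3 class filters (shape: atoms x classes); each column
# here uses a single atom, so the codes are sparse.
coeffs = [
    [1.0, 0.0, 0.5],
    [0.0, 2.0, 0.0],
]

# Direct route: rebuild every class filter, then score every window
# against every filter. The work grows with the number of classes.
filters = matmul(codebook, coeffs)          # dimension x classes
direct_scores = matmul(features, filters)   # windows x classes

# Shared route: score every window against the atoms once, then mix
# the atom responses with each class's sparse code.
atom_responses = matmul(features, codebook)      # windows x atoms
shared_scores = matmul(atom_responses, coeffs)   # windows x classes

print(shared_scores)
```

Both routes give the same scores because matrix multiplication is associative. The saving comes from the second route: the feature-by-atom products do not depend on the number of classes, and each class adds only a few multiply-adds per window when its code is sparse.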

Acknowledgements (in Chinese)
Acknowledgements
Abstract (in Chinese)
Abstract
1 Introduction
2 Related Work
2.1 Proposal Extraction
2.2 Object Classification
3 Technical Details
3.1 Sparse Coding
3.2 Regularized Sparse Coding
4 Experiment Results
4.1 Datasets and Implementation
4.2 Performance Analysis
4.3 Scalability
5 Conclusion
Bibliography

