



Author: Meng-Luen Wu
Thesis title: On Professional Contemporary Style Photographing Instruction Based on Neural Tree Based Classifiers Applied to Image Aesthetics Assessment
Advisor: Chin-Shyurng Fahn
Oral defense committee: Chu-Song Chen, Zen-Chung Shih, Tong-Yee Lee, Jung-Hua Wang, Huei-Wen Ferng, Jen-Wei Hsieh, Chin-Shyurng Fahn
Keywords: computational aesthetics, data mining, decision tree, random forest, artificial neural networks
In this dissertation, we study how to use artificial intelligence and data mining technologies to enable computers to perceive the abstract concept of beauty, and we design a photographing instruction system accordingly. For analysis, we collect contemporary-style images captured in recent years and shared on social networks. The instruction system has two parts: image characteristics and image composition. Image characteristics refer to color and texture, while image composition refers to the structure of an image.
Our proposed photographing instructor is composed of tree-based classifiers and artificial neural networks, which together form a random forest that predicts whether an image meets the criteria of the contemporary style. Binary decision trees are built for photographing instruction. However, a decision tree suffers from the axis-aligned problem: each node tests a single feature against a threshold, which limits accuracy. Therefore, we combine the decision tree with neural networks and use subsets to build multiple random trees that form a random forest, which improves the accuracy. We also describe the limitations of the instruction system. The system gives users semantic sentences for enhancing image characteristics, and uses blocks to indicate which regions should be improved for image composition.
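The axis-aligned limitation mentioned above can be illustrated with a small sketch. It is not the dissertation's implementation: the two-feature toy data, the diagonal class boundary, the function names, and the use of a single perceptron as the "neural" node test are all assumptions chosen for illustration. The sketch compares the best single-feature threshold test with one oblique test learned by a neuron.

```python
import random


def make_data(n=200, margin=0.05, seed=0):
    """Sample points in the unit square, labelled 1 when x0 + x1 > 1.

    Points closer than `margin` to the diagonal are skipped so that the two
    classes are cleanly separable by an oblique line (toy assumption).
    """
    rng = random.Random(seed)
    pts, labels = [], []
    while len(pts) < n:
        x0, x1 = rng.random(), rng.random()
        if abs(x0 + x1 - 1.0) < margin:
            continue
        pts.append((x0, x1))
        labels.append(1 if x0 + x1 > 1.0 else 0)
    return pts, labels


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def best_axis_aligned_split(pts, labels):
    """Try every single-feature threshold test `x[f] > t` and keep the best."""
    best_acc, best_f, best_t = 0.0, 0, 0.0
    for f in (0, 1):
        for t in sorted({p[f] for p in pts}):
            acc = accuracy([1 if p[f] > t else 0 for p in pts], labels)
            if acc > best_acc:
                best_acc, best_f, best_t = acc, f, t
    return best_acc, best_f, best_t


def train_neuron_split(pts, labels, max_epochs=5000):
    """Perceptron rule for an oblique test `w0*x0 + w1*x1 + b > 0`."""
    w0 = w1 = b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x0, x1), y in zip(pts, labels):
            pred = 1 if w0 * x0 + w1 * x1 + b > 0 else 0
            if pred != y:
                step = y - pred  # +1 or -1
                w0 += step * x0
                w1 += step * x1
                b += step
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w0, w1, b


def compare_splits():
    """Return (axis-aligned accuracy, oblique accuracy) on the toy data."""
    pts, labels = make_data()
    axis_acc, _, _ = best_axis_aligned_split(pts, labels)
    w0, w1, b = train_neuron_split(pts, labels)
    neuron_preds = [1 if w0 * x0 + w1 * x1 + b > 0 else 0 for x0, x1 in pts]
    return axis_acc, accuracy(neuron_preds, labels)


if __name__ == "__main__":
    axis_acc, neuron_acc = compare_splits()
    print(f"best axis-aligned split accuracy: {axis_acc:.3f}")
    print(f"single-neuron oblique split accuracy: {neuron_acc:.3f}")
```

On this toy data no single threshold on one feature can follow the diagonal boundary, whereas one neuron can separate the classes in a single node. A plain decision tree would need many stacked axis-aligned tests to approximate the same boundary, which is one plausible reading of why a neural test inside each node can help.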
In the experiments, we predict whether an image is favorable. Using image characteristics and composition features separately, the system achieved 85% accuracy. Combining the two types of features raised the accuracy above 91%. In addition, the proposed instruction system is able to give correct suggestions. After the suggestions from the system were applied, the colors were more harmonized, the compositions were more balanced, and the main subjects were enhanced.
Advisor's Recommendation Letter
Oral Defense Committee Approval Certificate
Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Overview
1.1.1 Image Characteristics
1.1.2 Image Composition
1.2 Motivation
1.3 Challenges of Image Aesthetics
1.4 The Aims and Goals of the Dissertation
1.5 System Description
1.6 Dissertation Organization
Chapter 2 Literature Review
2.1 Assessment by Image Characteristics
2.2 Assessment by Tags
2.3 Photo Ranking Systems
2.4 Interestingness
2.5 Image Aesthetical Composition
2.5.1 Image Aesthetical Composition Classification
2.5.2 Image Aesthetical Composition Optimization
2.6 Robotic Photographer
2.7 Convolution Neural Networks
2.8 Automatic Photographing Instructions
Chapter 3 Image Characteristics Features
3.1 Color Space Conversion
3.1.1 sRGB to CIEXYZ
3.1.2 RGB to sRGB
3.1.3 CIEXYZ to CIELab
3.2 Color Components
3.2.1 Average Color Extraction
3.2.2 Representative Color Extraction
3.2.3 Degree of Achromatic
3.3 Degree of Brightness
3.4 Degree of Contrast
3.4.1 Achromatic Contrast
3.4.2 Color Contrast
3.5 Saturation
3.6 Sharpness
3.6.1 Maximum Sharpness
3.6.2 Sharpness Blur Ratio
3.7 Overexposure and Underexposure
3.8 Face Detection
3.8.1 Haar-like features
3.8.2 AdaBoost
3.9 Feature Analysis
3.9.1 Histogram Analysis
3.9.2 Feature Selection
3.10 Score Prediction Using Characteristic Features
3.11 Summary
Chapter 4 Image Composition
4.1 Importance of Image Aesthetical Composition
4.2 Common Types of Image Composition
4.2.1 Central Composition
4.2.2 Diagonal Composition
4.2.3 Horizontal Composition
4.2.4 Perspective Composition
4.2.5 Symmetrical Composition
4.2.6 Rule-of-thirds Composition
4.2.7 Vertical Composition
4.3 Saliency Map Generation
4.4 Prominent Lines
4.5 Composition Recognition Using Defined Rules
4.5.1 The Ratio of Horizontal Lines
4.5.2 The Ratio of Vertical Lines
4.5.3 The Ratio of Diagonal Lines
4.5.4 The Ratio of Diagonal Lines
4.6 Salient Region Based Rules
4.6.1 Nearest Distance between the Center Point of the Photo and the Centroid of the Maximum Saliency Region
4.6.2 Nearest Distance between Each of Power Point of the Photo and the Centroid of the Saliency Map
4.6.3 Feature Matching for Symmetry Detection
4.6.4 Local Area Relation Rules
4.6.5 Horizontal Linearity
4.6.6 Vertical Linearity
4.7 Image Pyramid Analysis
4.8 Image Composition Classification
4.9 Image Score Prediction Using Composition
4.9.1 Histogram Analysis
4.9.2 Prediction Using Compositions as Components
4.10 Summary
Chapter 5 Aesthetics Score Prediction
5.1 Data Preprocessing
5.1.1 Feature Selection
5.1.2 Feature Extraction
5.2 Clustering
5.3 Multilayer Perceptron
5.4 Decision Trees
5.5 Neural Decision Trees
5.5.1 Shortcoming of Decision Trees
5.5.2 Principal Component Analysis for Decision Trees
5.5.3 Support Vector Machines in Decision Tree
5.5.4 Artificial Neural Networks in Decision Tree
5.6 Random Forest
5.6.1 Neural Networks in Random Forest
5.7 Summary
Chapter 6 Photographing Instructions
6.1 Basic Instruction Mechanism
6.2 Limitations for Improvement Suggestion
6.3 Independency between Image Aesthetical Features
6.4 Possible Instructions
6.4.1 Gamma correction
6.4.2 Gaussian blur
6.4.3 Sharpness
6.4.4 Contrast
6.4.5 Other enhancement features
6.5 Approximation method
Chapter 7 Experimental Results
7.1 Crowd-sourcing from DPChallenge
7.2 Experimental Setup
7.3 Results in Image Aesthetics Score Prediction
7.4 Results in Image Composition Recognition
7.5 Results in Composition Suggestions
7.5.1 Image Style Suggestions
7.5.2 Image Composition Suggestions
Chapter 8 Conclusion and Future Works
8.1 Conclusion
8.2 Future Work
References
Electronic full text (public Internet release date: 2027-07-21)