



Graduate student: Zheng, Jun-Hong
Thesis title: Entropy-driven Optimization of Recommendation Systems through Categorical Feature Engineering
Advisors: Chou, Pei-Ting; Chang, Yu-Wei
Oral defense committee: Leong, Yin-Yee
Keywords: Categorical variable; Feature selection; Conditional entropy; Recommendation system; Machine learning
Feature selection plays a crucial role in machine learning because it can improve both the accuracy and the efficiency of models. Conditional entropy is an information-theoretic measure of how much uncertainty about one variable remains once another is known; it can therefore be used to evaluate the relevance of features and to identify those closely related to the target variable. This study explores conditional entropy as a feature selection method for datasets with a large number of categorical variables. Taking the KKbox music dataset as an example, we assess how the feature set selected through conditional entropy among categorical variables affects model performance. Our experimental results show that we obtained a model with fewer features that still maintains good performance. This demonstrates that conditional entropy can serve as an effective feature selection method: it helps discover features closely related to user listening behavior, thereby simplifying large datasets and improving the model's computational efficiency.
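As a rough illustration of the idea in the abstract, the sketch below estimates the conditional entropy H(Y | X) of a target given a single categorical feature and ranks features by it (a lower value means the feature leaves less uncertainty about the target). This is plain conditional entropy only, not the averaged interaction measure named in the thesis outline; the function names, the column names `source_tab` and `language`, and the toy data are all hypothetical and are not taken from the thesis or the KKbox dataset.

```python
from collections import Counter
from math import log2


def conditional_entropy(feature, target):
    """Estimate H(target | feature) in bits from two equal-length sequences."""
    if len(feature) != len(target):
        raise ValueError("feature and target must have the same length")
    n = len(feature)
    joint = Counter(zip(feature, target))  # counts of (x, y) pairs
    marginal = Counter(feature)            # counts of x alone
    # H(Y|X) = -sum over (x, y) of p(x, y) * log2(p(x, y) / p(x))
    return -sum((c / n) * log2(c / marginal[x]) for (x, _), c in joint.items())


def rank_features(columns, target):
    """Return (name, H(target | feature)) pairs, most informative first."""
    scores = {name: conditional_entropy(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Hypothetical toy data: one feature determines the target, the other does not.
columns = {
    "source_tab": ["my_library", "my_library", "discover", "discover"],
    "language": ["zh", "en", "zh", "en"],
}
target = [1, 1, 0, 0]

ranking = rank_features(columns, target)
for name, score in ranking:
    print(f"{name}: {abs(score):.3f} bits")
```

In this toy case the first feature fully determines the target and the second is independent of it, so a selection rule that keeps low-entropy features would retain the first and discard the second. How the thesis sets its threshold or combines features is not stated in this record.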
Chapter 1 Introduction

Chapter 2 Literature Review
Section 1 Feature Selection
Section 2 Conditional Entropy
Section 3 Music Recommendation System

Chapter 3 Methodology
Section 1 Average of Conditional Entropy Interaction
Section 2 Singular Value Decomposition
Section 3 LightGBM Model

Chapter 4 Empirical Analysis
Section 1 Data Description and Preprocessing
Section 2 Feature Engineering
Section 3 Model Training and Evaluation Result

Chapter 5 Conclusion and Future Improvement
Section 1 Conclusion
Section 2 Future Improvement

References
Electronic full text (public release date on the Internet: 2029-06-19)