[1] Wu, Xindong, et al. "Data mining with big data." IEEE Transactions on Knowledge and Data Engineering 26.1 (2014): 97-107.
[2] Labrinidis, Alexandros, and Hosagrahar V. Jagadish. "Challenges and opportunities with big data." Proceedings of the VLDB Endowment 5.12 (2012): 2032-2033.
[3] Habteselassie, Biruk. "Application of knowledge discovery in databases: automating manual tasks." (2016).
[4] Olvera-López, J. Arturo, et al. "A review of instance selection methods." Artificial Intelligence Review 34.2 (2010): 133-143.
[5] Tsai, Chih-Fong, Zong-Yao Chen, and Shih-Wen Ke. "Evolutionary instance selection for text classification." Journal of Systems and Software 90 (2014): 104-113.
[6] Buza, Krisztian, Alexandros Nanopoulos, and Lars Schmidt-Thieme. "Insight: efficient and effective instance selection for time-series classification." Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer Berlin Heidelberg, 2011.
[7] Stojanović, Miloš B., et al. "A methodology for training set instance selection using mutual information in time series prediction." Neurocomputing 141 (2014): 236-245.
[8] Gowda, K., and G. Krishna. "The condensed nearest neighbor rule using the concept of mutual nearest neighborhood." IEEE Transactions on Information Theory 25.4 (1979): 488-490.
[9] Ritter, G., et al. "An algorithm for a selective nearest neighbor decision rule." IEEE Transactions on Information Theory 21.6 (1975): 665-669.
[10] Wilson, Dennis L. "Asymptotic properties of nearest neighbor rules using edited data." IEEE Transactions on Systems, Man, and Cybernetics 2.3 (1972): 408-421.
[11] Grochowski, Marek. "Simple incremental instance selection wrapper for classification." International Conference on Artificial Intelligence and Soft Computing. Springer Berlin Heidelberg, 2012.
[12] Czarnowski, Ireneusz. "Cluster-based instance selection for machine classification." Knowledge and Information Systems 30.1 (2012): 113-133.
[13] Lumini, Alessandra, and Loris Nanni. "A clustering method for automatic biometric template selection." Pattern Recognition 39.3 (2006): 495-497.
[14] Caises, Yoel, et al. "SCIS: combining instance selection methods to increase their effectiveness over a wide range of domains." International Conference on Intelligent Data Engineering and Automated Learning. Springer Berlin Heidelberg, 2009.
[15] Raicharoen, Thanapant, and Chidchanok Lursinsap. "A divide-and-conquer approach to the pairwise opposite class-nearest neighbor (POC-NN) algorithm." Pattern Recognition Letters 26.10 (2005): 1554-1567.
[16] Olvera-López, J., J. Carrasco-Ochoa, and J. Martínez-Trinidad. "Prototype selection via prototype relevance." Progress in Pattern Recognition, Image Analysis and Applications (2008): 153-160.
[17] Yarowsky, David. "Unsupervised word sense disambiguation rivaling supervised methods." Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 1995.
[18] Guo, Yuanyuan, Harry Zhang, and Xiaobo Liu. "Instance selection in semi-supervised learning." Canadian Conference on Artificial Intelligence. Springer Berlin Heidelberg, 2011.
[19] Blum, Avrim, and Tom Mitchell. "Combining labeled and unlabeled data with co-training." Proceedings of the Eleventh Annual Conference on Computational Learning Theory. ACM, 1998.
[20] Nigam, Kamal, and Rayid Ghani. "Analyzing the effectiveness and applicability of co-training." Proceedings of the Ninth International Conference on Information and Knowledge Management. ACM, 2000.
[21] Zhou, Zhi-Hua, and Ming Li. "Tri-training: Exploiting unlabeled data using three classifiers." IEEE Transactions on Knowledge and Data Engineering 17.11 (2005): 1529-1541.
[22] Guo, Tao, and Guiyang Li. "Improved tri-training with unlabeled data." Software Engineering and Knowledge Engineering: Theory and Practice (2012): 139-147.
[23] Mucherino, Antonio, Petraq J. Papajorgji, and Panos M. Pardalos. "K-nearest neighbor classification." Data Mining in Agriculture (2009): 83-106.
[24] Liaw, Andy, and Matthew Wiener. "Classification and regression by randomForest." R News 2.3 (2002): 18-22.
[25] Rish, Irina. "An empirical study of the naive Bayes classifier." IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence. Vol. 3. No. 22. IBM New York, 2001.
[26] Furey, Terrence S., et al. "Support vector machine classification and validation of cancer tissue samples using microarray expression data." Bioinformatics 16.10 (2000): 906-914.
[27] Witten, Ian H., et al. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2016.
[28] Steinbach, Michael, George Karypis, and Vipin Kumar. "A comparison of document clustering techniques." KDD Workshop on Text Mining. Vol. 400. No. 1. 2000.
[29] Jain, Anil K. "Data clustering: 50 years beyond K-means." Pattern Recognition Letters 31.8 (2010): 651-666.
[30] Bouguettaya, Athman, et al. "Efficient agglomerative hierarchical clustering." Expert Systems with Applications 42.5 (2015): 2785-2797.
[31] Zhao, Ying, and George Karypis. "Evaluation of hierarchical clustering algorithms for document datasets." Proceedings of the Eleventh International Conference on Information and Knowledge Management. ACM, 2002.
[32] Silva, Catarina, and Bernardete Ribeiro. "The importance of stop word removal on recall values in text categorization." Neural Networks, 2003. Proceedings of the International Joint Conference on. Vol. 3. IEEE, 2003.
[33] Sadeghi, Mohammad, and Jesús Vegas. "Automatic identification of light stop words for Persian information retrieval systems." Journal of Information Science 40.4 (2014): 476-487.
[34] Munková, Daša, Michal Munk, and Martin Vozár. "Influence of stop-words removal on sequence patterns identification within comparable corpora." ICT Innovations 2013. Springer International Publishing, 2014. 67-76.
[35] Singh, Jasmeet, and Vishal Gupta. "Text Stemming: Approaches, Applications, and Challenges." ACM Computing Surveys (CSUR) 49.3 (2016): 45.
[36] Shang, Wenqian, et al. "A novel feature selection algorithm for text categorization." Expert Systems with Applications 33.1 (2007): 1-5.
[37] Rogati, Monica, and Yiming Yang. "High-performing feature selection for text classification." Proceedings of the Eleventh International Conference on Information and Knowledge Management. ACM, 2002.
[38] Yang, Yiming, and Jan O. Pedersen. "A comparative study on feature selection in text categorization." ICML. Vol. 97. 1997.