|
[1] http://www.alexa.com/topsites
[2] http://royal.pingdom.com/2010/02/10/twitter-now-more-than-1-billion-tweets-per-month/
[3] http://www.pearanalytics.com/blog/wp-content/uploads/2010/05/Twitter-Study-August-2009.pdf
[4] http://support.twitter.com/groups/31-twitter-basics/topics/114-guidelines-best-practices/articles/18311-the-twitter-rules
[5] N. Kushmerick, “Wrapper Induction: Efficiency and Expressiveness,” Artificial Intelligence, vol. 118, no. 1-2, pp. 15-68, 2000.
[6] C.-N. Hsu and M.-T. Dung, “Generating Finite-State Transducers for Semi-Structured Data Extraction from the Web,” Information Systems, vol. 23, no. 8, pp. 521-538, 1998.
[7] I. Muslea, S. Minton, and C.A. Knoblock, “Hierarchical Wrapper Induction for Semi-Structured Information Sources,” Autonomous Agents and Multi-Agent Systems, vol. 4, no. 1-2, pp. 93-114, 2001.
[8] A. Sahuguet and F. Azavant, “Building Intelligent Web Applications Using Lightweight Wrappers,” Data and Knowledge Eng., vol. 36, no. 3, pp. 283-316, 2001.
[9] L. Liu, C. Pu, and W. Han, “XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources,” Proc. Int’l Conf. Data Eng. (ICDE), pp. 611-621, 2000.
[10] D. Buttler, L. Liu, and C. Pu, “A Fully Automated Object Extraction System for the World Wide Web,” Proc. Int’l Conf. Distributed Computing Systems (ICDCS), pp. 361-370, 2001.
[11] V. Crescenzi, G. Mecca, and P. Merialdo, “RoadRunner: Towards Automatic Data Extraction from Large Web Sites,” Proc. Int’l Conf. Very Large Data Bases (VLDB), pp. 109-118, 2001.
[12] C.-H. Chang, C.-N. Hsu, and S.-C. Lui, “Automatic Information Extraction from Semi-Structured Web Pages by Pattern Discovery,” Decision Support Systems, vol. 35, no. 1, pp. 129-147, 2003.
[13] B. Liu, R.L. Grossman, and Y. Zhai, “Mining Data Records in Web Pages,” Proc. Int’l Conf. Knowledge Discovery and Data Mining (KDD), pp. 601-606, 2003.
[14] Y. Zhai and B. Liu, “Web Data Extraction Based on Partial Tree Alignment,” Proc. Int’l World Wide Web Conf. (WWW), pp. 76-85, 2005.
[15] http://zh.wikipedia.org/wiki/%E6%AD%A3%E8%A6%8F%E8%A1%A8%E5%BC%8F
[16] http://ebiquity.umbc.edu/resource/html/id/216/Spam-in-Blogs-and-Social-Media
[17] K. Lee, J. Caverlee, and S. Webb, “The Social Honeypot Project: Protecting Online Communities from Spammers,” Proc. 19th Int’l World Wide Web Conf. (WWW), Raleigh, pp. 1139-1140, April 2010.
[18] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, “Detecting Spammers on Twitter,” Proc. Seventh Annual Collaboration, Electronic Messaging, Anti-Abuse and Spam Conf. (CEAS), Redmond, Washington, US, July 13-14, 2010.
[19] A.H. Wang, “Don't Follow Me: Spam Detection in Twitter,” Proc. Int’l Conf. Security and Cryptography (SECRYPT), pp. 1-10, July 2010.
[20] S.-L. Chang, “Detection Microblog Spam using User Behavior and Content Analysis,” Master's thesis, National Taiwan University of Science and Technology, Taipei, Taiwan, 2010.
[21] http://search.twitter.com/api/
[22] J.B. Lovins, “Development of a Stemming Algorithm,” Mechanical Translation and Computational Linguistics, vol. 11, no. 1-2, pp. 22-31, June 1968.
[23] M.F. Porter, “An Algorithm for Suffix Stripping,” Program, vol. 14, no. 3, pp. 130-137, 1980.
[24] http://tartarus.org/~martin/PorterStemmer/
[25] “Term Weighting Approaches in Automatic Text Retrieval,” http://portal.acm.org/citation.cfm?id=866292
[26] Y. Yang and J.O. Pedersen, “A Comparative Study on Feature Selection in Text Categorization,” Proc. Fourteenth Int’l Conf. Machine Learning (ICML), pp. 412-420, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1997.
[27] J.R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
[28] http://www.csie.ntu.edu.tw/~cjlin/libsvm/
|