[1] R. Collobert and J. Weston, "A unified architecture for natural language processing: Deep neural networks with multitask learning," in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 160-167.
[2] A. Vaswani et al., "Attention is all you need," in Advances in Neural Information Processing Systems, vol. 30, 2017.
[3] A. Conneau et al., "Unsupervised cross-lingual representation learning at scale," arXiv preprint arXiv:1911.02116, 2019.
[4] Y. Cui, Z. Yang, and X. Yao, "Efficient and effective text encoding for Chinese LLaMA and Alpaca," arXiv preprint arXiv:2304.08177, 2023.
[5] P. Ennen et al., "Extending the pre-training of BLOOM for improved support of Traditional Chinese: Models, methods and results," arXiv preprint arXiv:2303.04715, 2023.
[6] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
[7] Y.-J. Lee et al., "TAIDE: Trustworthy AI Dialogue Engine (Preliminary Project for Promoting the Development of Trustworthy Generative AI)." https://taide.tw/index/about/project-overview (accessed).
[8] G. Salton, A. Wong, and C.-S. Yang, "A vector space model for automatic indexing," Communications of the ACM, vol. 18, no. 11, pp. 613-620, 1975.
[9] Z. S. Harris, "Distributional structure," Word, vol. 10, no. 2-3, pp. 146-162, 1954.
[10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
[11] J. Pennington, R. Socher, and C. D. Manning, "GloVe: Global vectors for word representation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532-1543.
[12] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273-297, 1995.
[13] L. Breiman, "Random forests," Machine Learning, vol. 45, pp. 5-32, 2001.
[14] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[15] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.
[16] J. R. Quinlan, "Induction of decision trees," Machine Learning, vol. 1, no. 1, pp. 81-106, 1986, doi: 10.1023/A:1022643204877.
[17] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, 1986.
[18] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[19] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut, "ALBERT: A lite BERT for self-supervised learning of language representations," arXiv preprint arXiv:1909.11942, 2019.
[20] J. Achiam et al., "GPT-4 technical report," arXiv preprint arXiv:2303.08774, 2023.
[21] M. Palatucci, D. Pomerleau, G. E. Hinton, and T. M. Mitchell, "Zero-shot learning with semantic output codes," in Advances in Neural Information Processing Systems, vol. 22, 2009.
[22] O. Vinyals, C. Blundell, T. Lillicrap, and D. Wierstra, "Matching networks for one shot learning," in Advances in Neural Information Processing Systems, vol. 29, 2016.
[23] J. Howard and S. Ruder, "Universal language model fine-tuning for text classification," arXiv preprint arXiv:1801.06146, 2018.
[24] G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," arXiv preprint arXiv:1503.02531, 2015.
[25] Y. LeCun, J. Denker, and S. Solla, "Optimal brain damage," in Advances in Neural Information Processing Systems, vol. 2, 1989.
[26] J. Wei et al., "Chain-of-thought prompting elicits reasoning in large language models," in Advances in Neural Information Processing Systems, vol. 35, 2022, pp. 24824-24837.
[27] S. Yao et al., "Tree of thoughts: Deliberate problem solving with large language models," in Advances in Neural Information Processing Systems, vol. 36, 2024.
[28] T. Wolf et al., "Transformers: State-of-the-art natural language processing," in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Q. Liu and D. Schlangen, Eds., 2020, pp. 38-45.
[29] M. Pontiki et al., "SemEval-2016 task 5: Aspect based sentiment analysis," in Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), 2016, pp. 19-30.
[30] L. Xu et al., "CLUE: A Chinese language understanding evaluation benchmark," arXiv preprint arXiv:2004.05986, 2020.