National Digital Library of Theses and Dissertations in Taiwan

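Graduate student: 王伯雅
Graduate student (English name): Po-Ya Angela Wang
Thesis title (Chinese): 詞彙穩定的秘密—對各語言學面向的質性與量化分析
Thesis title (English): Secrets of Lexical Conventionalization: A Quantitative and Qualitative Exploratory Analysis on Linguistic Factors
Advisor: 謝舒凱
Advisor (English name): Shu-Kai Hsieh
Committee members: 劉德馨, 高照明
Committee members (English names): Teh-Sin Liu, Zhao-Ming Gao
Oral defense date: 2015-07-22
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Linguistics
Discipline: Humanities
Field: Linguistics
Thesis type: Academic thesis
Year of publication: 2015
Graduation academic year: 103 (ROC calendar)
Language: English
Number of pages: 175
Keywords (Chinese): 詞彙穩定, 詞彙生命, 新詞, 詞彙擴散, 網路語言, 語言改變, 量化語言學, 語料庫, 字典學
Keywords (English): conventionalization, life cycle of words, neologism, diffusion, internet language, language change, quantitative linguistics, corpus, lexicology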
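Previous research on lexical items offers many insights, which fall into two broad areas: analysis within linguistic theory and applications in language processing. Theoretical analysis takes three main perspectives: historical linguistics, which studies the historical development of linguistic phenomena; lexical semantics, which studies the synchronic behavior of neologisms; and computational linguistics, which predicts the survival of words. All of these can be applied to lexicology, the design of language teaching materials, and the construction of resources needed for natural language processing. However, few related studies adopt quantitative and qualitative perspectives at the same time. Moreover, the target words selected in previous studies are limited in scope. Temporal information and a range of linguistic variables should also be brought into the discussion to gain a deeper understanding of what causes lexical conventionalization. The way the lexicon is organized through conceptual links and the mental lexicon that accumulates over time should both be taken into account. This thesis therefore approaches the topic from both quantitative and qualitative viewpoints, proposes three possible life stages of lexical items (diffusion, conventionalization, and inactivation), examines the issue through temporal information and six linguistic aspects (phonology, morphology, semantics, syntax, pragmatics, and sociolinguistics), and aims to apply the results to lexical prediction and resource construction.
On the quantitative side, linear regression models are used to examine the linguistic features that distinguish words from different points in time. Pragmatic factors significantly explain how stable in use words that existed before 1950 are, whereas whether words coined after 1950 are used stably depends on syntactic factors. This result suggests that the longer a word lives, the more it is tied to experiential and pragmatic knowledge, while for recently coined words the combinability of their syntactic structure is decisive for whether they come to be used stably. Newly diffusing words and words that have existed for centuries are very similar in stability of use, but a logistic regression model shows that number of syllables, number of near-synonyms, number of synonyms, activeness in comments, and loanword status successfully distinguish the two groups. On the other hand, in terms of linguistic characteristics, words coined after 1950 resemble recently diffusing words. Words coined after 1950 are therefore used as training data to build a prediction model for the future development of currently diffusing words. The results show that the number of distinct words co-occurring before and after the target word has significant predictive power, reaching an accuracy of 0.6335.
The qualitative analysis examines competition among synonyms: syntactic compatibility and the richness of a word's conceptual relations appear to be key to whether it wins out over other synonyms and comes into wide use. In addition, words created at different points in time differ in how active they are in posts and in comments. Unlike the other two groups, diffusing words are more active in comments, which suggests that they spread more easily in reply-oriented, speech-like styles and in interaction. These findings can be applied to adding words to language resources. Pragmatic stability, syntactic combinability, and semantics can serve as criteria for adding words. More widely used variant forms, words used more stably in expressing a meaning, and items lexicalized from the same concept are all included among the added words, which shows the coverage of the proposed criteria.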


Previous studies offer many insights into lexical items. They fall broadly into two areas: linguistic analysis and application. Linguistic analysis takes three main angles: studies of the historical development of linguistic phenomena in Historical Linguistics, investigations of the synchronic emergence of neologisms in Lexical Semantics, and prediction models of word survival in Computational Linguistics. All of these can be applied to selecting words for Lexicology, designing language teaching materials, and constructing resources for Natural Language Processing. However, a single work rarely combines quantitative and qualitative methods. Second, the generality of the target words included in previous studies needs reconsideration. Moreover, temporal information about lexical items and a range of linguistic aspects should be examined to better understand the factors that contribute to the conventionalization of a word. Both the organization of the mental lexicon by conceptual association and its accumulation over time should be considered. This thesis therefore conducts quantitative profiling and qualitative analysis and applies them to the construction of lexical resources. It proposes three life stages of lexical items (diffusion, conventionalization, and inactivation), includes target words from different points in time, and adopts linguistic variables from six linguistic aspects (phonology, morphology, semantics, syntax, pragmatics, and sociolinguistics).
In the quantitative profiling, linear regression models are built to distinguish words from different points in time. The results show that pragmatics best accounts for the behavior of words that existed before 1950, and syntax best captures words coined after 1950. This implies that words that live longer may be associated with rich experiential and pragmatic knowledge of use, whereas for recently coined words, syntactic compatibility plays an important role in determining their fluctuation in use. Diffused words are similar to words that have existed for centuries in their Revised Constant U. A logistic regression model shows that number of syllables, number of near-synonyms, number of synonyms, activeness in comments, and whether the word is borrowed from another language are statistically significant variables that distinguish diffused words from words that have existed for centuries. On the other hand, words coined after 1950 and diffused words are quite similar in their linguistic characteristics. A prediction model trained on words coined after 1950 is therefore built to forecast the potential life of diffused words. It shows that the number of types co-occurring before target words is statistically significant in prediction. With words from before 1950 and recently diffused words as test data, the accuracy of the model reaches 0.6335.
Qualitative analysis of competition among words from the same synset indicates that structural compatibility and the conceptual relations involved may be the key for one lexical item to win out over its synonyms. In addition, words from different points in time differ in how actively they are used in comments and posts on PTT. Diffused words are more active in comments, which implies that they are associated with a feedback-oriented oral style and diffuse through interaction. These findings can be applied to suggestions for lexicology. Pragmatic stability in use, syntactic compatibility, and number of senses are taken as criteria for expanding the inclusion of words. The updated inclusion covers popularly used variants, words with more stable semantic representation, and words lexicalized from the same conceptual experience, which indicates the inclusiveness of the proposed criteria.
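The classification step described above can be illustrated with a minimal sketch. It assumes a plain logistic regression fitted by batch gradient descent and scored by accuracy on held-out words. The feature columns, the data rows, and all function names are made up for illustration; the abstract does not state which software or fitting procedure the thesis used, so this is not the author's implementation.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(w, X, y):
    """Share of rows whose predicted label matches the true label."""
    return sum(predict(w, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Hypothetical rows: (number of syllables, number of synonyms,
# share of uses found in comments). Label 1 = recently diffused word,
# label 0 = word attested for centuries. None of these values come
# from the thesis.
train_X = [(2, 1, 0.8), (3, 0, 0.7), (2, 2, 0.9),
           (1, 5, 0.3), (2, 6, 0.2), (1, 4, 0.4)]
train_y = [1, 1, 1, 0, 0, 0]
test_X = [(3, 1, 0.6), (1, 7, 0.1)]
test_y = [1, 0]

weights = fit_logistic(train_X, train_y)
print("held-out accuracy:", accuracy(weights, test_X, test_y))
```

The reported accuracy of 0.6335 corresponds to the quantity returned by `accuracy` here: the share of test words whose predicted class matches the observed one.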


Acknowledgements
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
List of Appendices
Chapter 1. Introduction
1.1. Background
1.2. Purpose
1.3. Organization
Chapter 2. Literature review
2.1. Qualitative Discussion from Historical linguistics and Lexical Semantics
2.1.1. Historical Linguistics: Grammaticalization, Degrammaticalization, Lexicalization, and Exaptation
2.1.2. Lexical Semantics on Neology
2.2. Quantitative Analysis on “Life Cycle” of Lexical Items
2.2.1. Analysis on “Life Cycle” of Different Lexical Items
2.2.2. Quantitative Profiling on “Life” of Lexical Items
2.3. Applications
Chapter 3. Methodology
3.1. Scope of Study
3.1.1. Unit of Observation
3.1.2. Types of Target Words
3.1.3. Potential Limitation and Corresponding Compensation in Current Study
3.1.4. Proposed Life Stages
3.1.5. Operational Definitions on Predicted Value
3.2. Resource for Collecting Target words
3.2.1. Kim (2006) and Chang and Ahrens (2008)
3.2.2. Google Books Ngram Corpus (GBNC)
3.2.3. Web
3.2.4. Newspaper
3.2.5. Chinese Wordnet
3.3. Categorization on Target Lexical Items
3.3.1. Target Lexical Items for Understanding Diffusion
3.3.2. Target Verbs for Understanding Conventionalization
3.4. Proposed Linguistic Predictors for Understanding Stabilization
3.4.1. Phonology
3.4.2. Morphology
3.4.3. Syntax
3.4.4. Semantics
3.4.5. Sociolinguistics
3.4.6. Pragmatics
Chapter 4. Exploratory Analysis and Modeling
4.1. Revised Constant U in Three Types of Targets
4.2. Performance of Linguistic Factors in Target Words
4.3. Linguistic Regression Models for Three Sets of Words
4.3.1. Revised Constant U and Phonology
4.3.2. Revised Constant U and Morphology
4.3.3. Revised Constant U and Semantics
4.3.4. Revised Constant U and Syntax
4.3.5. Revised Constant U and Pragmatics
4.3.6. Revised Constant U and Sociolinguistics
4.3.7. Logistic Regression Model
4.4. Qualitative Analysis on Members of Synset
4.5. Application: Inclusion of Lexical Items for Lexicology
Chapter 5. General discussion and conclusion
5.1. Conclusion
5.2. Implication and future study
References
Appendices


Aitchison, & Lewis. (1996). The mental word web: Forging the links. Svartvik.
Aitchison, J. (2001). Language change: progress or decay? Cambridge University Press.
Aitchison, J. (2012). Words in the mind : an introduction to the mental lexicon. Chichester, West Sussex ; Malden, MA : Wiley-Blackwell.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, pp. 716-23.
Algeo, J. (1980). Where do all the new words come from. American Speech, 55(4), pp. 264-77.
Altmann, E.G., Whichard, Z.L., & Motter, A.E. (2013). Identifying Trends in Word Frequency Dynamics. Journal of Statistical Physics, 151(1-2), pp. 277-288.
Altmann, E.G., Pierrehumbert, J.B., & Motter, A.E. (2011). Niche as a determinant of word fate in online groups. PloS one, 6(5).
Baayen, R. H. (2009). Corpus linguistics in morphology: morphological productivity. In Corpus linguistics. An international handbook (pp. 900-19).
Barnhart. (2007). A calculus for new words. Dictionaries(28), pp. 132–138.
Barnhart, C. (1978). American lexicography, 1945–1973. American Speech, 53(2), pp. 83-140.
Bauer. (1983). English Word-formation. Cambridge University Press, Cambridge.
Betz, W. (1949). Deutsch und Lateinisch: Die Lehnbildungen der althochdeutschen Benediktinerregel. Bonn: Bouvier.
Boulanger, V. (1997). What Makes a Coinage Successful?: The Factors Influencing the Adoption of English New Words. University of Georgia.
Brekle, H. (1978). Reflections on the conditions for the coining and understanding of nominal compounds. In U. Wolfgang , & W. Meid (Ed.), Proceedings of the 12th International Congress of Linguists (pp. 68-77). Innsbruck: Universit