
臺灣博碩士論文加值系統

詳目顯示

研究生:蕭雅萍
研究生(外文):Ya-ping Hsiao
論文名稱:評分規準規範與國小六年級學童英語口語實作表現及自我評量之研究
論文名稱(外文):A Study of Scoring Criteria and Rubrics on Sixth Graders’ Performance Assessment of Spoken English and Self-assessment
指導教授:鄒慧瑛
學位類別:碩士
校院名稱:國立臺南大學
系所名稱:國民教育研究所
學門:教育學門
學類:綜合教育學類
論文種類:學術論文
論文出版年:2002
畢業學年度:90
語文別:中文
論文頁數:195
中文關鍵詞:自我評分、自我評量、英語口語、實作評量、評分規範、評分規準
外文關鍵詞:spoken English, scoring criteria, self-rating, scoring rubrics, performance assessment, self-assessment
相關次數:
  • 被引用:22
  • 點閱:963
  • 評分:
  • 下載:0
  • 書目收藏:6
本研究旨在「發展英語口語實作評量及其規準規範,並探討規準規範教學對學生實作評量表現與自我評量的影響」。研究中以台南縣某國小之六年級三個班級的111位學生為研究對象,研究設計採「後測不等組設計」進行規準規範的實驗教學,以研究者自編之「英語客觀式成就測驗」、「英語口語實作評量」及「自我評量問卷」為研究工具,並以GENOVA與SPSS套裝軟體進行統計資料分析。首先探討實作評量工具之心理計量特徵,再進一步了解不同實驗組別之實作評量表現,並探討不同實驗組別的自我評量情形與師生評分間的一致性,最後更深入比較有效與無效自評學生其實作表現的差異情形。主要結果如下所示:
一、英語口語實作評量為一具有良好信效度的測量工具
(一) 效度方面
以構念效度的內容層面來說,評量內容的適切性與題目的代表性由國小英語教師、學科專家及外籍老師進行審視,認為本研究的實作評量具有內容效度。以結構層面來說,規準與規範的擬定、各部份評分要素的決定亦經過學科專家及外籍老師的審查,認為規準與規範的內容能反映出評量題目的結構。
(二) 信度方面
以評分者及作業項目的兩面向交叉設計進行類推性研究,結果顯示類推性係數為.85,亦即在兩位評分者與四項作業項目的評量情境下,本實作評量在學生表現上的排序一致性達.85。
二、規準規範的教學對國小學童的英語口語實作表現並無顯著影響
對不同實驗處理組別的學生來說,結果顯示規準與規範的設立與教學,對國小學童的英語口語實作表現並無顯著影響,進一步分析三組學生在實作評量四部份題目的表現,亦發現三組學生在ABCD四部份的實作表現無顯著差異。
三、規準規範的教學可以幫助國小學童進行自我評量
在自我評量問卷方面,A、B兩組學生「認為規準與規範可以幫助進行自我評量」以及「使用規準與規範進行自我評量」的人數顯著高於否定者;在規準與規範使用情況方面,大部分學生對於規準與規範是持肯定的態度,半數以上的學生會注意到規準中強調的要素。
四、規準與規範的教學無法增加師生間評分的一致性
三組學生有效自評人數並沒有因為規準與規範的設立與教學而有顯著差異。整體而言,學生的自評與老師評分間的相關高達.92。
五、有效與無效自我評量者在實作評量上的表現沒有顯著差異
雖然有將近半數學生達到有效自評的標準,但有效自評者其實作表現並未優於無效自評者;自我評量低估者的實作表現並沒有顯著優於高估者。
The purpose of this study was to design a performance assessment of spoken English, together with scoring criteria and rubrics, for sixth-grade students. Specifically, the study investigated the effects of the scoring criteria and rubrics on students' performance on the spoken-English assessment and on their self-assessment. The 111 subjects were sampled from three classes of sixth graders in a Tainan County elementary school, each class forming one of the three groups of the study: group A received instruction in the scoring rubrics, group B received instruction in the scoring criteria, and group C served as the control group. An experiment with a posttest nonequivalent-groups design was conducted. All subjects were given an English achievement test, the performance assessment of spoken English, and a self-assessment questionnaire. The analysis proceeded in three phases: first, the psychometric characteristics of the performance assessment were examined; second, the degree of consensus between students' self-ratings and teachers' ratings was analyzed and the results were compared among the three groups; finally, the performance of hit and miss self-assessors was compared.
The main results were as follows:
1. The performance assessment of spoken English demonstrated good reliability and validity.
a. Two aspects of construct validity were examined. For the content aspect, elementary school English teachers, subject specialists, and native-speaking teachers reviewed the assessment and judged its coverage and the representativeness of its items to be appropriate. For the structural aspect, the same specialists and native-speaking teachers reviewed the development of the scoring criteria and rubrics and the scoring factors decided for each part, and judged that they reflected the structure of the assessment items.
b. The G study for the performance assessment used a fully crossed person × rater × task (p × r × t) design. The generalizability coefficient was .85, indicating that, with two raters and four tasks, the assessment rank-ordered students' performance with a high degree of consistency (the standard form of this coefficient is sketched after the abstract).
2. The instruction in criteria and rubrics had no significant effect on students' overall speaking performance, nor on their performance on any of the four parts (A-D) of the assessment.
3. The instruction in criteria and rubrics was useful for students' self-assessment. On the self-assessment questionnaire, significantly more students in groups A and B agreed than disagreed that the criteria and rubrics helped them assess their own performance and that they used the criteria and rubrics when doing so. Most students held positive attitudes toward the criteria or rubrics, and more than half were sensitive to the factors emphasized in the criteria.
4. The instruction in criteria and rubrics did not result in significantly more hit self-assessors among the three groups. Overall, the correlation between students' self-ratings and teachers' ratings was .92.
5. There was no significant difference in speaking performance between hit and miss self-assessors. Although more than half of the students were hit self-assessors, their performance was not better than that of miss self-assessors; likewise, the performance of underestimating self-assessors was not significantly better than that of overestimating self-assessors.
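Note on the generalizability coefficient in result 1b: what follows is a minimal sketch, in standard G-theory notation (e.g., Brennan, 1983), of how such a coefficient is defined for a fully crossed persons × raters × tasks (p × r × t) design; the thesis itself estimated the variance components with GENOVA, and the symbols below are not quoted from it.

\[
E\rho^{2}
  = \frac{\sigma^{2}_{p}}
         {\sigma^{2}_{p}
          + \dfrac{\sigma^{2}_{pr}}{n_{r}}
          + \dfrac{\sigma^{2}_{pt}}{n_{t}}
          + \dfrac{\sigma^{2}_{prt,e}}{n_{r}\, n_{t}}}
\]

Here \(\sigma^{2}_{p}\) is the variance component for students, \(\sigma^{2}_{pr}\) and \(\sigma^{2}_{pt}\) are the student × rater and student × task interaction components, \(\sigma^{2}_{prt,e}\) is the residual, and \(n_{r}\), \(n_{t}\) are the numbers of raters and tasks generalized over. Under the measurement condition described above (two raters, four tasks), the coefficient reported for this assessment is .85.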
第一章 緒論
第一節 研究動機 2
第二節 研究目的與待答問題 10
第三節 研究假設 11
第四節 名詞釋義 12
第五節 研究限制 14
第二章 文獻探討
第一節 實作評量 15
第二節 實作評量的設計 29
第三節 自我評量 42
第四節 外語能力的自我評量 50
第五節 外語口語測驗 55
第三章 研究方法
第一節 研究對象 66
第二節 研究設計 67
第三節 教學設計 70
第四節 研究工具 74
第五節 資料處理 98
第四章 結果與討論
第一節 實作評量的信效度分析 100
第二節 實作評量表現分析 110
第三節 學生自我評量之分析 121
第四節 老師評分與學生自評的一致性分析 124
第五節 有效與無效自我評量者之實作評量表現分析 126
第五章 結論與建議
第一節 研究結論 129
第二節 建議 132
參考書目
一、中文部分 137
二、英文部分 139
附錄
附錄一 國小英語成就測驗 151
附錄二 國小英語口語實作評量預試題目 163
附錄三 國小英語口語實作評量D部分預試題目 168
附錄四 國小英語口語實作評量題目 170
附錄五 國小英語口語實作評量題目(學生用題目卷) 177
附錄六 國小英語口語實作評量題目(學生自我評分卷) 182
附錄七 自我評量問卷 187
附錄八 國小英語口語實作評量題目(老師評分卷) 191

表次
表3-1 英語成就測驗之預試對象 66
表3-2 正式施測之研究對象資料 67
表3-3 本研究的實驗設計模式 70
表3-4 英語成就測驗作業類別摘要表 75
表3-5 英語成就測驗雙向細目表 76
表3-6 英語成就測驗之各題難度與鑑別度 78
表3-7 口語實作評量預試試題A、B、C部分單元題目計畫表 82
表3-8 「D、讀讀看」各題的單音節與兩個音節之字數 83
表3-9 「D、讀讀看」各題所需時間平均數 84
表3-10 學生在「D、讀讀看」各題所需時間之變異數分析摘要表 84
表3-11 「D、讀讀看」各題所需時間之事後比較摘要表 84
表3-12 「D、讀讀看」各題錯誤次數總和 85
表3-13 1、5、7、8題朗讀所需「時間」之變異數分析摘要表 85
表3-14 學生在A部分各題字數與得分的平均數 87
表3-15 學生在B部分各題得分的平均數 87
表3-16 學生在C部分各題得分的平均數 87
表3-17 學生在D部分各題得分的平均數 88
表3-18 口語實作評量正式試題A、B、C部分單元題目表 88
表3-19 英語口語實作評量評分規範 93
表4-1 實作評量類推性分析之變異成分 105
表4-2 決策性研究之變異數分析摘要表 109
表4-3 三組學生在成就測驗得分的描述統計 111
表4-4 三組學生實作評量得分平均之描述統計 111
表4-5 三組學生在實作評量得分平均的共變數分析摘要表 112
表4-6 兩組學生實作評量得分平均之描述統計量 113
表4-7 兩組學生在實作評量得分平均的共變數分析摘要表 113
表4-8 三組學生在A部分試題得分的描述統計 116
表4-9 三組學生A部分得分的共變數分析摘要表 116
表4-10 三組學生在B部分試題得分的描述統計 117
表4-11 三組學生B部分得分的共變數分析摘要表 117
表4-12 三組學生在C部分試題得分的描述統計 118
表4-13 三組學生C部分得分的共變數分析摘要表 119
表4-14 三組學生在D部分試題得分的描述統計 119
表4-15 三組學生D部分得分的共變數分析摘要表 120
表4-16 問題8與9的期望次數與觀察次數 122
表4-17 自我評量問卷答題情況之百分比 123
表4-18 規準要素注意情況 124
表4-19 自評×實驗處理交叉表 125
表4-20 有效與無效自評者實作評量表現之描述統計與t考驗摘要表 126
表4-21 評分差異的百分比 127
表4-22 高、低估學生在實作表現上的描述統計與t考驗摘要表 128
圖次
圖2-1 雙向溝通的說話情境 60
圖2-2 評量的溝通情境 61
圖2-3 口語晤談測驗中的角色 61
一、中文部分
宋文菊(民88)。國小學童在閱讀理解實作評量上的表現分析。國立台南師範學院國民教育研究所碩士論文(未出版)。
吳裕益、陳英豪(民87)。測驗與評量。高雄:復文圖書出版社。
佳音、翰林九年一貫策略聯盟(民90)。Cool English第一冊(教育部審定九年一貫暫行版)。台北:佳音事業股份有限公司。
柯啟瑤(民90)。自我評量和交互評量的意義與功能。翰林文教雜誌,18期,頁9-13。
桂宜芬、吳毓瑩(民86)。自然科實作評量的效度探討。台南師範學院教育測驗新近發展趨勢學術研討會論文集。
全民英語能力分級檢定測驗。台北市:財團法人語言訓練測驗中心。民91年6月12日,取自:http://www.lttc.ntu.edu.tw/main.htm
郭生玉(民87)。心理與教育測驗。台北:精華書局。
張敏雪(民87)。教室內的實作評量。教育資料文摘,41卷6期,70-73。
張敏雪(民87)。教室內的實作評量。教育資料與研究,20期,24-27。
曾惠敏(民87)。國小分數概念實作評量之發展及其相關研究。國立台南師範學院國民教育研究所碩士論文(未出版)。
單文經(民87)。評介二種多元評量:真實評量與實作評量。北縣教育,25期,頁46-52。
彭森明(民85)。實作評量(Performance Assessment)理論與實際。教育資料與研究,9期,44-48。
敦煌書局股份有限公司(民91)。認識Phone Pass-評分標準。Phone Pass英語口語能力檢定測驗網。民91年6月7日,取自:http://www.phonepass.com.tw/index.asp
蔡凌雯(民90)。全民英檢探究。敦煌英語教學雜誌,33期,頁26-28。
鄒慧英(民86)。實作型評量的品管議題—兼談檔案評量之應用。台南師範學院教育測驗新近發展趨勢學術研討會論文集。
鄒慧英(民87a)。實作評量的研發―以國小說話課為例。測驗與輔導,149期,頁3082-3087。
鄒慧英(民87b)。數學科實作評量的取樣變異性。國小教學評量的反省與前瞻專輯。
鄒慧英(民89)。國小寫作檔案評量應用之探討。國立台南師範學院初等教育學報,13期,141-181。
鄭楓琳(民89)。台南市國小英語教學實施現況與意見調查之研究。國立台南師範學院國民教育研究所碩士論文(未出版)。

盧雪梅(民87)。實作評量的應許、挑戰和難題。教育資料與研究,20期,頁1-5。
謝欽舜(民86)。暢談發音教學。台北:師德教育訓練顧問公司。
羅千純(民90)。全民英檢探究。敦煌英語教學雜誌,33期,頁22-25。
蘇義祥(民87)。實作評量的理論與啟示。測驗與輔導,149期,頁3099-3102。
二、英文部分
Airasian, P. W. (1997). Classroom assessment (3rd ed.). New York: McGraw-Hill.
Arter, J. (1990). Performance rubric evaluation form (Metarubric). Portland, OR: Northwest Regional Educational Laboratory.
Barnes, D. (1976). From communication to curriculum. Harmondsworth: Penguin.
Brennan, R. L. (1983). Elements of generalizability theory. Iowa City, IA: ACT.
Brennan, R. L., & Johnson, E. G. (1995). Generalizability of performance assessments. Educational Measurement: Issues and Practice, 14(4), 9-12.

Bond, I. (1995). Unintended consequences of performance assessment: Issues of bias and fairness. Educational Measurement: Issues and Practice, 14(4), 21-24.
Brown, A. L., & DeLoache, J. S. (1978). Skills, plans, and self-regulation. In R. Siegler (Ed.), Children’s thinking: What develops? Hillsdale, NJ: Lawrence Erlbaum.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2),81-105.
Cariaga-Lo, L. D., Richards, B. F., & Frye, A. W. (1992). Understanding learning and performance in context: A proposed model of self-assessment. (ERIC Document Reproduction Service No. ED 347 979)
Chamot, A. U., & O’Malley, J. M. (1994). The CALLA handbook: Implementing the cognitive language learning approach. Reading, MA: Addison Wesley.
Cole, M., & Scribner, S. (1975). Theorizing about socialization of cognition. Ethos, 3, 249-268.
Crehan, K. D. (2001). An investigation of the validity of scores on locally developed performance measures in a school assessment program. Educational and Psychological Measurement, 61(5), 841-848.
Definition of performance assessment. (2000). IL: Chicago Board of Education. Retrieved June 11, 2002, from the World Wide Web: http://intranet.cps.k12.il.us/Assessments/Ideas_and_Rubrics/Intro_Scoring/Definition_of_P1/definition_of_p1.html
Elliott, S. N. (1995). Creating meaningful performance assessment (Report No. EDO-EC-94-2). VA: ERIC Clearinghouse on Disabilities and Gifted Education. (ERIC Document Reproduction Service No. ED 381 985)
Flavell, J. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906-911.
Frederiksen, J. R., & Collins, A. (1989). A system’s approach to educational testing. Educational Researcher, 18(9), 27-32.
Fulcher, G. (1993). The construct validation of rating scales for oral tests in English as a foreign language. Unpublished doctoral dissertation, University of Lancaster, U.K.
Fulcher, G. (1996). Testing tasks: Issues in task design and the group oral. Language Testing, 13(1), 23-51.



Fulcher, G. (1997). The testing of speaking in a second language. In C. Clapham & D. Corson (Eds.), Encyclopedia of language and education: Vol. 7. Language testing and assessment (pp. 75-85). Netherlands: Kluwer Academic Publishers.
Gipps, C. (1995). Beyond testing: Towards a theory of educational assessment. London: The Falmer Press.
Glazer, S. (1999). Self-assessment. Teaching PreK-8, 30(2), 93-95.
Harris, M. (1997). Self-assessment of language learning in formal settings. ELT Journal, 51(1), 12-20.
Heaton, J. B. (1990). Writing English language tests (3rd ed.). New York: Longman.
Herman, J. L., Aschbacher, P. R., & Winters, L. (1990). Issues in developing alternative assessments. Paper presented at the annual meeting of the California Educational Research Association, Chicago.
Herman, J. L., Aschbacher, P. R., & Winters, L. (1992). A practical guide to alternative assessments. VA: Association for Supervision and Curriculum Development.


How to create a rubric from scratch. (2000). IL: Chicago Board of Education. Retrieved June 11, 2002, from the World Wide Web: http://intranet.cps.k12.il.us/Assessments/Ideas_and_Rubrics/Create_Rubric/create_rubric.html
IELTS-What are the tests like? (2002). Cambridge: The University of Cambridge Local Examinations Syndicate (UCLES). Retrieved June 12, 2002, from the World Wide Web: http://www.ielts.org/format.htm
Kane, M. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319-342.
Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(5), 5-17.
Kaulfers, W. V. (1944). War-time developments in modern language achievement tests. Modern Language Journal, 28, 136-150.
Khattri, N., & Sweet, D. (1996). Assessment reform: Promises and challenges. In M. B. Kane & R. Mitchell (Eds.), Implementing performance assessment: Promises, problems, and challenges (pp. 1-22). NJ: Lawrence Erlbaum Associates.
Klenowski, V. (1995). Student self-evaluation processes in student-centered teaching and learning contexts of Australia and England, Assessment in Education, 2, 145-163.
Lightbown, P. M., & Spada, N. (1999). How languages are learned (2nd ed.). Oxford: Oxford University Press.
Linn, R. L., Baker, E. L., & Dunbar, S. B. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher, 20(8), 15-21.
Mabry, L. (1999). Writing to rubric. Phi Delta Kappan, 80(9), 673-679.
McMillan, J. H. (1997). Classroom assessment: Principles and practice for effective instruction. Boston: Allyn and Bacon.
McNamara, T. (1996). Measuring second language performance. London: Longman.
McNamara, M. J., & Deane, D. (1997). Self-assessment activities: Toward language autonomy in language learning. TESOL Journal, 5(1), 17-21.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13-23.
Montgomery, K. (2000). Classroom rubrics: Systemizing what teachers do naturally. Clearing House, 73(6), 324-328.
Munby, S. (1989). Assessing and recording achievement. Hemel Hempstead: Prentice Hall.
Nitko, A. J. (1996). Educational assessment of students. New Jersey: Merrill.
North Carolina State Dept. of Public Instruction, Raleigh. Instructional Services. (2001). Oral language assessment in the foreign language class (Planning, conducting, managing). The positive dream. (ERIC Document Reproduction Service No. ED 454 738)
Nunan, D. (1988). The learner-centered curriculum. Cambridge: Cambridge University Press.
Oosterhof, A. (1994). Classroom application of educational measurement (2nd ed.). New York: Merril.
Opitz, M., & Glazer, S. M. (1995). Self-assessment and learning centers: Do they go together? Teaching Pre K-8, 25(4), 104-106.
Orsmond, P., & Merry, S. (1997). A study in self-assessment: Tutor and students’ perceptions of performance criteria. Assessment & Evaluation in Higher Education, 22(4), 357-369.
Orsmond, P., Merry, S., & Reiling, K. (2000). The use of student derived marking criteria in peer and self-assessment. Assessment & Evaluation in Higher Education, 25(1), 23-28.


Oscarson, M. (1997). Self-assessment of foreign and second language proficiency. In C. Clapham & D. Corson (Eds.), Encyclopedia of language and education, Volume 7: Language testing and assessment (pp. 175-187). Netherlands: Kluwer Academic Publishers.
Paris, S.G., & Ayres, L. R. (1994). Becoming reflective students and teachers with portfolios and authentic assessment. Washington: American Psychological Association.
Popham, W. J. (1999). Classroom assessment: What teachers need to know (2nd ed.). Needham Heights, MA: Allyn and Bacon.
Popham, W. J. (2000). Modern educational measurement: Practical guidelines for educational leaders (3rd ed.). Los Angeles: Allyn and Bacon.
Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. R. Gifford, & M. C. O’ Connor (Eds.), Changing assessments: Alternative view of aptitude, achievement, and instruction (pp. 37-75). Boston: Kluwer.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability Theory A Primer. CA: SAGE publications, Inc.
Shavelson, R. J., Baxter, G. P., & Gao, X. (1993). Sampling variability of performance assessments. Journal of Educational Measurement, 30(3), 215-232.
Shepard, L. A., Flexer, R. J., Hiebert, E. H., Marion, S. F., Mayfield, V., & Weston, T. J. (1996). Effects of Introducing Classroom Performance Assessments on Student Learning. Educational Measurement: Issues and Practice, 15(3), 7-18.
Shipman, M. (1983). Assessment in primary and middle schools. London: Croom Helm.
Skillings, M. J., & Ferrell, R. (2000). Student-generated rubrics: Bringing students into the assessment process. Reading Teacher, 53(6), 452-455.
Smith, C. (1997). Student self-assessment at St Bernard’s primary school. Primary Educator, 3(4), 7-9.
Sollenberger, H. E. (1978). Development and current use of the FSI oral interview test. In J. L. D. Clark (Ed.), Direct testing of speaking proficiency: Theory and application (pp. 1-12). NJ: Princeton.
Stanford, P. (2001). Authentic assessment for intervention. Intervention in School & Clinic, 36(3), 163-167.
Stiggins, R. J. (1987). Design and development of performance assessment. Educational Measurement: Issues and Practice, 6(3), 33-42.
Stiggins, R. J. (1994). Student-centered assessment. New York, NY: Macmillan.

Stiggins, R. J. (1997). Student-involved classroom assessment (3rd ed.). New Jersey: Merrill.
Stipek, D. J., & MacIver, D. (1989). Developmental change in children’s assessment of intellectual competence. Child Development, 60, 521-538.
Sullivan, K., & Hall, C. (1997). Introducing students to self-assessment. Assessment & Evaluation in Higher Education, 22(3), 289-305.
Swan, M., & Smith, B. (1988). Learner English. Cambridge: Cambridge University Press.
Testing for proficiency: The ACTFL Oral Proficiency Interview. (2002). NY: American Council on the Teaching of Foreign Languages. Retrieved June 11, 2002, from the World Wide Web: http://www.actfl.org/
Tombari, M. L., & Borich, G. D. (1999). Authentic assessment in the classroom: Application and practice. NJ: Merrill.
Towler, L., & Broadfoot, P. (1992). Self-assessment in the primary school. Educational Review, 44(2), 137-141.
Trice, A. D. (2000). A handbook of classroom assessment. NY: Longman.
Underhill, N. (1987). Testing spoken language. Cambridge: Cambridge University Press.

van Kraayenoord, C. E. (1993). Toward self-assessment of literacy learning. San Antonio, Texas: Keynote address to the International Reading Association.
Vygotsky, L. S. (1962). Thought and language. Cambridge, MA: MIT press.
Why scoring rubrics are important? (2000). Chicago Board of Education. Retrieved June 11, 2002, from the World Wide Web: http://intranet.cps.k12.il.us/Assessments/Ideas_and_Rubrics/Intro_Scoring/Rubric_Importance/rubric_importance.html
Wiggins, G. (1998). Educative assessment. San Francisco, CA: Jossey-Bass.
Wilson, K. M. (1996). Validity of global self-ratings of ESL speaking proficiency based on an FSI/ILR-referenced scale: An empirical Assessment. In collaboration with R. Lindsay, Educational Testing Service. NJ: Princeton.
Yancey, K. B. (1998). Reflection, self-assessment, and learning. Clearing House, 72(1), 13-17.
Young Learners - English Tests for Young Learners. (2002, June). Cambridge: Cambridge EFL on-line. Retrieved June 11, 2002, from the World Wide Web: http://www.Cambridge-efl.org.uk/exam/young/bg_yle.htm


Zimmerman, B. J. (1989). Models of self-regulated learning and academic achievement. In B. J. Zimmerman & D. Schunk (Eds.), Self-regulated learning and academic achievement: Theory, research, and practice. NY: Springer-Verlag.