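Author: 王暄博
Author (English): Hsuan-Po Wang
Thesis title: A Comparison of the Horizontal and Vertical Equating Effects of the BIB and NEAT Designs (BIB與NEAT設計之水平及垂直等化效果比較)
Advisor: 郭伯臣
Advisor (English): Bor-Chen Kuo
Degree: Master's
Institution: National Taichung University of Education (國立臺中教育大學)
Department: Graduate Institute of Educational Measurement and Statistics
Discipline: Education
Sub-discipline: Educational Testing and Assessment
Thesis type: Academic thesis
Publication year: 2006
Graduation academic year: 94 (ROC calendar)
Language: Chinese
Number of pages: 97
Keywords (translated from Chinese): horizontal equating; vertical equating; estimation accuracy index; balanced incomplete block design; non-equivalent groups with anchor test design
Keywords (English): horizontal equating; vertical equating; accuracy of estimate; balanced incomplete block design; non-equivalent groups with anchor test design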
Usage counts:
  • Cited by: 14
  • Views: 449
  • Downloads: 104
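The main purpose of this study is to examine how the balanced incomplete block design (BIB) and the non-equivalent groups with anchor test design (NEAT) perform in test equating, for both horizontally equated and vertically equated tests. The study is a simulation experiment based on the three-parameter logistic model of item response theory, and it examines how the number of examinees, the number of vertical anchor items, and the number of item blocks affect equating under the BIB and NEAT designs. The main variables controlled in the study are: (1) the number of examinees: 5,460, 7,500, and 10,000; (2) the number of vertical anchor items: 3, 6, and 9; (3) the number of item blocks: 7, 9, and 13. BILOG-MG was used for equating and parameter estimation.
The results show that:
1. Parameter estimation risk decreases as the number of examinees increases;
2. Parameter estimation risk decreases as the number of vertical anchor items increases;
3. In horizontally equated tests, the BIB design performs better than the NEAT design in item parameter estimation risk, while the NEAT design performs better than the BIB design in examinee ability estimation risk;
4. In vertically equated tests, the BIB design performs better than the NEAT design in estimation risk for the item discrimination parameter, while the NEAT design performs better than the BIB design in estimation risk for examinee ability, the item difficulty parameter, and the item guessing parameter.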
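To make the simulation setup concrete, the following is a minimal sketch of generating item responses under the three-parameter logistic model and scoring an estimate against the true values. It is an illustration only, not the thesis's procedure: the parameter distributions, the 20-item test length, the omission of the 1.7 scaling constant, and the use of root mean squared error as a stand-in for the "estimation risk" are all assumptions, and the function names are hypothetical. BILOG-MG itself is not called.

```python
import numpy as np


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    P = c + (1 - c) / (1 + exp(-a * (theta - b)))
    Rows are examinees, columns are items.
    """
    theta = np.asarray(theta, dtype=float)[:, None]
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))


def simulate_responses(theta, a, b, c, rng):
    """Draw a 0/1 response matrix from the 3PL probabilities."""
    p = p_correct_3pl(theta, a, b, c)
    return (rng.random(p.shape) < p).astype(int)


def rmse(estimate, truth):
    """Root mean squared error, used here as an assumed error measure."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


rng = np.random.default_rng(0)
n_examinees, n_items = 5460, 20          # 5460 is from the abstract; 20 is assumed
theta = rng.normal(0.0, 1.0, n_examinees)  # assumed ability distribution
a = rng.uniform(0.5, 2.0, n_items)         # assumed discrimination range
b = rng.normal(0.0, 1.0, n_items)          # assumed difficulty distribution
c = rng.uniform(0.1, 0.3, n_items)         # assumed guessing range
responses = simulate_responses(theta, a, b, c, rng)
```

In a full study, the response matrix would be passed to a calibration program, and `rmse` (or whichever risk measure the thesis defines) would compare the equated estimates with the generating values of `theta`, `a`, `b`, and `c`.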
For large-scale assessments, the spectrum of subject matter is usually wide, and the simultaneous sampling of items and students is a practical way to obtain representative indications of student performance. The balanced incomplete block design (BIB) and the non-equivalent groups with anchor test design (NEAT) are two popular test equating designs for this situation. The purpose of this study is to compare the performance of the BIB and NEAT designs for horizontal and vertical equating in large-scale assessment.
Two linking methods, one for BIB and the other for NEAT, are compared in this study. The effects of the number of anchor items, blocks, administered items, and examinees are explored. The results show that: 1. the estimation error decreases as the number of anchor items increases; 2. the estimation error decreases as the number of examinees increases; 3. for horizontal equating, BIB outperforms NEAT in estimating item parameters, and NEAT outperforms BIB in estimating examinee abilities; 4. for vertical equating, BIB outperforms NEAT in estimating the item discrimination parameter, and NEAT outperforms BIB in estimating examinee abilities, the item difficulty parameter, and the item guessing parameter.
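The study varies the number of item blocks (7, 9, and 13). As an illustration of what "balanced" means in a BIB booklet layout, the sketch below builds booklets by cyclically shifting a base set of block indices and checks that every pair of blocks shares a booklet exactly once. This is a textbook cyclic construction and not necessarily the layout used in the thesis, which the record does not describe. The base sets, booklet sizes (3 blocks for 7 blocks, 4 blocks for 13 blocks), and function names are assumptions; the construction does not cover the 9-block case.

```python
from collections import Counter
from itertools import combinations


def cyclic_bib(n_blocks, base):
    """Build n_blocks booklets by shifting a base set of block indices mod n_blocks."""
    return [sorted((s + shift) % n_blocks for s in base) for shift in range(n_blocks)]


def pair_counts(booklets):
    """Count how many booklets each pair of blocks shares."""
    counts = Counter()
    for booklet in booklets:
        counts.update(combinations(booklet, 2))
    return counts


def is_balanced(booklets, n_blocks):
    """True if every pair of distinct blocks appears together equally often."""
    counts = pair_counts(booklets)
    n_pairs = n_blocks * (n_blocks - 1) // 2
    return len(counts) == n_pairs and len(set(counts.values())) == 1


# Assumed base sets: perfect difference sets modulo 7 and modulo 13.
booklets_7 = cyclic_bib(7, (0, 1, 3))
booklets_13 = cyclic_bib(13, (0, 1, 3, 9))

for booklet in booklets_7:
    print(booklet)
print(is_balanced(booklets_7, 7), is_balanced(booklets_13, 13))
```

Because every pair of blocks is taken together by some group of examinees, all blocks can be placed on one scale in a single calibration run, which is the property that distinguishes BIB from a design that links forms only through a dedicated anchor set.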
Table of Contents
Chapter 1 Introduction
Section 1 Research Motivation and Purpose
Section 2 Definition of Terms
Chapter 2 Literature Review
Section 1 Item Response Theory
Section 2 Meaning and Types of Test Equating
Section 3 Test Equating Designs
Section 4 Equating Methods in Classical Test Theory
Section 5 Equating Methods in Item Response Theory
Chapter 3 Research Method
Section 1 Research Procedure
Section 2 Variable Settings for the Equating Designs
Section 3 BIB Equating Design
Section 4 NEAT Equating Design
Section 5 Research Instruments
Chapter 4 Results
Section 1 Estimation Results after Equating under the BIB Design
Section 2 Estimation Results after Equating under the NEAT Design
Section 3 Overall Comparison of Estimation Results after Equating under the BIB and NEAT Designs
Chapter 5 Conclusions and Suggestions for Improvement
Section 1 Conclusions
Section 2 Suggestions for Improvement
References
Chinese-language references
English-language references
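Chinese-language references
王寶墉 (1995). Modern Test Theory [現代測驗理論]. Taipei: 心理出版社.
李源煌、楊玉女 (2000). A draft proposal for a profession-oriented joint college entrance examination [以專業導向為準則之大學聯考草案]. 文教新潮, 5(1).
李源煌、楊玉女 (2000). Theoretical foundations for establishing subject assessment scales [建立學科評量量尺之理論基礎]. 中國測驗學會測驗年刊, 47(1), 95-116.
李文忠 (1995). Test equating and ability growth curves examined with equating models based on nonparametric response theory [以無參數反應理論之等化模式探討測驗等化與能力成長曲線]. Unpublished master's thesis, Graduate Institute of Elementary Education, 國立台中師範學院.
吳裕益 (1991). Application of IRT equating methods to item bank construction [IRT等化法在題庫建立之應用]. 初等教育學報, 4, 319-365. Department of Elementary Education, 國立臺南師範學院.
洪碧霞、吳裕益、陳英豪 (1991). A series of studies on IRT parameter scaling: effects of the number and ability characteristics of examinees, the number and difficulty characteristics of common items, and the linking method on linking effectiveness [IRT參數量尺化系列研究]. National Science Council report, NSC 80-0301-H-024-01.
陳煥文 (2004). A study of linking properties in vertical equating: a comparison of four linking methods [垂直等化連結特性之研究-四種連結方法的比較]. National Science Council research project.
曾玉琳、王暄博、郭伯臣、許天維 (2006). The effect of different BIB designs on test equating [不同BIB設計對測驗等化的影響]. 測驗統計年刊, 13(2), 209-229. Taichung: 國立台中教育大學.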


English-language references
Allen, N. L., Donoghue, J. R., & Schoeps, T. L. (2001). The NAEP 1998 technical report. Washington, DC: National Center for Education Statistics.
Angoff, W. H. (1984). Scaling, Norming, and Equating. Princeton, NJ: Educational Testing Service.
Baker, F. B. (1992). Item Response Theory: Parameter Estimation Techniques. New York: Marcel Dekker, Inc.
Braun, H. I., & Holland, P. W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 9-49). New York: Academic.
Driscoll, D. P. (2002). 2001 MCAS Technical Report. Malden, MA: Massachusetts Department of Education.
Dorans, N. J., & Holland, P. W. (2000). Linking Scores from Multiple Instruments.
Hanson, B. A., & Beguin, A. A. (2002). Obtaining a Common Scale for Item Response Theory Item Parameters Using Separate Versus Concurrent Estimation in the Common-Item Equating Design. Applied Psychological Measurement, 26, 3-24.
Hambleton, R. K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston, MA: Kluwer-Nijhoff.
Haebara, T. (1980). Equating Logistic Ability Scales by a Weighted Least Squares Method. Japanese Psychological Research, 22, 144-149.
Kolen, M. J. (2000). Issues in Combining State NAEP and Main NAEP. In J. W. Pellegrino, L. R. Jones, & K. J. Mitchell (Eds.), Grading the Nation’s Report Card: Research from the Evaluation of NAEP. Committee on the Evaluation of National and State Assessments of Educational Progress. Board on Educational Testing and Assessment. Washington, DC: National Academy Press.
Kuehl, R. O. (2000). Design of Experiments: Statistical Principles of Research Design and Analysis. CA: Duxbury Press.
Kim, S. H., & Cohen, A. S. (1998). A Comparison of Linking and Concurrent Calibration Under Item Response Theory. Applied Psychological Measurement, 22, 131-143.
Kolen, M. J., & Brennan, R. L. (1995). Test Equating: Methods and Practices. New York: Springer-Verlag.
Klein, L. W., & Jarjoura, D. (1985). The importance of content representation for common-item equating with non-random groups. Journal of Educational Measurement, 22, 197-206.
Lord, F. M. (1980). Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Lawrence Erlbaum.
Mislevy, R. J., & Bock, R. D. (1990). BILOG-3 (2nd ed.): Item analysis and test scoring with binary logistic models. Mooresville: Scientific Software.
Mislevy, R. J., & Bock, R. D. (1982). Implementation of the EM algorithm in the estimation of item parameters: The BILOG computer program. In Item Response Theory and Computerized Adaptive Testing Conference Proceedings (Wayzata, MN).
NAEP Mathematics Consensus Project (2001). Mathematics Framework for the 1996 and 2000 National Assessment of Educational Progress. National Assessment Governing Board, U.S. Department of Education.
National Research Council. (1999). Uncommon Measures: Equivalency and Linkage of Educational Tests. Washington, DC: Author.
Nemhauser, G. L., & Wolsey, L. A. (1999). Integer and Combinatorial Optimization. New York: John Wiley.
Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1993). Scaling, Norming, and Equating. In R. L. Linn (Ed.), Educational Measurement (3rd ed., pp. 221-262). New York: Macmillan.
Stocking, M. L., & Lord, F. M. (1983). Developing a Common Metric in Item Response Theory. Applied Psychological Measurement, 7(2), 201-211.
Tianyou, W. (2005). An Alternative Continuization Method to the Kernel Method in von Davier, Holland and Thayer's (2004) Test Equating Framework.
van der Linden, W. J., Veldkamp, B. P., & Carlson, J. E. (2004). Optimizing Balanced Incomplete Block Designs for Educational Assessments. Applied Psychological Measurement, 28, 317-331.
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004). The kernel method of test equating. New York: Springer.
Weiss, D. J., & Yoes, M. E. (1991). Item response theory. In R. K. Hambleton & J. N. Zaal (Eds.), Advances in educational and psychological testing. Boston: Kluwer Academic Publishers.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (2003). BILOG-MG. Scientific Software International.