National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

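Graduate student: 陳仁欽
Graduate student (romanized): Ian-Iam Chan
Thesis title: 多分題之多向度電腦適性測驗
Thesis title (English): Multidimensional Computerized Adaptive Testing for Polytomous Items
Advisor: 王文中
Advisor (romanized): Wen-Chung Wang
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Psychology (心理學所)
Discipline: Social and Behavioral Sciences
Field: Psychology
Thesis type: Academic thesis
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Pages: 105
Chinese keywords: 等級反應模式; 測量標準誤; 平均開方誤; 多分題; 題間多向度; 多向度電腦化適性測驗; 最大事後分佈訊息量行列式法; 多向度試題反應理論; 一般部份計分模式
English keywords: standard error; between-item multidimensional tests; root mean square error; determinant of the posterior information; multidimensional item response theory; the generalized partial credit model; the graded response model; polytomous items; multidimensional computerized adaptive testing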
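When several tests (an ability test battery or a multi-facet personality inventory) are to be administered together by computerized adaptive testing (CAT), the simplest approach is to run CAT on each subtest or subscale one at a time. This approach ignores the correlations among the tests and therefore cannot use them to improve testing efficiency. This study adopts between-item multidimensional computerized adaptive testing (MCAT), which makes full use of the correlations among tests to reduce the number of items administered while maintaining test reliability, and thus improves testing efficiency.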
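This study first derives the item-selection algorithm and the ability-estimation equations of MCAT for polytomous items, and then conducts simulation experiments. Two polytomous multidimensional item response models derived by the author (the graded response model and the generalized partial credit model) were used, and the MCAT programs were written with a Fortran 90 compiler. MCAT was simulated for 10,000 examinees under three numbers of subtests (2, 6, and 12 dimensions) and three levels of between-dimension correlation (r = 0.2, 0.5, and 0.8). The MCAT procedure uses the criterion proposed by Segall (1996), maximizing the determinant of the posterior information matrix, as the item-selection rule, and computes the maximum a posteriori (MAP) estimate during the test. After the test ends, ability estimates are computed with three estimation methods (MAP, expected a posteriori, and maximum likelihood), together with their root mean square error (RMSE), bias, testing efficiency, and standard error of the ability estimates. These measures are used to examine the accuracy of ability estimation and the testing efficiency under each condition.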
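The determinant-based selection rule described above can be illustrated with a minimal Python sketch. It is not the thesis's Fortran 90 program. It assumes a between-item design in which each graded-response item loads on exactly one dimension, a logistic form of the graded response model, and a multivariate normal prior with covariance `prior_cov`. The function names, the `(dimension, a, thresholds)` pool layout, and the numbers in the example are all hypothetical.

```python
import numpy as np

def grm_item_information(theta_d, a, b):
    """Fisher information of one graded-response item at ability theta_d.

    a is the discrimination and b the increasing category thresholds
    (logistic form of the graded response model, assumed here).
    """
    b = np.asarray(b, dtype=float)
    # Cumulative probabilities P*(X >= k), padded with 1 and 0 at the ends.
    inner = 1.0 / (1.0 + np.exp(-a * (theta_d - b)))
    p_star = np.concatenate(([1.0], inner, [0.0]))
    d_star = a * p_star * (1.0 - p_star)   # derivatives, zero at the padded ends
    p_cat = p_star[:-1] - p_star[1:]       # category probabilities
    d_cat = d_star[:-1] - d_star[1:]       # their derivatives
    return float(np.sum(d_cat ** 2 / p_cat))

def select_item(theta, info_so_far, prior_cov, pool, administered):
    """Index of the unused item that maximizes the determinant of the
    posterior information matrix (information so far + candidate + prior)."""
    base = info_so_far + np.linalg.inv(prior_cov)
    best, best_det = None, -np.inf
    for j, (dim, a, b) in enumerate(pool):
        if j in administered:
            continue
        cand = base.copy()
        # A between-item item only adds information on its own dimension.
        cand[dim, dim] += grm_item_information(theta[dim], a, b)
        det = np.linalg.det(cand)
        if det > best_det:
            best, best_det = j, det
    return best

# Hypothetical two-dimensional example with correlation 0.8.
prior_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
pool = [(0, 1.5, [-1.0, 0.0, 1.0]),
        (1, 1.5, [-1.0, 0.0, 1.0]),
        (0, 0.8, [-0.5, 0.5])]
theta_hat = np.zeros(2)
info_so_far = np.diag([5.0, 0.0])   # dimension 0 is already well measured
chosen = select_item(theta_hat, info_so_far, prior_cov, pool, administered=set())
```

In this example the first dimension already carries most of the accumulated information, so the rule picks the item on the second dimension, where the posterior variance is still larger.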
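The results show that MCAT makes effective use of the correlations among dimensions to improve efficiency. Under both the graded response model and the generalized partial credit model, the higher the correlation among dimensions and the larger the number of dimensions, the greater the efficiency of MCAT relative to UCAT (unidimensional computerized adaptive testing), especially for short tests. In terms of the accuracy of ability estimation, MCAT also performs better than UCAT at the two extremes of the ability range. In sum, MCAT can make full use of the correlations among tests to improve testing efficiency.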
The most common way to apply computerized adaptive testing (CAT) to multiple tests (e.g., an ability test battery or multidimensional personality scales) is to run a separate CAT procedure on each subtest or subscale. This unidimensional approach ignores the correlations between subtests and therefore cannot exploit them to improve testing efficiency. In this research, I develop algorithms for multidimensional computerized adaptive testing (MCAT) with polytomous items, in which the correlations between subtests are used to reduce the number of administered items and to improve testing efficiency.
The thesis begins with the derivation of item-selection rules and ability-estimation equations for MCAT. Two polytomous multidimensional item response models (the graded response model and the generalized partial credit model) are used to evaluate the MCAT procedures under various conditions: number of dimensions (2, 6, and 12) and between-dimension correlation (r = 0.2, 0.5, and 0.8). Root mean square error and the standard error of measurement are used to assess the testing efficiency of MCAT relative to traditional unidimensional CAT (UCAT).
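As a rough illustration of the accuracy measures named above, the following Python sketch computes per-dimension bias and root mean square error from simulated true and estimated abilities. The `relative_efficiency` helper is an assumption for illustration only: it uses a ratio of mean squared errors, which is one common convention, and the abstract does not state which definition the thesis uses. All names and numbers are hypothetical.

```python
import numpy as np

def bias(est, true):
    """Mean signed error per dimension (rows = examinees, columns = dimensions)."""
    return np.mean(est - true, axis=0)

def rmse(est, true):
    """Root mean square error per dimension."""
    return np.sqrt(np.mean((est - true) ** 2, axis=0))

def relative_efficiency(est_mcat, est_ucat, true):
    """Hypothetical efficiency index: UCAT mean squared error divided by
    MCAT mean squared error, averaged over dimensions. Values above 1
    favor MCAT. This definition is an assumption, not taken from the thesis."""
    mse_ucat = np.mean(rmse(est_ucat, true) ** 2)
    mse_mcat = np.mean(rmse(est_mcat, true) ** 2)
    return mse_ucat / mse_mcat

# Tiny made-up example: two examinees, two dimensions.
true_theta = np.array([[0.0, 0.0], [1.0, -1.0]])
est_mcat = true_theta + 0.5   # every MCAT estimate is off by +0.5
est_ucat = true_theta - 1.0   # every UCAT estimate is off by -1.0

mcat_bias = bias(est_mcat, true_theta)
mcat_rmse = rmse(est_mcat, true_theta)
efficiency = relative_efficiency(est_mcat, est_ucat, true_theta)
```

In a simulation like the one described, the same functions would be applied to the estimates from all simulated examinees in each condition, and the per-dimension RMSE values averaged across dimensions.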
The results show that the higher the between-dimension correlation and the larger the number of dimensions, the more efficient MCAT is relative to UCAT, especially when tests are short. In sum, MCAT can exploit between-dimension correlations to improve testing efficiency.
Table of Contents
Chapter 1. Introduction
Section 1. Research Motivation
Section 2. The Importance of Multidimensional IRT
Section 3. Multidimensional Computerized Adaptive Testing (MCAT)
Section 4. MCAT for Polytomous Scoring Models
Section 5. Research Questions
Chapter 2. Literature Review
Section 1. Review of MIRT
Section 2. Ability Estimation in MCAT
Section 3. Item Selection in MCAT
Chapter 3. Method
Section 1. Derivation of Formulas
Section 2. Simulation Study Design
Section 3. Research Procedure
Chapter 4. Results
Section 1. Experiment 1 (GRM)
Section 2. Experiment 2 (GPCM)
Chapter 5. Conclusions and Discussion
Section 1. Conclusions
Section 2. Suggestions for Future Research
References
Appendices
Appendix A. Parameter-Generation Program for the GRM
Appendix B. MCAT Procedure Program for the GRM
Appendix C. Parameter-Generation Program for the GPCM
Appendix D. MCAT Procedure Program for the GPCM
Appendix E. Derivation of the First Derivative for the GRM
Appendix F. Derivation of the Second Derivative for the GRM
Appendix G. Derivation of the Information Matrix Formula for the GRM
Appendix H. Derivation of the First Derivative for the GPCM
Appendix I. Derivation of the Second Derivative for the GPCM
Appendix J. Derivation of the Information Matrix Formula for the GPCM
Appendix K. Ability Estimation Formula for EAP

List of Tables
Table 1. Summary of study variables
Table 2. Summary of item parameter constraints
Table 3. RMSE averaged across dimensions, two dimensions, GRM
Table 4. RMSE averaged across dimensions, six dimensions, GRM
Table 5. RMSE averaged across dimensions, twelve dimensions, GRM
Table 6. RMSE averaged across dimensions, two dimensions, GPCM
Table 7. RMSE averaged across dimensions, six dimensions, GPCM
Table 8. RMSE averaged across dimensions, twelve dimensions, GPCM

List of Figures
Figure 1. Two types of multidimensional test models
Figure 2. RMSE for the two-dimensional GRM (MAP estimation)
Figure 3. RMSE for the six-dimensional GRM (MAP estimation)
Figure 4. RMSE for the twelve-dimensional GRM (MAP estimation)
Figure 5. Bias of MAP estimates for the six-dimensional GRM with high correlation (8 items per dimension)
Figure 6. SE of MAP estimates for the six-dimensional GRM with high correlation (17 items per dimension)
Figure 7. RMSE for the two-dimensional GPCM (MAP estimation)
Figure 8. RMSE for the six-dimensional GPCM (MAP estimation)
Figure 9. RMSE for the twelve-dimensional GPCM (MAP estimation)
Figure 10. Bias of MAP estimates for the six-dimensional GRM with high correlation (8 items per dimension)
Figure 11. SE of MAP estimates for the six-dimensional GPCM with high correlation (17 items per dimension)
References
Ackerman, T. A. (1991). The use of unidimensional parameter estimates of
multidimensional items in adaptive testing. Applied Psychological
Measurement, 13, 113-127.
Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item
validity from a multidimensional perspective. Journal of Educational
Measurement, 29, 67-91.
Ackerman, T. A. (1994). Using multidimensional item response theory to understand
what items and tests are measuring. Applied Measurement in Education, 18,
255-278.
Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random
coefficients multinomial logit model. Applied Psychological Measurement, 21,
1-23.
Andrich, D. (1978). A rating formulation for ordered response categories.
Psychometrika, 43, 561-573.
Baker, F. B. (1992). The graded item response. In F. B. Baker, Item response theory:
Parameter estimation techniques (pp. 222-250). New York: Marcel Dekker.
Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a
microcomputer environment. Applied Psychological Measurement, 6,
431-444.
Bloxom, B. M., & Vale, C. D. (1987, June). Multidimensional adaptive testing: A
procedure for sequential estimation of the posterior centroid and dispersion of
theta. Paper presented at the meeting of the Psychometric Society, Montreal,
Canada.
Chang, H. H., & Ying, Z. (1996). A global information approach to computerized
adaptive testing. Applied Psychological Measurement, 20, 213-229.
Chen, S. Y., Ankenmann, R. D., & Chang, H. H. (2000). A comparison of item
selection rules at the early stages of computerized adaptive testing. Applied
Psychological Measurement, 24, 241-255.
Davey, T., & Parshall, C. G. (1995, April). New algorithms for item selection and
exposure control with computerized adaptive testing. Paper presented at the
annual meeting of the American Educational Research Association, San
Francisco.
Embretson, S. (1980). Multicomponent latent trait models for ability tests.
Psychometrika, 45, 479-494.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and
applications. Boston: Kluwer-Nijhoff.
Hattie, J. (1981). Decision criteria for determining unidimensional and
multidimensional normal ogive models of latent trait theory. Armidale,
Australia: The University of New England, Center for Behavioral Studies.
Li, Y. H. (2000). An evaluation of the accuracy of multidimensional IRT linking.
Applied Psychological Measurement, 24, 115-138.
Li, Y. H., & Schafer, W. D. (2005). Trait parameter recovery using multidimensional
computerized adaptive testing in reading and mathematics. Applied
Psychological Measurement, 29, 3-25.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores.
Reading, MA: Addison-Wesley.
Lord, F. M. (1980). Applications of item response theory to practical testing problems.
Hillsdale, NJ: Erlbaum.
Luecht, R. M. (1996). Multidimensional computerized adaptive testing in a
certification or licensure context. Applied Psychological Measurement, 20,
398-404.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47,
149-174.
McDonald, R. P. (1985). Unidimensional and multidimensional models for item
response theory. In D. J. Weiss (Ed.), Proceedings of the 1982 Computerized
Adaptive Testing Conference (pp. 127-148). Minneapolis: University of
Minnesota, Department of Psychology, Psychometrics Methods Program.
McDonald, R. P. (2000). A basis for multidimensional item response theory. Applied
Psychological Measurement, 24, 99-114.
McKinley, R. L., & Reckase, M. D. (1983). MAXLOG: A computer program for the
estimation of the parameters of a multidimensional logistic model. Behavior
Research Methods & Instrumentation, 15, 389-390.
Muraki, E. (1992). A generalized partial credit model: Application of an EM
algorithm. Applied Psychological Measurement, 16, 159-176.
Muraki, E. (1993). Information functions of the generalized partial credit model.
Applied Psychological Measurement, 17, 351-363.
Owen, R. J. (1975). A Bayesian sequential procedure for quantal response in the
context of adaptive mental testing. Journal of the American Statistical
Association, 70, 351-356.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests.
Copenhagen, Denmark: Danish Institute for Educational Research.
Reckase, M. D. (1985). The difficulty of items that measure more than one ability.
Applied Psychological Measurement, 9, 401-412.
Reckase, M. D., & McKinley, R. L. (1991). The discrimination power of items that
measure more than one ability. Applied Psychological Measurement, 15,
361-373.
Reckase, M. D. (1997). The past and future of multidimensional item response theory.
Applied Psychological Measurement, 21, 24-36.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded
scores. Psychometrika Monograph, No. 17.
Samejima, F. (1974). Normal ogive model on the continuous response level in the
multidimensional latent space. Psychometrika, 39, 111-121.
Segall, D. O. (1996). Multidimensional adaptive testing. Psychometrika, 61, 331-354.
Segall, D. O. (2001). Principles of multidimensional adaptive testing. In W. J. van der
Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and
practice (pp. 53-73). Boston, MA: Kluwer Academic Publishers.
Stocking, M. L., & Lewis, C. (1998). Controlling item exposure conditional on ability
in computerized adaptive testing. Journal of Educational and Behavioral
Statistics, 23, 57-75.
Sympson, J. B., & Hetter, R. D. (1985, October). Controlling item-exposure rates in
computerized adaptive testing. In Proceedings of the 27th Annual Meeting of
the Military Testing Association (pp. 973-977). San Diego, CA: Navy
Personnel Research and Development Center.
Tam, S. S. (1992). A comparison of methods for adaptive estimation of a
multidimensional trait. Dissertation Abstracts International,
53(03), 1646B. (UMI No. 9221219)
Van der Linden, W. J. (1999). Multidimensional adaptive testing with a minimum
error-variance criterion. Journal of Educational and Behavioral Statistics, 24,
398-412.
Veldkamp, B. P., & van der Linden, W. J. (2002). Multidimensional adaptive testing
with constraints on test content. Psychometrika, 67, 575-588.
Wainer, H., Dorans, N. J., Flaugher, R., Mislevy, R. J., Thissen, D., Eignor, D., Green,
B. F., & Steinberg, L. (2000). Computerized adaptive testing: A primer (2nd
ed.). Mahwah, NJ: Lawrence Erlbaum.
Wang, W. C., Wilson, M. R., & Adams, R. J. (1997). Rasch models for
multidimensionality between items and within items. In M. Wilson, G.
Engelhard, & K. Draney (Eds.), Objective measurement: Theory into practice
(Vol. 4, pp. 139-155). Norwood, NJ: Ablex.
Wang, W. C., & Chen, P. H. (2004). Implementation and measurement efficiency for
multidimensional CAT. Applied Psychological Measurement, 28, 295-316.
Wang, W. C., Chen, P. H., & Cheng, Y. Y. (2004). Improving measurement precision
of test batteries using multidimensional item response models. Psychological
Methods, 9, 116-136.