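Graduate student: 林郁馨
Graduate student (romanized): Yuk-Hing Lam
Thesis title: Logistic regression在多分題差異試題功能之檢測及效果量之計算
Thesis title (English): The Detection and Effect Size Computation of Differential Item Functioning in Polytomous Items with Logistic Regression
Advisor: 王文中 (Wen-Chung Wang)
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Clinical Psychology
Discipline: Social and Behavioral Sciences
Field: Psychology
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar; 2006-2007)
Language: Chinese
Number of pages: 166
Keywords (Chinese, translated): logistic regression; polytomous items; differential item functioning; detection and effect size
Keywords (English): polytomous; logistic regression; differential item functioning; effect size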
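Past research recommends considering the effect size of a result whenever a statistical test is used. In recent work on logistic regression for detecting differential item functioning (DIF) in polytomous items, Zumbo (1999) proposed R-squared as an effect size measure. Following 鄭致寯 (2004), this study designed two simulation studies, Situation 1 under the partial credit model (PCM) and Situation 2 under the generalized partial credit model (GPCM), to compare the DIF detection rates of logistic discriminant function analysis (LDFA; Miller & Spray, 1993) and ordinal logistic regression (OLR; Zumbo, 1999). Detection rates cover both the true positive rate and the false positive rate, and the effect sizes of the detection results are examined as well. Finally, an empirical data set is analyzed as an illustration.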
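For orientation, the two procedures compared here are commonly written as the nested models below. This is a sketch of the standard formulations rather than a quotation from the thesis; the symbols $X$ (matching total score), $G$ (group indicator), $Y$ (item response with categories $0, \dots, m$), and the coefficient names are assumed for illustration.

```latex
% OLR (cumulative logit): the item response is the outcome
\operatorname{logit} P(Y \le j \mid X, G) = \alpha_j + b_1 X + b_2 G + b_3 X G,
\qquad j = 0, \dots, m-1

% LDFA: group membership is the outcome
\operatorname{logit} P(G = 1 \mid X, Y) = \beta_0 + \beta_1 X + \beta_2 Y + \beta_3 X Y
```

In both cases DIF is typically tested with a likelihood-ratio chi-square on 2 degrees of freedom that compares the full model with the model keeping only the $X$ term. The coefficient $b_2$ (or $\beta_2$) reflects uniform DIF and $b_3$ (or $\beta_3$) nonuniform DIF.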
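The results show that both methods controlled the false positive rate better in pure polytomous tests than in mixed-format tests, and also achieved higher true positive rates. In both test formats, for the same percentage of DIF items, the false positive and true positive rates increased as DIF magnitude grew. False positive rates became increasingly inflated as the percentage of DIF items and the difference in mean ability between the two groups increased, and OLR controlled the false positive rate less well than LDFA.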
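For effect size, all three R-squared computations showed larger effect sizes in pure polytomous tests than in mixed-format tests. For the same percentage of DIF items, effect size increased with DIF magnitude and decreased as the difference in mean ability between the two groups increased. In addition, new classification criteria are proposed for each of the three R-squared measures.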
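In sum, logistic regression detects DIF more effectively in pure polytomous tests. When the difference in mean ability between the two groups is large and the test is mixed-format, the LDFA method is recommended; the OLR method, however, allows R-squared to be used to gauge the effect size of the DIF detection results.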
Previous studies advise examining an effect size measure alongside any statistical test. Recently, Zumbo (1999) proposed R-squared as an effect size measure when logistic regression is used to detect differential item functioning (DIF) in polytomous items. Building on 鄭致寯 (2004), this study conducted two simulation studies to compare the DIF detection performance of logistic discriminant function analysis (LDFA; Miller & Spray, 1993) and ordinal logistic regression (OLR; Zumbo, 1999). Situation 1 used the partial credit model (PCM) and Situation 2 the generalized partial credit model (GPCM). Detection performance was summarized by true positive and false positive rates. Three R-squared measures were also computed as effect sizes. Finally, a real data analysis was performed.
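To make the data-generation step concrete, the following is a minimal Python sketch of drawing polytomous responses under the GPCM (which reduces to the PCM when the slope is 1). It is not the thesis's own generation code, which appears only in its appendices. The function names, the constant-shift form of DIF (`dif_shift`), the sample sizes, and every parameter value are assumptions chosen for illustration.

```python
import math
import random

def gpcm_probs(theta, deltas, a=1.0):
    """Category probabilities for scores 0..m under the GPCM.

    `deltas` holds the m step difficulties; a == 1.0 gives the PCM.
    """
    # Cumulative sums of a * (theta - delta_j); the score-0 term is 0.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + a * (theta - delta))
    top = max(exponents)  # subtract the max for numerical stability
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def simulate_item(thetas, deltas, a=1.0, dif_shift=0.0, rng=random):
    """Draw one response per examinee.

    `dif_shift` is added to every step difficulty, a hypothetical
    constant-shift form of DIF used here only for illustration.
    """
    shifted = [d + dif_shift for d in deltas]
    scores = []
    for theta in thetas:
        probs = gpcm_probs(theta, shifted, a)
        scores.append(rng.choices(range(len(probs)), weights=probs)[0])
    return scores

rng = random.Random(2007)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
focal = [rng.gauss(-0.5, 1.0) for _ in range(500)]  # assumed lower mean ability
ref_scores = simulate_item(reference, [-0.5, 0.5], rng=rng)
foc_scores = simulate_item(focal, [-0.5, 0.5], dif_shift=0.3, rng=rng)
```

Shifting the focal group's step difficulties makes the item harder for that group at every ability level, which is one simple way to induce DIF; a mean ability difference between groups is a separate factor and is set through the two `gauss` calls.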
The results show that both methods performed better when tests contained only polytomous items than when tests contained both dichotomous and polytomous items. For a fixed percentage of DIF items, both the false positive and the true positive rates increased with DIF magnitude. False positive rates were greatly inflated when the percentage of DIF items or the mean latent trait difference between groups increased. OLR also controlled false positive rates less well than LDFA.
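The two rates reported above can be tallied from simulation output as in the sketch below. It assumes each replication yields one flag per item and that the indices of the items simulated with DIF are known. The function name and the data layout are illustrative, not taken from the thesis.

```python
def detection_rates(flags, dif_items):
    """Return (false_positive_rate, true_positive_rate).

    flags: list of replications, each a list of booleans (one per item),
           True when the item was flagged as showing DIF.
    dif_items: set of indices of items that were simulated with DIF.
    """
    fp = tp = n_clean = n_dif = 0
    for rep in flags:
        for item, flagged in enumerate(rep):
            if item in dif_items:
                n_dif += 1
                tp += flagged  # flagged and truly DIF
            else:
                n_clean += 1
                fp += flagged  # flagged but DIF-free
    fpr = fp / n_clean if n_clean else float("nan")
    tpr = tp / n_dif if n_dif else float("nan")
    return fpr, tpr

# Two replications of a four-item test in which only item 0 has DIF.
flags = [[True, False, False, True],
         [True, False, True, False]]
fpr, tpr = detection_rates(flags, dif_items={0})
```

Here item 0 is flagged in both replications, so the true positive rate is 1.0, while 2 of the 6 DIF-free item checks are flagged, giving a false positive rate of 1/3.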
All three R-squared measures yielded larger effect sizes when tests contained only polytomous items than when tests contained both dichotomous and polytomous items. For a fixed percentage of DIF items, the effect size measures increased with DIF magnitude and decreased as the mean latent trait difference between groups increased. New classification criteria are proposed for each of the three R-squared measures.
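An R-squared effect size for DIF is usually taken as the gain in R-squared when the group and interaction terms are added to a model containing only the matching score. This record does not say which three R-squared computations the thesis used, so the sketch below substitutes three common likelihood-based pseudo R-squared measures (McFadden, Cox-Snell, Nagelkerke) as stand-ins. The function names and the log-likelihood values are hypothetical.

```python
import math

def pseudo_r2(ll_model, ll_null, n):
    """Three likelihood-based pseudo R-squared values for a fitted model.

    ll_model: log-likelihood of the fitted model
    ll_null:  log-likelihood of the intercept-only model
    n:        number of observations
    """
    mcfadden = 1.0 - ll_model / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # upper bound of Cox-Snell
    nagelkerke = cox_snell / max_cox_snell
    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}

def delta_r2(ll_full, ll_reduced, ll_null, n):
    """Gain in each pseudo R-squared from adding the DIF terms."""
    full = pseudo_r2(ll_full, ll_null, n)
    reduced = pseudo_r2(ll_reduced, ll_null, n)
    return {name: full[name] - reduced[name] for name in full}

# Hypothetical log-likelihoods: null, matching-score-only, and full model.
effect = delta_r2(ll_full=-80.0, ll_reduced=-90.0, ll_null=-100.0, n=100)
```

Because the reduced model is nested in the full model, each difference is non-negative; the thresholds for calling a difference negligible, moderate, or large are a separate choice and are not reproduced here.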
In conclusion, logistic regression performs better when tests contain only polytomous items. When the mean latent trait difference between groups is large or tests contain both dichotomous and polytomous items, LDFA is recommended. OLR, in turn, provides R-squared as a measure of the effect size of DIF.
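Chinese Abstract
English Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
Section 1. DIF Detection Procedures
Logistic Regression (LR)
Ordinal Logistic Regression (OLR)
Logistic Discriminant Function Analysis (LDFA)
Purification Procedure
Section 2. Findings on Related Issues in Past DIF Research
Data-Generating Models
Differences in Ability Distributions Between the Reference and Focal Groups
Effect Size
Section 3. Research Questions and Hypotheses
Chapter 2. Method
Section 1. Research Procedure
Data Generation
Research Design
Analysis Procedure
Chapter 3. Results
Section 1. Situation 1
Detection Rates
Effect Size
Summary of Situation 1
Section 2. Situation 2
Detection Rates
Effect Size
Summary of Situation 2
Section 3. Real Data Analysis
Chapter 4. Summary and Discussion
References
Appendix A. Flowchart of the Research Design
Appendix B. Data-Generation Code for Mixed-Format Tests under the GPCM
Appendix C. Data-Generation Code for Pure Polytomous Tests under the GPCM
Appendix D. Data-Generation Code for Mixed-Format Tests under the PCM
Appendix E. Data-Generation Code for Pure Polytomous Tests under the PCM
Appendix F. SAS Analysis Code for OLR, Mixed-Format Tests
Appendix G. SAS Analysis Code for OLR, Pure Polytomous Tests
Appendix H. SAS Analysis Code for LDFA, Mixed-Format Tests
Appendix I. SAS Analysis Code for LDFA, Pure Polytomous Tests
Appendix J. Purification Procedure Code for OLR, Mixed-Format Tests
Appendix K. Purification Procedure Code for OLR, Pure Polytomous Tests
Appendix L. Purification Procedure Code for LDFA, Mixed-Format Tests
Appendix M. Purification Procedure Code for LDFA, Pure Polytomous Tests
Appendix N. Data-Processing Code for OLR, Mixed-Format Tests
Appendix O. Data-Processing Code for OLR, Pure Polytomous Tests
Appendix P. Data-Processing Code for LDFA, Mixed-Format Tests
Appendix Q. Data-Processing Code for LDFA, Pure Polytomous Tests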
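鄭致寯 (2004). Logistic迴歸在檢測多分題差異試題功能之效果 [The effectiveness of logistic regression in detecting differential item functioning in polytomous items]. Master's thesis, Graduate Institute of Psychology, National Chung Cheng University.
錢才瑋, 王文中, 陳承德, 張文信, 林宏榮, & 劉歐 (2006). Rasch分析在醫療界之應用 [Applications of Rasch analysis in health care]. 聞道出版社.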
Agresti, A. (1990). Categorical data analysis. New York: Wiley.
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
Bradley, J. V. (1978). Robustness? The British Journal of Mathematical and Statistical Psychology, 31, 144-152.
Clauser, B., Mazor, K., & Hambleton, R. K. (1993). The effects of purification of the matching criterion on the identification of DIF using the Mantel-Haenszel procedure. Applied Measurement in Education, 6, 269-279.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35-66). Hillsdale, NJ: Lawrence Erlbaum Associates.
Fidalgo, A. M., Mellenbergh, G. J., & Muniz, J. (2000). Effects of amount of DIF, test length, and purification type on robustness and power of Mantel-Haenszel procedures. Methods of Psychological Research, 5, 43-53.
French, A. W., & Miller, T. R. (1996). Logistic regression and its use in detecting differential item functioning in polytomous items. Journal of Educational Measurement, 33(3), 315-332.
Hidalgo-Montesinos, M. D., & Gómez-Benito, J. (2003). Test purification and the evaluation of differential item functioning with multinomial logistic regression. European Journal of Psychological Assessment, 19(1), 1-11.
Hidalgo, M. D., & López-Pina, J. A. (2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64(6), 903-915.
Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Erlbaum.
Jodoin, M. G., & Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education, 14(4), 329-349.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746-759.
Lee, Y. W., Breland, H., & Muraki, E. (2005, April). Comparability of TOEFL CBT essay prompts for different native language groups. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
Linacre, J. M. (2001). WINSTEPS [computer program]. Chicago, IL: http://www.winsteps.com.
Mantel, N. (1963). Chi-square tests with one degree of freedom: Extensions of the Mantel-Haenszel procedure. Journal of the American Statistical Association, 58, 690-700.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis. The American Statistician, 54, 17-24.
Miller, T. R., & Spray, J. A. (1993). Logistic discriminant function analysis for DIF identification of polytomously scored items. Journal of Educational Measurement, 30(2), 107-122.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.
Nohoon, K., Davenport, E. C., & Davison, M. L. (1998, April). A comparative study of observed score approaches and purification procedures for detecting differential item functioning. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
OLOGIT macro. (n.d.). Retrieved from http://www.xs4all.nl/~jhckx/spss/mlogist/OLOGIT.html.
Rogers, H. J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17(2), 105-116.
Roussos, L. A., & Stout, W. F. (1996). Simulation studies of the effects of small sample size and studied item parameters on SIBTEST and Mantel-Haenszel Type I error performance. Journal of Educational Measurement, 33, 215-230.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 17, 1-100.
SAS Institute. (1999). The LOGISTIC procedure [Computer software]. Cary, NC: Author.
Shealy, R. T., & Stout, W. F. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DIF as well as item bias/DIF. Psychometrika, 58, 159-197.
Shtatland, E. S., Kleinman, K., & Cain, E. M. (2002). One more time about measures of fit in logistic regression. NorthEast SAS Users Group Conference (NESUG), 15. Retrieved from http://www.nesug.org/html/Proceedings/nesug02/st/st004.pdf
Su, Y.-H., & Wang, W.-C. (2005). Efficiency of the Mantel, generalized Mantel-Haenszel, and logistic discriminant function analysis methods in detecting differential item functioning for polytomous items. Applied Measurement in Education, 18(4), 313-350.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361-370.
Wang, W.-C., & Su, Y.-H. (2004a). Effects of average signed area between two item characteristic curves and test purification procedures on the DIF detection via the Mantel-Haenszel method. Applied Measurement in Education, 17, 113-144.
Wang, W.-C., & Su, Y.-H. (2004b). Factors influencing the Mantel and generalized Mantel-Haenszel methods for the assessment of differential item functioning in polytomous items. Applied Psychological Measurement, 28(6), 450-480.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (or ordinal) item scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, Department of National Defense. Retrieved from http://www.edu.ubc.ca/faculty/zumbo/DIF/index.html.
Zumbo, B. D., & Thomas, D. R. (1996). A measure of DIF effect size using logistic regression procedures. Paper presented at the National Board of Medical Examiners, Philadelphia.
Zwick, R., & Ercikan, K. (1989). Analysis of differential item functioning in the NAEP history assessment. Journal of Educational Measurement, 26, 55-66.
Zwick, R., Donoghue, J. R., & Grima, A. (1993). Assessment of differential item functioning for performance tasks. Journal of Educational Measurement, 30, 233-251.