National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

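Author: 張雅媛
Author (romanized): Ya-Yuan Chang
Thesis title (original): 融合kernel smoothing之MMLE法於IRT參數估計之應用
Thesis title (English): Application of MMLE with kernel smoothing in estimating IRT parameters
Advisor: 郭伯臣
Advisor (romanized): Bor-Chen Kuo
Degree: Master's
Institution: National Taichung University of Education (國立臺中教育大學)
Department: Graduate Institute of Educational Measurement and Statistics
Discipline: Education
Subdiscipline: Educational Testing and Assessment
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 86
Keywords (translated from Chinese): marginal maximum likelihood estimation; Bayesian expected a posteriori estimation; kernel smoothing
Keywords (English): MMLE/EM; EAP; kernel smoothing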
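When BILOG-MG estimates item parameters by marginal maximum likelihood estimation with the EM algorithm (MMLE/EM), the numerical step that estimates the probability density function of ability relies on a histogram. This study instead uses a nonparametric method, kernel smoothing, to estimate the ability density, with the aim of overcoming the problems of histogram estimation and improving estimation accuracy.
To that end, the study developed its own program implementing marginal maximum likelihood estimation based on kernel smoothing (abbreviated MMLE/EM-MIX) and compared its estimation accuracy with that of BILOG-MG under different ability distributions.
In Experiment 1, for ability parameter estimation across the different ability distributions, MMLE/EM-MIX( ) generally yielded smaller estimation errors when the test had 60 items, whereas BILOG-MG generally yielded the smallest errors when the test had 30 items. For item parameter estimation, MMLE/EM-MIX( ) yielded the smallest errors with smaller samples, and BILOG-MG yielded the smallest errors with larger samples.
In Experiment 2, regardless of the examinees' ability distribution, the errors of the ability and item parameter estimates from MMLE/EM-MIX were generally smaller than those from BILOG-MG. However, because of differences in the setting of the ( ) value, the effect varied across parameters and conditions.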
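The contrast between a histogram and a kernel-smoothed estimate of the ability density can be sketched in a few lines. This is only a generic illustration, not the MMLE/EM-MIX program, whose details are not given in this record. The function names, the bimodal sample of provisional abilities, the 21-point quadrature grid, and the bandwidth rule (Silverman's rule of thumb) are all assumptions made for the example.

```python
import math
import random

def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def histogram_density(samples, grid):
    """Histogram density with equal-width bins centred on an equally spaced grid."""
    width = grid[1] - grid[0]
    counts = [0] * len(grid)
    for s in samples:
        k = int(math.floor((s - grid[0]) / width + 0.5))  # nearest grid point
        if 0 <= k < len(grid):  # samples outside the grid range are ignored
            counts[k] += 1
    return [c / (len(samples) * width) for c in counts]

def to_weights(density):
    """Rescale density values so they sum to 1 (discrete quadrature weights)."""
    total = sum(density)
    return [d / total for d in density]

# Hypothetical provisional ability estimates drawn from a bimodal population.
rng = random.Random(0)
abilities = [
    rng.gauss(-1.5, 0.5) if rng.random() < 0.5 else rng.gauss(1.5, 0.5)
    for _ in range(200)
]

# Assumed quadrature grid: 21 equally spaced points on [-4, 4].
grid = [-4.0 + 0.4 * k for k in range(21)]

# Assumed bandwidth: Silverman's rule of thumb, h = 1.06 * sd * n^(-1/5).
mean = sum(abilities) / len(abilities)
sd = math.sqrt(sum((t - mean) ** 2 for t in abilities) / (len(abilities) - 1))
h = 1.06 * sd * len(abilities) ** (-0.2)

kde_weights = to_weights(gaussian_kde(abilities, grid, h))
hist_weights = to_weights(histogram_density(abilities, grid))

for x, wk, wh in zip(grid, kde_weights, hist_weights):
    print(f"{x:5.1f}  kde={wk:.4f}  hist={wh:.4f}")
```

With a Gaussian kernel every grid point receives a strictly positive weight, whereas the histogram assigns zero weight to any bin that happens to contain no examinees. That is one way a histogram estimate can behave poorly on small samples.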
This study proposes a modified version of MMLE/EM (Bock & Aitkin, 1981). A simulation study shows that BILOG-MG (MMLE with EAP) performs poorly when the incidental parameters are not normally distributed. The proposed algorithm makes two modifications. First, kernel density estimation is applied in the E-step to estimate the distribution of the incidental parameters. Second, kernel density estimation is applied in the M-step to estimate the structural and incidental parameters with EAP. The ability and item parameters are then estimated iteratively with this method.

A simulation experiment based on the three-parameter logistic model is conducted to compare the performance of BILOG-MG and the proposed algorithm. Three types of incidental-parameter distributions are considered: normal, bimodal, and skewed. Three values of the weight given to the kernel method are tried. Root mean square error (RMSE) is used to evaluate both methods. The results show that, under most conditions, the proposed algorithm yields smaller RMSEs than BILOG-MG for both ability and item parameters.
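As a rough illustration of the kind of evaluation described above, the sketch below simulates responses under a three-parameter logistic model, computes EAP ability estimates on a quadrature grid, and scores them by RMSE. It is not the thesis's experimental code. The scaling constant D = 1.7, the item parameter ranges, the sample sizes, the bimodal ability population, and the standard normal prior are all assumptions chosen for the example, and the item parameters are treated as known rather than estimated.

```python
import math
import random

def p_3pl(theta, a, b, c, D=1.7):
    """3PL probability of a correct response (D = 1.7 is an assumed scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def simulate_responses(thetas, items, rng):
    """Draw 0/1 responses; `items` is a list of (a, b, c) tuples."""
    return [
        [1 if rng.random() < p_3pl(t, a, b, c) else 0 for (a, b, c) in items]
        for t in thetas
    ]

def eap(responses, items, grid, weights):
    """Expected a posteriori ability: posterior mean over the quadrature grid."""
    posterior = []
    for x, w in zip(grid, weights):
        likelihood = 1.0
        for u, (a, b, c) in zip(responses, items):
            p = p_3pl(x, a, b, c)
            likelihood *= p if u == 1 else 1.0 - p
        posterior.append(likelihood * w)
    total = sum(posterior)
    return sum(x * p for x, p in zip(grid, posterior)) / total

def rmse(estimates, truths):
    """Root mean square error between estimates and true values."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths)
    )

rng = random.Random(1)

# Hypothetical 30-item test with known item parameters (a, b, c).
items = [(rng.uniform(0.8, 2.0), rng.uniform(-2.0, 2.0), 0.2) for _ in range(30)]

# Hypothetical bimodal ability population of 100 examinees.
true_thetas = [
    rng.gauss(-1.0, 0.6) if rng.random() < 0.5 else rng.gauss(1.0, 0.6)
    for _ in range(100)
]
data = simulate_responses(true_thetas, items, rng)

# Standard normal prior weights on an assumed 33-point grid over [-4, 4].
grid = [-4.0 + 0.25 * k for k in range(33)]
prior = [math.exp(-0.5 * x * x) for x in grid]
prior_total = sum(prior)
prior = [p / prior_total for p in prior]

estimates = [eap(row, items, grid, prior) for row in data]
print("RMSE of EAP ability estimates:", round(rmse(estimates, true_thetas), 3))
```

Swapping `prior` for weights obtained from a kernel density estimate would give a simple way to compare the two choices of ability distribution on the same simulated data.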
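Chapter 1 Introduction
Section 1 Research Motivation
Section 2 Research Purpose
Section 3 Research Questions
Chapter 2 Literature Review
Section 1 Joint Maximum Likelihood Estimation
Section 2 Marginal Maximum Likelihood Estimation
Section 3 Bayesian Estimation
Section 4 Parameter Estimation Methods in BILOG-MG
Chapter 3 Methodology
Section 1 Estimation Drawbacks of MMLE/EM
Section 2 Kernel Smoothing
Section 3 Bayesian Estimation Based on Kernel Smoothing
Section 4 Research Design
Chapter 4 Results
Section 1 Results of Experiment 1
Section 2 Results of Experiment 2
Section 3 Comparison of Experimental Results
Chapter 5 Conclusions and Recommendations
Section 1 Conclusions
Section 2 Limitations and Recommendations
References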
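Chinese-language references
王暄博 (2006). A comparison of horizontal and vertical equating effects under BIB and NEAT designs. Unpublished master's thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education.
王雅苓 (1999). Application and analysis of kernel smoothing in IRT true-score equating. Master's thesis, Graduate Institute of Mathematics, National Changhua University of Education.
吳慧泯 (2001). A study of option characteristic curves: An estimation approach based on kernel function smoothing. Master's thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung Teachers College.
陳煥文 (2004). A study of linking properties in vertical equating: A comparison of four linking methods. National Science Council research project.
黃志傑 (2004). The effect of anchor item distribution on test equating. Unpublished master's thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung Teachers College.
黃美芳 (2006). An investigation of equating effects under the three-parameter item response theory model. Unpublished master's thesis, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education.
楊孟麗, 譚康榮, & 黃敏雄 (2003). Psychometric report: TEPS 2001 analytical ability test. Taiwan Education Panel Survey.
趙素珍 (1998). A comparison of the estimation accuracy of IRT software. Unpublished master's thesis, Graduate Institute of Elementary Education, National Taichung Teachers College.
劉湘川 (2001a). A correlation-weighted kernel smoothing method for nonparametric estimation of item option characteristic curves and its IORS integrated model. Fifth Conference on Psychological and Educational Testing in Chinese Societies, C5.1, pp. 1-10. Taipei: Chinese Association of Psychological Testing; National Taiwan Normal University.
劉湘川 (2001b). An integrated extended model of kernel-smoothed item option characteristic curves and option association structure. 測驗統計年刊 (Annual Journal of Measurement and Statistics), 9, 1-18. Taichung: National Taichung Teachers College.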


English-language references
Ban, J. C., Hanson, B. A., Yi, Q., & Harris, D. J. (2001). Data sparseness and online pretest item calibration/scaling methods in CAT. Annual Meeting of the American Educational Research Association.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. Statistical theories of mental test scores. London: Wesley Publishing Company.
Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.
Baker, F. B. (2004). Item response theory: Parameter estimation techniques. New York: Marcel Dekker.
Bowman, A. W., & Azzalini, A. (1997). Applied smoothing techniques for data analysis. Oxford University Press.
Donoghue, J. R., & Isham, S. P. (1996). Comparing the effectiveness of procedures to detect item parameter drift. Educational Testing Service.
DeMars, C. E. (2005). "Guessing" parameter estimates for multidimensional IRT models. American Educational Research Association.
Gao, F., & Chen, L. (2005). Bayesian or non-Bayesian: A comparison study of item parameter estimation in the three-parameter logistic model. Applied Measurement in Education, 18(4), 351-380.
Glas, C. A. W., & Hendrawan, I. (2005). Testing linear models for ability parameters in item response models. Multivariate Behavioral Research, 40(1), 25-51.
Glas, C. A. W., & van der Linden, W. J. (2006). Modeling variability in item parameters in educational measurement. Law School Admission Council Computerized Testing Report 01-07.
Gasser, T., & Müller, H. G. (1979). Kernel estimation of regression functions. In Smoothing techniques for curve estimation. Springer-Verlag.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144-149.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff.
Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26, 3-24.
Jones, D. H., & Nediak, M. (2000). Item parameter calibration of LSAT items using MCMC approximation of Bayes posterior distributions. Law School Admission Council Computerized Testing Report 00-05.
Kolen, M. J., & Brennan, R. L. (1995). Test equating: Methods and practices. New York: Springer-Verlag.
Lindley, D. V. (1971). Bayesian statistics: A review. Philadelphia: Society for Industrial and Applied Mathematics.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord, F. M. (1983). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Mislevy, R. J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177-195.
Mislevy, R. J., & Bock, R. D. (1989). BILOG 3: Item analysis and test scoring with binary logistic models. Mooresville, IN: Scientific Software.
Mislevy, R. J., & Stocking, M. L. (1989). A consumer's guide to LOGIST and BILOG. Applied Psychological Measurement, 13, 57-75.
Mislevy, R. J., & Bock, R. D. (1982). Implementation of the EM algorithm in the estimation of item parameters: The BILOG computer program. Item Response Theory and Computerized Adaptive Testing Conference Proceedings.
Muraki, E., & Bock, R. D. (1996). PARSCALE: IRT based test scoring and item analysis for graded open-ended exercises and performance tasks. Chicago: Scientific Software.
Nadaraya, E. A. (1964). On estimating regression. Theory of Probability and Its Applications, 10, 186-190.
Priestley, M. B., & Chao, M. T. (1972). Non-parametric function fitting. Journal of the Royal Statistical Society, Series B, 34, 385-392.
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611-630.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-211.
Silverman, B. W. (1986). Density estimation for statistics and data analysis. London: Chapman & Hall.
Swaminathan, H., & Gifford, J. A. (1982). Bayesian estimation in the Rasch model. Journal of Educational Statistics, 7, 175-191.
Thissen, D. (1991). MULTILOG user's guide: Multiple, categorical item analysis and test scoring using item response theory (Version 6.0). Chicago: Scientific Software.
Tang, K. L., & Eignor, D. R. (2001). A study of the use of collateral statistical information in attempting to reduce TOEFL IRT item parameter estimation sample sizes. TOEFL Technical Report 17.
Vale, C. D. (1986). Linking item parameters onto a common scale. Applied Psychological Measurement, 10(4), 333-344.
Watson, G. S. (1964). Smooth regression analysis. Sankhya: The Indian Journal of Statistics, Series A, 26, 359-372.
Wood, R. L., Wingersky, M. S., & Lord, F. M. (1976). LOGIST: A computer program for estimating examinee ability and item characteristic curve parameters. Princeton, NJ: Educational Testing Service.
Wingersky, M. S., & Lord, F. M. (1984). An investigation of methods for reducing sampling error in certain IRT procedures. Applied Psychological Measurement, 8, 347-364.
Wolfgang, H., & Marlene, M. (2004). Nonparametric and semiparametric models. Heidelberg, New York.
Yamamoto, K. (1995). Estimating the effects of test length and test time on parameter estimation using the HYBRID model. TOEFL Technical Report 10.
Yao, L., Patz, R. J., & Hanson, B. A. (2002). More efficient Markov chain Monte Carlo estimation in IRT using marginal posteriors. National Council on Measurement in Education.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (1996). BILOG-MG. Scientific Software International.