References
Chinese-language sources
TASA臺灣學生學習成就評量資料庫(2004)。臺北縣:國家教育研究院籌備處。
王暄博(2006)。BIB與NEAT設計之水平及垂直等化效果比較。國立臺中教育大學
教育測驗統計研究所碩士論文。
余民寧(1992a)。試題反應理論的介紹(一)基本概念與假設。研習資訊,9(1),頁5-9。
余民寧(1992b)。試題反應理論的介紹(三)試題反應模式及其特性。研習資訊,9(2),頁6-10。
許思雯(2008)。題組測驗在三種IRT計分模式能力估計精確性之比較。國立台南大學測驗統計研究所碩士論文。
曾玉琳、王暄博、郭伯臣、許天維(2006)。不同BIB設計對測驗等化的影響。測驗統計年刊,第十三輯下期,頁209-229。台中市:國立台中教育大學。
彭森明(2003,7月)。如何建置全國性教育資料庫,使其發揮最大價值與功能。文教新潮,8(3),頁37-44。
楊孟麗、譚康榮、黃敏雄(2003)。心理計量報告:TEPS 2001 分析能力測驗。2009年11月10日,取自http://www.teps.sinica.edu.tw/TestingReport2004-2-10.htm
趙素珍(1997)。BILOG-MG之簡介。測驗統計簡訊雙月刊,18,頁33-54。
趙素珍(1998)。IRT軟體估計精準度之比較。國立台中師範學院國民教育研究所碩士論文。
顏秀聿(2009)。題組測驗等化效果於不同等化設計之比較。國立臺中教育大學
教育測驗統計研究所碩士論文。
English-language sources
Allen, N. L., Donoghue, J. R., & Schoeps, T. L. (2001). The NAEP 1998 technical report. Washington, DC: National Center for Education Statistics.
Allen, S., & Sudweeks, R. R. (2001). Identifying and managing local item dependence in context-dependent item sets. Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.
Baker, F. B. (1992). Item Response Theory: Parameter Estimation Techniques. New York: Marcel Dekker, Inc.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.
Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64(2), 153-168.
Cureton, E. E. (1965). Reliability and validity: Basic assumptions and experimental designs. Educational and Psychological Measurement, 25, 326-346.
Ebel, R. L. (1951). Writing the test item. In E. F. Lindquist (Ed.), Educational Measurement (pp. 185-249). Washington, DC: American Council on Education.
Haladyna, T. M. (1992). Context-dependent item sets. Educational Measurement: Issues and Practice, 11(4), 21-25.
Hambleton, R. K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston, MA: Kluwer-Nijhoff.
Hambleton, R. K., Zaal, J. N., & Pieters, J. P. M. (1991). Computerized Adaptive Testing: Theory, Applications, and Standards. In R. K. Hambleton & J. N. Zaal (Eds.), Advances in Educational and Psychological Testing. Boston: Kluwer Academic Publishers.
Lee, G., Brennan, R. L., & Frisbie, D. A. (2000). Incorporating the testlet concept in test score analyses. Educational Measurement: Issues and Practice, 19(4), 9-15.
Mislevy, R. J., & Bock, R. D. (1990). BILOG-3 (2nd ed.): Item analysis and test scoring with binary logistic models. Mooresville: Scientific Software.
Nemhauser, G. L., & Wolsey, L. A. (1999). Integer and Combinatorial Optimization.
New York: John Wiley.
van der Linden, W. J., Veldkamp, B. P., & Carlson, J. E. (2004). Optimizing Balanced Incomplete Block Designs for Educational Assessments. Applied Psychological Measurement, 28, 317-331.
Wainer, H., & Lewis, C. (1990). Toward a psychometrics for testlets. Journal of Educational Measurement, 27(1), 1-14.
Wainer, H., & Lukhele, R. (1997). How reliable are TOEFL scores? Educational and
Psychological Measurement, 57, 749-766.
Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201.
Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test
scores? What is the effect of local dependence on reliability? Educational
Measurement: Issues and Practice, 15(1), 22-29.
Wainer, H., & Wang, X. (2000). Using a new statistical model for testlets to score
TOEFL. Journal of Educational Measurement, 37(3), 203-220.
Wainer, H., Bradlow, E. T., & Du, Z. (2000). Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and Practice (pp. 245-269). Dordrecht, Netherlands: Kluwer.
Wainer, H., Sireci, S. G., & Thissen, D. (1991). Differential testlet functioning: Definitions and detection. Journal of Educational Measurement, 28, 197-219.
Wang, W.-C., & Wilson, M. (2005). Exploring local item dependence using a
random-effects facet model. Applied Psychological Measurement, 29, 296-318.
Wang, X., Bradlow, E. T., & Wainer, H. (2005). A user’s guide for SCORIGHT (version 3.0): A computer program for scoring tests built of testlets including a module for covariate analysis (ETS Technical Report RR-04-49). Princeton, NJ: Educational Testing Service.
Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. New York: Cambridge University Press.
Weiss, D. J., & Yoes, M. E. (1991). Item response theory. In R. K. Hambleton & J. N. Zaal (Eds.), Advances in educational and psychological testing. Boston: Kluwer Academic Publishers.
Yen, W. M. (1993). Scaling performance assessment: Strategies for managing local item dependence. Journal of Educational Measurement, 30(3), 187-213.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (2003). BILOG-MG for Windows (version 3). Chicago, IL: Scientific Software International, Inc.