Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69-81.
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
Baker, F. B. (1992). Equating tests under the graded response model. Applied Psychological Measurement, 16, 87-96.
Baker, F. B. (1993). EQUATE 2.0: A computer program for the characteristic curve method of IRT equating. Applied Psychological Measurement, 17, 20.
Baker, F. B. (1997). Empirical sampling distributions of equating coefficients for graded and nominal response instruments. Applied Psychological Measurement, 21, 157-172.
Baker, F. B., & Al-Karni, A. (1991). A comparison of two procedures for computing IRT equating coefficients. Journal of Educational Measurement, 28, 147-162.
Cohen, A. S., & Kim, S. H. (1998). An investigation of linking methods under the graded response model. Applied Psychological Measurement, 22(2), 116-130.
Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3-24.
Kim, S. H., & Cohen, A. S. (1995). A minimum χ² method for equating tests under the graded response model. Applied Psychological Measurement, 19, 167-176.
Kim, S. H., & Cohen, A. S. (1998). A comparison of linking and concurrent calibration under item response theory. Applied Psychological Measurement, 22(2), 131-143.
Kim, S. H., & Cohen, A. S. (2002). A comparison of linking and concurrent calibration under the graded response model. Applied Psychological Measurement, 26(1), 25-40.
Kolen, M. J., & Brennan, R. L. (1995). Test equating: Methods and practices. New York: Springer-Verlag.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York: Springer-Verlag.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.
Loyd, B. H., & Hoover, H. D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 169-194.
Linn, R. L., Levine, M. V., Hastings, C. N., & Wardrop, J. L. (1981). Item bias in a test of reading comprehension. Applied Psychological Measurement, 5, 159-173.
Marco, G. L. (1977). Item characteristic curve solutions to three intractable testing problems. Journal of Educational Measurement, 14, 139-160.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
Reise, S. P., & Yu, J. (1990). Parameter recovery in the graded response model using MULTILOG. Journal of Educational Measurement, 27, 133-144.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201-210.
Samejima, F. (1969). Estimation of a latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 17.
Samejima, F. (1972). A general model for free response data. Psychometrika Monograph Supplement, 18.
Thissen, D. (1991). MULTILOG user’s guide: Multiple, categorical item analysis and test scoring using item response theory [Computer program]. Chicago: Scientific Software International.
Vale, C. D. (1986). Linking item parameters onto a common scale. Applied Psychological Measurement, 10, 333-344.