Brown, Anna; Croudace, Tim J (2015)
Publisher: Taylor & Francis (Routledge)
Languages: English
Types: Part of book or chapter of book
Subjects: BF
The ultimate goal of measurement is to produce a score by which individuals can be assessed and differentiated. Item response theory (IRT) modeling views responses to test items as indicators of a respondent's standing on some underlying psychological attributes, often called latent traits (van der Linden & Hambleton, 1997), and devises special algorithms for estimating this standing. This chapter gives an overview of methods for estimating person attribute scores using one-dimensional and multi-dimensional IRT models, focusing on those that are particularly useful with patient-reported outcome (PRO) measures.

To be useful in applications, a test score has to approximate the latent trait well and, importantly, its precision must be known if it is to inform decision-making. Unlike classical test theory (CTT), which assumes that a test measures with the same precision at all trait levels, IRT methods assess the precision with which a test measures at different trait levels. In the context of patient-reported outcomes measurement, this enables assessment of the measurement precision for an individual patient. Knowing the error bands around a patient's score is important for informing clinical judgments, such as deciding whether a change, for instance in response to treatment, is significant (Reise & Haviland, 2005). At the same time, summary indices are often needed to describe the overall precision of measurement in a research sample, a population group, or the population as a whole. Much of this chapter is devoted to methods for estimating measurement precision, including the score-dependent standard error of measurement and appropriate sample-level or population-level marginal reliability coefficients.

Patient-reported outcome measures often capture several related constructs, a feature that may make the use of multi-dimensional IRT models appropriate and beneficial (Gibbons, Immekus & Bock, 2007). Several such models are described, including a model with multiple correlated constructs, a model where multiple constructs are underlain by a general common factor (second-order model), and a model where each item is influenced by one general and one group factor (bifactor model). To make these models more accessible to applied researchers, we provide specialized formulae for computing test information, standard errors and reliability. We show how to translate a multitude of numbers and graphs conditioned on several dimensions into easy-to-use indices that can be understood by applied researchers and test users alike. All described methods and techniques are illustrated with a single data analysis example involving a popular PRO measure, the 28-item version of the General Health Questionnaire (GHQ28; Goldberg & Williams, 1988), completed in mid-life by a large community sample as part of a major UK cohort study.
  • References

    • Ackerman, T.A. (2005). Multidimensional item response theory modeling. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Contemporary Psychometrics (pp. 3-26). Mahwah, NJ: Lawrence Erlbaum.
    • Bock, R.D. (1975). Multivariate statistical methods in behavioral research. New York: McGraw-Hill.
    • Bock, R.D., Gibbons, R., Schilling, S.G., Muraki, E., Wilson, D.T., & Wood, R. (2003). TESTFACT 4.0 user's guide. Chicago, IL: Scientific Software International.
    • Gibbons, R.D., Bock, R.D., Hedeker, D., Weiss, D.J., Segawa, E., Bhaumik, D.K., Kupfer, D.J., Frank, E., Grochocinski, V.J. & Stover, A. (2007). Full-Information Item Bifactor Analysis of Graded Response Data. Applied Psychological Measurement, 31, 4–19.
    • Gibbons, R.D., Immekus, J.C. & Bock, R.D. (2007). The Added Value of Multidimensional IRT Models. Didactic workbook. Retrieved on 1 June 2011 from http://outcomes.cancer.gov/areas/measurement/multidimensional_irt_models.pdf
    • Brown, A. & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460-502.
    • Croudace, T.J., Evans, J., Harrison, G., Sharp, D.J., Wilkinson, E., McCann, G., Spence, M., Crilly, C. & Brindle, L. (2003). Impact of the ICD-10 Primary Health Care (PHC) diagnostic and management guidelines for mental disorders on detection and outcome in primary care: cluster randomised controlled trial. British Journal of Psychiatry, 182, 20-30.
    • Dodd, B.G., De Ayala, R.J. & Koch, W.R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19, 5-22.
    • Du Toit, M. (Ed.). (2003). IRT from SSI. Chicago: Scientific Software International.
    • Embretson, S. E. & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum Publishers.
    • Fayers, P.M. & Machin, D. (2007). Quality of Life: The assessment, analysis and interpretation of patient-reported outcomes. Second edition. John Wiley & Sons.
    • Fisher, R.A. (1921). On the mathematical foundations of theoretical statistics. Philosophical transactions A, 222, 309-368.
    • Green, B.F., Bock, R., Humphreys, L.G., Linn, R.L. & Reckase, M.D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21, 347-360.
    • Goldberg, D. P. (1972). The detection of psychiatric illness by questionnaire. Oxford University Press: London.
    • Goldberg, D. P., Gater, R., Sartorius, N., Ustun, T. B., Piccinelli, M., Gureje, O., & Rutter, C. (1997). The validity of two versions of the GHQ in the WHO study of mental illness in general health care. Psychological Medicine, 27, 191-197.
    • Goldberg, D. P., & Hillier, V. F. (1979). A scaled version of the General Health Questionnaire. Psychological Medicine, 9, 139-145.
    • Goldberg, D.P. & Williams, P. (1988). A user's guide to the general health questionnaire. NFER Nelson: Windsor.
    • McDonald, R.P. (1999). Test theory. A unified approach. Mahwah, NJ: Lawrence Erlbaum.
    • McDonald, R. P. (2011). Measuring Latent Quantities. Psychometrika, 76 (4), 511-536.
    • Masters, G.N. & Wright, B.D. (1997). The partial credit model. In W.J. van der Linden & R. Hambleton (Eds.), Handbook of modern item response theory. New York: Springer-Verlag.
    • Muthén, B. O. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen, & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage.
    • Muthén, B., du Toit, S.H.C. & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Unpublished manuscript. College of Education, UCLA. Los Angeles, CA.
    • Muthén, L.K. & Muthén, B.O. (1998-2010). Mplus User's guide. Sixth edition. Los Angeles, CA: Muthén & Muthén.
    • Reckase, M. (2009). Multidimensional Item Response Theory. New York, NY: Springer.
    • Reise, S. & Haviland, M. (2005). Item response theory and the measurement of clinical change. Journal of Personality Assessment, 84(3), 228-238.
    • Rindskopf, D. & Rose, T. (1988). Some theory and applications of confirmatory second-order factor analysis. Multivariate Behavioral Research, 23, 51–67.
    • Samejima, F. (1969). Estimation of Latent Ability Using a Response Pattern of Graded Scores (Psychometric Monograph No. 17). Richmond, VA: Psychometric Society. Retrieved from http://www.psychometrika.org/journal/online/MN17.pdf
    • Thissen, D. & Orlando, M. (2001). Item response theory for items scored in two categories. In D. Thissen & H. Wainer (Eds.), Test Scoring. Mahwah, NJ: Lawrence Erlbaum.
    • Van der Linden, W.J. & Hambleton, R. (1997). Item Response Theory: brief history, common models and extensions. In W.J. van der Linden & R. Hambleton (Eds.), Handbook of modern item response theory. New York: Springer-Verlag.
    • Wadsworth, M.E., Butterworth, S.L., Hardy, R.J., Kuh, D.J., Richards, M., Langenberg, C., Hilder, W.S. & Connor, M. (2003). The life course prospective design: an example of benefits and problems associated with study longevity. Social Science and Medicine, 57, 2193-2205.
Funded by projects

  • WT | Genetic investigation of lif...
