Number of the records: 1
Assessing Inter-rater Reliability With Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables
- 1.0568726 - ÚI 2024 RIV US eng J - Journal Article
Martinková, Patrícia - Bartoš, František - Brabec, Marek
Assessing Inter-rater Reliability With Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables.
Journal of Educational and Behavioral Statistics. Vol. 48, No. 3 (2023), pp. 349-383. ISSN 1076-9986. E-ISSN 1935-1054
R&D Projects: GA ČR(CZ) GA21-03658S
Grant - others: GA MŠk(CZ) LM2015042
Institutional support: RVO:67985807
Keywords : Bayesian inference * inter-rater reliability * mixed-effect models * heterogeneous variance components * grant peer review
OECD category: Statistics and probability
Impact factor: 1.9, year: 2023 ; AIS: 1.688, year: 2023
Method of publishing: Limited access
Result website:
https://dx.doi.org/10.3102/10769986221150517
DOI: https://doi.org/10.3102/10769986221150517
Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater’s or ratee’s gender, major, or experience. Identification of such heterogeneity sources in IRR is important for the implementation of policies with the potential to decrease measurement error and to increase IRR by focusing on the most relevant subgroups. In this study, we propose a flexible approach for assessing IRR in cases of heterogeneity due to covariates by directly modeling differences in variance components. We use Bayes factors (BFs) to select the best performing model, and we suggest using Bayesian model averaging as an alternative approach for obtaining IRR and variance component estimates, allowing us to account for model uncertainty. We use inclusion BFs considering the whole model space to provide evidence for or against differences in variance components due to covariates. The proposed method is compared with other Bayesian and frequentist approaches in a simulation study, and we demonstrate its superiority in some situations. Finally, we provide real data examples from grant proposal peer review, demonstrating the usefulness of this method and its flexibility in the generalization of more complex designs.
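As a rough illustration of the quantities named in the abstract, the following minimal sketch computes a single-rater IRR from two variance components, posterior model probabilities from log marginal likelihoods, and an inclusion Bayes factor. It is not the authors' implementation: the function names, the two-model space, and all numeric values are hypothetical, and the model-averaged IRR is simplified here to a weighted mean of point estimates rather than an average over full posterior distributions.

```python
import math

def irr(var_ratee, var_resid):
    # Single-rater IRR as the share of total variance due to ratees
    # (assumes a simple model with only these two variance components).
    return var_ratee / (var_ratee + var_resid)

def posterior_model_probs(log_marglik, prior):
    # Posterior model probabilities from log marginal likelihoods and priors;
    # subtracting the maximum keeps the exponentials numerically stable.
    m = max(log_marglik)
    weights = [p * math.exp(l - m) for l, p in zip(log_marglik, prior)]
    total = sum(weights)
    return [w / total for w in weights]

def inclusion_bf(post, prior, included):
    # Inclusion BF: posterior inclusion odds divided by prior inclusion odds,
    # summing over all models that contain the effect of interest.
    post_in = sum(p for p, inc in zip(post, included) if inc)
    prior_in = sum(p for p, inc in zip(prior, included) if inc)
    return (post_in / (1 - post_in)) / (prior_in / (1 - prior_in))

# Hypothetical model space: M0 (homogeneous variances) and
# M1 (ratee variance differs by a covariate). All numbers are made up.
log_marglik = [-100.0, -98.0]
prior = [0.5, 0.5]
included = [False, True]      # does the model contain the covariate effect?
irr_per_model = [irr(0.6, 0.4), irr(0.5, 0.5)]

post = posterior_model_probs(log_marglik, prior)
bf_incl = inclusion_bf(post, prior, included)
# Simplified model averaging: weight each model's IRR point estimate
# by its posterior model probability.
irr_averaged = sum(p * r for p, r in zip(post, irr_per_model))

print(f"inclusion BF: {bf_incl:.3f}, model-averaged IRR: {irr_averaged:.3f}")
```

With equal prior probabilities and only two models, the inclusion Bayes factor reduces to the ordinary Bayes factor of M1 against M0; the distinction matters once several models share the same covariate effect.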
Permanent Link: https://hdl.handle.net/11104/0339989