Model Specification in Mixed-Effects Models: A Focus on Random Effects
Abstract
Mixed-effects models are flexible tools for researchers in a myriad of fields, but that flexibility comes at the cost of complexity: if users are not careful in how they specify their model, they could draw faulty inferences from their data. We argue that there is significant confusion around which random effects are appropriate to include in a model given the study design, with researchers generally being better at specifying the fixed effects of a model, which map onto their research hypotheses. To that end, we present an instructive framework for evaluating the random effects of a model in three different situations: (1) longitudinal designs; (2) factorial repeated measures; and (3) designs with multiple sources of variance. We provide worked examples with open-access code and data in an online repository. We think this framework will be helpful for students and researchers who are new to mixed-effects models, and for reviewers who may have to evaluate a novel model as part of their review.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their manuscripts, and all Open Access articles are distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided that the original work is properly cited.