As in many other research areas, mixed models have become widely applied in sports science and related health issues. Within sports science, different branches account for the interests of different stakeholders; general managers, coaches, teams, supporters, scientists, academics. Human performance and medicine (treatment and prevention of injuries) lie behind them all and these models are a way to account for within subject variability, time-dependent covariates, and nested data.
On the competition side, efforts have been made in the literature to find ways to model player performance. Casals and Martínez (2013) approach this problem in the context of the NBA basketball league results by considering a balanced design where player is the random intercept and the forward stepwise approach to model selection by Pinheiro and Bates (2000) has been adopted to determine additional factors to be included in the final model. The great advantage of their proposed models for points and win scores over previous studies is that they account for the variation in players´ characteristics and therefore can predict potential variation in player performance using Linear Mixed Models (LMM) and Generalized Linear Mixed models (GLMM), and are consequently of great help for coaches, managers and other decision makers.
A similar methodology has been followed to predict scoring ability in the English football Premier League by McHale and Szczepański (2014). You may recall the post by Hèctor Perpiñán on calculating results probabilities via simulation. While in Perpiñán´s study only teams´ results were considered for the calculations, McHale and Szczepański´s mixed modelling approach allows for the inclusion of players´ ability to create and convert shots as a disaggregated factor from chance. The accuracy in the prediction of their models in this paper shows again the relevance of models that allow the inclusion of players´ characteristics (rather than just teams´).
Of particular note to our readers is the trend towards the implementation of mixed models in open source software exemplified in the aforementioned papers which use R (R Core Team, 2012) for their modelling, in particular packages nlme and lme4.
In community exercises for promoting physical activity like the one described in Okely et al. (2011), one research aim has been to determine policies and interventions that help to encourage exercising during adolescence in order to alleviate risk factors in the long run. This particular school-based randomised controlled trial used mixed models to account for the hierarchical structure of the data (32 schools from four geographical regions). According to the authors, one of the greatest advantages of the methodology is that it “incorporates all available data allowing for the analysis of partial datasets created when a participant drops out of the study or misses a study visit.”
Moving on to health applications, injury prediction, for instance in baseball pitchers and athletes, can be modelled by deploying similar approaches to determine the existence of statistically significant differences between groups and within days post-injury for example.
Finally, in veterinary epidemiology mixed modelling has become equally mainstream, as discussed in a recent article by Stryhn and Christensen (2014). As an example of an application, animal risk factors associated with health disorders in sport can also be modelled via these techniques. Visser et al. (2013) have studied the effect in race horses in the Netherlands via a cross-sectional study and have considered a random farm effect. Commercial software has been used in these two previous examples. – SAS (PROC MIXED) and GenStat (GLMM)- which are again of common use in the application of mixed models.
As stated by Casals and Martínez (2013), phenomena like Moneyball have raised the profile of sports data analysis. For researchers, big data and more widely, the intrinsic complex structure of the data coming from the aforementioned fields –and I would add software availability- have stimulated application of these types of models and they seem to be here to stay…
These are some of the examples that we have found but we will sure be missing some other very interesting ones so please let us know…Are you a researcher in the area and would like to tell us about your experience? Have you used this sort of model in this or other areas and are you willing to share your experience?
Casals, M., & Martinez, J.A. (2013). Modelling player performance in basketball through mixed models Journal of Performance Analysis in Sport, 13 (1), 64-82
McHale, I., & Szczepański, L. (2014). A mixed effects model for identifying goal scoring ability of footballers Journal of the Royal Statistical Society: Series A (Statistics in Society), 177 (2), 397-417 DOI: 10.1111/rssa.12015
Okely, A., Cotton, W., Lubans, D., Morgan, P., Puglisi, L., Miller, J., Wright, J., Batterham, M., Peralta, L., & Perry, J. (2011). A school-based intervention to promote physical activity among adolescent girls: Rationale, design, and baseline data from the Girls in Sport group randomised controlled trial BMC Public Health, 11 (1) DOI: 10.1186/1471-2458-11-658
Visser, E., Neijenhuis, F., de Graaf-Roelfsema, E., Wesselink, H., de Boer, J., van Wijhe-Kiezebrink, M., Engel, B., & van Reenen, C. (2013). Risk factors associated with health disorders in sport and leisure horses in the Netherlands Journal of Animal Science, 92 (2), 844-855 DOI: 10.2527/jas.2013-6692