Featured

# Statistical or Clinical Significance… that is the question!

Most of the time, results from a research project, particularly in the health sciences, rely on statistical significance to show differences or associations among groups in the variables of interest. The null hypothesis is set up as no difference between groups and the alternative as just the opposite (i.e., there is a relationship between the analyzed factors); once the required statistical method has been applied, a p-value is obtained. This p-value indicates the strength of the evidence against the null hypothesis and is compared with a pre-established significance threshold (alpha, the Type I error rate). If the p-value is lower than alpha, the result is declared statistically significant; otherwise, it is not.

In my experience, and that of other biostatisticians working in the medical area, most physicians are only interested in the statistical significance of their main objectives: they only want to know whether the p-value is below alpha. But the p-value, as noted in the previous paragraph, gives limited information (essentially, significance versus no significance) and does not show how important the result of the statistical analysis is. Besides significance, confidence intervals (CI) and measures of effect size (i.e., the magnitude of the change) should also be included in the research findings, as they provide more information about the magnitude of the relationship between the studied variables (e.g., changes after an intervention, differences between groups, …). For instance, a CI provides the range of values within which the true value of the studied parameter is likely to lie.
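
As a rough illustration of what a CI adds to a bare p-value, the sketch below computes an approximate 95% CI for a difference in means. The data are made up for the example, the function name is hypothetical, and the normal quantile is a simplification: for samples this small a t quantile would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(group1, group2, confidence=0.95):
    """Approximate CI for mean(group1) - mean(group2).

    Uses the unpooled standard error and a normal quantile
    (a large-sample approximation, assumed here for simplicity).
    """
    diff = mean(group1) - mean(group2)
    # statistics.variance is the sample variance (n - 1 denominator)
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical outcome scores for two groups (illustrative values only)
treated = [12, 15, 9, 14, 11, 13, 10, 16]
control = [8, 7, 10, 6, 9, 11, 5, 8]

diff, lower, upper = mean_difference_ci(treated, control)
print(f"difference = {diff:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Unlike a p-value alone, the interval shows both the direction and the plausible size of the difference; here it excludes zero, which agrees with a significant result at alpha = 0.05.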

In clinical research it is important not only to assess the significance of the differences between the evaluated groups but also, if possible, to measure how meaningful the outcome is (for instance, to evaluate the effectiveness and efficacy of an intervention). Statistical significance does not provide information about the effect size or the clinical relevance. Because of that, researchers often misinterpret statistical significance as clinical significance. On the one hand, a study with a large sample size may yield a statistically significant result with a small effect size, and outcomes with small p-values are often misunderstood as having large effect sizes. On the other hand, a difference that is not statistically significant may still correspond to a large effect size when a small sample does not have enough power to reveal that effect.
Some methods to determine clinical relevance have been developed: Cohen’s effect size, the minimal important difference (MID) and so on. In this post I will show how to calculate Cohen’s effect size (ES) [1], which is the easiest one.
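
Both misinterpretations described above can be reproduced with a few lines of code. The sketch below is a simplification: it assumes two groups of equal size, a known standard deviation and a two-sided z-test, and the function name is hypothetical. It shows how the p-value for a fixed standardized difference changes with the sample size.

```python
from math import erfc, sqrt


def two_sided_p_value(effect_size, n_per_group):
    """Two-sided z-test p-value for a standardized mean difference.

    Assumes equal group sizes and a known common standard deviation,
    so the test statistic is z = effect_size * sqrt(n / 2).
    """
    z = effect_size * sqrt(n_per_group / 2)
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return erfc(abs(z) / sqrt(2))


# A small standardized difference (0.1) with growing sample sizes
for n in (20, 200, 2000, 20000):
    print(f"effect 0.1, n = {n:>5} per group: p = {two_sided_p_value(0.1, n):.4g}")

# A large standardized difference (0.8) in a very small study
print(f"effect 0.8, n = 10 per group: p = {two_sided_p_value(0.8, 10):.4g}")
```

Under these assumptions the small difference of 0.1 is not significant at alpha = 0.05 with 200 subjects per group but becomes significant with 2000, while the large difference of 0.8 misses significance with only 10 subjects per group.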

ES provides information regarding the magnitude of the association between variables as well as the size of the difference of the groups. To compute ES, two mean scores (one from each group) and the pooled standard deviation of the groups are needed. The mathematical expression is the following:

$ES = \frac{\overline{X}_{G1}-\overline{X}_{G2}}{SD_{pooled}}$

where $\overline{X}_{G1}$ = mean of group G1; $\overline{X}_{G2}$ = mean of group G2; and $SD_{pooled}$ is the pooled standard deviation, which is given by the following formula:

$SD_{pooled} = \sqrt{\frac{(n_{1}-1)s^2_{1}+(n_{2}-1)s^2_{2}}{n_{1}+n_{2}-2}}$

where $n_{1}$ = sample size of G1; $n_{2}$ = sample size of G2; $s_{1}$ = the sample standard deviation of G1; and $s_{2}$ = the sample standard deviation of G2.
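
A minimal sketch of this calculation is shown below. The function names are hypothetical, and the pooled standard deviation is written in terms of the sum of squared deviations about each group mean, divided by $n_{1}+n_{2}-2$.

```python
from math import sqrt
from statistics import mean


def sum_of_squares(values):
    """Sum of squared deviations about the group mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


def pooled_sd(group1, group2):
    """Pooled standard deviation of two independent groups."""
    n1, n2 = len(group1), len(group2)
    # Each sum of squares equals (n - 1) times the sample variance
    return sqrt((sum_of_squares(group1) + sum_of_squares(group2)) / (n1 + n2 - 2))


def cohen_es(group1, group2):
    """Cohen's effect size: difference in means over the pooled SD."""
    return (mean(group1) - mean(group2)) / pooled_sd(group1, group2)


# Toy data chosen so the result is easy to check by hand:
# means 4 and 3, each sum of squares 8, pooled SD = sqrt(16 / 4) = 2
print(cohen_es([2, 4, 6], [1, 3, 5]))  # 0.5
```

The sign of the result only reflects which group is taken as G1; swapping the groups gives the same magnitude with the opposite sign.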

But how can it be interpreted? Firstly, it can be understood as an index of clinical relevance: the larger the effect size, the larger the difference between groups and the larger the clinical relevance of the results. As it is a quantitative value, ES can be described as small, medium or large using the cut-off values of 0.2, 0.5 and 0.8, respectively.
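
The cut-offs above can be turned into a small helper. This is only a sketch: the function name is hypothetical, and the "negligible" label for values below 0.2 is an assumption of this example, not part of the three categories just described.

```python
def describe_effect_size(es):
    """Label an effect size using the 0.2 / 0.5 / 0.8 cut-offs."""
    magnitude = abs(es)  # the sign only indicates the direction of the difference
    if magnitude < 0.2:
        return "negligible"  # assumed label for values under the first cut-off
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


for value in (0.1, 0.35, -0.6, 1.2):
    print(value, describe_effect_size(value))
```

Such labels are only a convenient shorthand; the numeric effect size itself should still be reported.
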
Clinical relevance is commonly assessed as the result of an intervention. Nevertheless, it can also be extended to other, non-experimental study designs, for instance cross-sectional studies.

To sum up, statistical and clinical significance are not mutually exclusive but complementary when reporting the results of clinical research. Researchers should abandon the exclusive reliance on the p-value. Here you have a starting point for evaluating clinical relevance.

[1] Cohen J. The concepts of power analysis. In: Cohen J, editor. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, New Jersey: Lawrence Erlbaum Associates; 1988. p. 1-17.

# Interview with…

Arantzazu Arrospide

Arantzazu Arrospide Elgarresta studied mathematics at the University of the Basque Country (UPV/EHU) and works as a biostatistician in the Research Unit of Integrated Health Organisations in Gipuzkoa. This research unit supports four regional hospitals (about 100 beds each) and all the public Primary Care Health Services in Gipuzkoa.

Email: arantzazu.arrospideelgarresta@osakidetza.net

and

Irantzu Barrio

Acting teacher at the Department of Applied Mathematics, Statistics and Operational Research of the University of the Basque Country (UPV/EHU)

Email: irantzu.barrio@ehu.es

Both young biostatisticians are currently working on several ongoing research projects. They belong, among other biostatisticians, to the Health Services Research on Chronic Patients Network (REDISSEC), and they tell us what they think about Biostatistics.

1.    Why do you like Biostatistics?

Irantzu Barrio: On the one hand, I like applying statistics to real problems, data sets and experiments. On the other hand, I like developing methodology that can contribute to better results and conclusions in each research project. In addition, I feel lucky to work in multidisciplinary teams. This allows me to learn a lot from other areas and constantly improve in my own, always looking for ways to provide solutions to other researchers’ needs.

Arantzazu Arrospide: I think Biostatistics is the link between mathematics and the real world, giving us the opportunity to feel part of advances in scientific research.

2.    Could you give us some insight into your current field of research?

AA: Our main research line is the application of mathematical modeling to the evaluation of public health interventions, especially economic evaluations. Although Markov chain models are the most common methods for this kind of evaluation, we work with discrete event simulation models, which permit more flexible and complex modeling.

IB: I’m currently working on my PhD thesis. One of the main objectives of this work is to propose and validate a methodology to categorize continuous predictor variables within the clinical prediction model framework. Specifically, we have worked on logistic regression models and survival models.

3.    You have been doing an internship abroad. What was the aim of your stay?

IB: I did an internship in Guimaraes at the University of Minho, Portugal. During my stay, I worked together with Luis Filipe Meira Machado and María Xosé Rodriguez-Alvarez. The aim was to learn more about survival models and extend the methodology developed so far, considering different prediction models.

AA: I did a short stay in the Public Health department of the Erasmus Medical Centre in Rotterdam (Netherlands) last November. The aim of the visit was to discuss the validation of a discrete event simulation model developed to estimate the health effects and costs of the breast cancer screening program in the Basque Country.

4.    What did it allow you to do that had not been possible in Spain?

IB: Oh! It’s amazing when you realize you have all your time to work on your research project, as one single priority for more than two months. Of course, all the other to-dos did not disappear from my calendar; they were only postponed until my return to Bilbao. In addition, it was also a privilege to work together with highly experienced biostatisticians and to have the opportunity to learn a lot from them.

AA: The research group I visited, internationally known as the MISCAN group, is the only European member of the Cancer Intervention and Surveillance Modeling Network (CISNET) created by the National Cancer Institute in the United States. Their main objective is to use modeling to improve the understanding of the impact of cancer control interventions on population trends in incidence and mortality. These models can then project future trends and help determine optimal control strategies. Currently, the evaluation of Spanish screening programs is mainly based on the quality indicators recommended by the European Screening Guidelines, which do not include a comparison with a hypothetical or estimated control group.

5.    What are the most valuable aspects of your internship that you would highlight? What aspects do you believe might be improved?

IB: I would say that my internship was simply perfect. When I came back to Bilbao I just thought time had gone really, really fast. I’m just looking forward to going back again.

AA: This group works for and in collaboration with their institutions. They are mainly responsible for the evaluation of ongoing screening programs and the prospective evaluation of screening strategies, and they lead new randomized trials on this topic. This is the reference group in the Netherlands for cancer screening interventions, and their institutions consider their conclusions when making important decisions.

6.    What do you think of the situation of young biostatisticians in Spain?

AA: When you work in a multidisciplinary research group, both methodological and disease-specific knowledge are essential, and it takes a long time to acquire them. Institutional support is necessary to obtain long-term funds that would ensure future benefits in healthcare research based on rigorous and innovative methods.

IB: I think the situation for young biostatisticians, and for young people in general, is not easy right now. And, at least from what I see around me, there is a lot of work to do.

7.    What would be the 3 main characteristics or skills you would use to describe a good biostatistician? And the main qualities for a good mentor?

AA: Open-minded, perfectionist and enthusiastic. As for the mentor, he/she should be strict, committed and patient.

IB: In my opinion, good skills in statistics, probability and mathematics are needed. But at the same time I think it is important to be able to communicate with other researchers such as clinicians, biologists, etc., especially to understand what their research objectives are and to be able to translate bio-problems into stat-problems.

For me it is very important to have a good rapport with your mentor and confidence in him or her. I think that, having that, everything else is much easier. On the other hand, if I had to highlight some qualities, I would say that a good mentor would: 1) contribute suggestions and ideas, 2) supervise the work done and 3) be a good motivator.

8.    Finally, is there any topic you would like to see covered in the blog?

IB: I think the blog is fantastic; there is nothing I miss in it. I would like to congratulate the whole organizing team, you are doing such a good job!!! Congratulations!!!

AA: Although they are not considered part of statistical science, operational research methods can also be of interest in our research.

Selected publications (6):

Arrospide, A., C. Forne, M. Rue, N. Tora, J. Mar, and M. Bare. “An Assessment of Existing Models for Individualized Breast Cancer Risk Estimation in a Screening Program in Spain.” BMC Cancer 13 (2013).

Barrio, I., Arostegui, I., & Quintana, J. M. (2013). Use of generalised additive models to categorise continuous variables in clinical prediction. BMC Medical Research Methodology, 13(1), 83.

Vidal, S., González, N., Barrio, I., Rivas-Ruiz, F., Baré, M., Blasco, J. A., … & Investigación en Resultados y Servicios Sanitarios (IRYSS) COPD Group. (2013). Predictors of hospital admission in exacerbations of chronic obstructive pulmonary disease. The International Journal of Tuberculosis and Lung Disease, 17(12), 1632-1637.

Quintana, J. M., Esteban, C., Barrio, I., Garcia-Gutierrez, S., Gonzalez, N., Arostegui, I., Vidal, S. (2011). The IRYSS-COPD appropriateness study: objectives, methodology, and description of the prospective cohort. BMC Health Services Research, 11(1), 322.

Mar, J., A. Arrospide, and M. Comas. “Budget Impact Analysis of Thrombolysis for Stroke in Spain: A Discrete Event Simulation Model.” Value Health 13, no. 1 (2010): 69-76.

Rue, M., M. Carles, E. Vilaprinyo, R. Pla, M. Martinez-Alonso, C. Forne, A. Roso, and A. Arrospide. “How to Optimize Population Screening Programs for Breast Cancer Using Mathematical Models.”

# Mixed Models in Sports and Health

As in many other research areas, mixed models have become widely applied in sports science and related health issues. Within sports science, different branches account for the interests of different stakeholders: general managers, coaches, teams, supporters, scientists and academics. Human performance and medicine (treatment and prevention of injuries) lie behind them all, and these models are a way to account for within-subject variability, time-dependent covariates and nested data.

On the competition side, efforts have been made in the literature to find ways to model player performance. Casals and Martínez (2013) approach this problem in the context of NBA basketball results. They consider a balanced design with player as the random intercept, and adopt the forward stepwise approach to model selection of Pinheiro and Bates (2000) to determine which additional factors to include in the final model. The great advantage of their proposed models for points and win scores over previous studies is that, by using Linear Mixed Models (LMM) and Generalized Linear Mixed Models (GLMM), they account for the variation in players' characteristics and can therefore predict potential variation in player performance. They are consequently of great help to coaches, managers and other decision makers.
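
As a rough sketch of what "player as the random intercept" means, a linear mixed model of this kind can be written as follows. The notation is ours and is only illustrative; it is not necessarily the specification used by Casals and Martínez.

$y_{ij} = \beta_{0} + \beta_{1}x_{ij} + u_{j} + \varepsilon_{ij}, \quad u_{j} \sim N(0, \sigma^2_{u}), \quad \varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon})$

where $y_{ij}$ is the performance measure of player $j$ in game $i$; $x_{ij}$ is a covariate with fixed effect $\beta_{1}$; $u_{j}$ is the player-specific deviation from the overall intercept $\beta_{0}$; and $\varepsilon_{ij}$ is the residual error. The variance $\sigma^2_{u}$ captures the variation between players, while $\sigma^2_{\varepsilon}$ captures the variation within a player across games.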

A similar methodology has been followed by McHale and Szczepański (2014) to predict scoring ability in the English Premier League. You may recall the post by Hèctor Perpiñán on calculating result probabilities via simulation. While Perpiñán's study considered only teams' results in the calculations, McHale and Szczepański's mixed modelling approach allows for the inclusion of players' ability to create and convert shots as a factor disaggregated from chance. The predictive accuracy of the models in this paper shows again the relevance of models that allow the inclusion of players' characteristics (rather than just teams').

Of particular note to our readers is the trend towards implementing mixed models in open source software, exemplified by the aforementioned papers, which use R (R Core Team, 2012) for their modelling, in particular the packages nlme and lme4.

In community exercises for promoting physical activity like the one described in Okely et al. (2011), one research aim has been to determine policies and interventions that help to encourage exercising during adolescence in order to alleviate risk factors in the long run. This particular school-based randomised controlled trial used mixed models to account for the hierarchical structure of the data (32 schools from four geographical regions). According to the authors, one of the greatest advantages of the methodology is that it “incorporates all available data allowing for the analysis of partial datasets created when a participant drops out of the study or misses a study visit.”

Moving on to health applications, injury prediction (for instance in baseball pitchers and athletes) can be modelled with similar approaches, for example to determine whether there are statistically significant differences between groups and within days post-injury.

Finally, in veterinary epidemiology mixed modelling has become equally mainstream, as discussed in a recent article by Stryhn and Christensen (2014). As an example of an application, animal risk factors associated with health disorders in sport can also be modelled via these techniques. Visser et al. (2013) studied them in sport and leisure horses in the Netherlands via a cross-sectional study that considered a random farm effect. Commercial software was used in these two previous examples, namely SAS (PROC MIXED) and GenStat (GLMM), which are again commonly used in the application of mixed models.

As stated by Casals and Martínez (2013), phenomena like Moneyball have raised the profile of sports data analysis. For researchers, big data and, more widely, the intrinsically complex structure of the data coming from the aforementioned fields (and, I would add, software availability) have stimulated the application of these types of models, and they seem to be here to stay…

These are some of the examples that we have found, but we are surely missing some other very interesting ones, so please let us know… Are you a researcher in the area and would like to tell us about your experience? Have you used this sort of model in this or other areas and are you willing to share your experience?

Main references:

Casals, M., & Martinez, J.A. (2013). Modelling player performance in basketball through mixed models. Journal of Performance Analysis in Sport, 13 (1), 64-82.

McHale, I., & Szczepański, L. (2014). A mixed effects model for identifying goal scoring ability of footballers. Journal of the Royal Statistical Society: Series A (Statistics in Society), 177 (2), 397-417. DOI: 10.1111/rssa.12015

Okely, A., Cotton, W., Lubans, D., Morgan, P., Puglisi, L., Miller, J., Wright, J., Batterham, M., Peralta, L., & Perry, J. (2011). A school-based intervention to promote physical activity among adolescent girls: Rationale, design, and baseline data from the Girls in Sport group randomised controlled trial. BMC Public Health, 11 (1). DOI: 10.1186/1471-2458-11-658

Visser, E., Neijenhuis, F., de Graaf-Roelfsema, E., Wesselink, H., de Boer, J., van Wijhe-Kiezebrink, M., Engel, B., & van Reenen, C. (2013). Risk factors associated with health disorders in sport and leisure horses in the Netherlands. Journal of Animal Science, 92 (2), 844-855. DOI: 10.2527/jas.2013-6692