# From Descriptive to Repeated Measures Data….one small step for studies, one giant leap for (bio)statistics

Traditional epidemiological descriptive studies, also called cross-sectional  studies,  have been characterized for reporting population health, describing the existing distribution of the collected exposure factors, variables, without relating to other hypotheses.  In other words, they should try to give an answer to three basic “W” questions: who, where and when. Most important uses of this kind of research include health planning and hypothesis generation. Nonetheless, the most important pitfall is that researchers might draw causal inferences when developing this type of studies. Temporal associations between the effects and the outcomes of interest might be unclear. Thus, when a researcher wants to verify the causality effect between two variables, a more appropriate design is highly recommended, such as a study with two or more observations per subject collected over the established research period. The latter design corresponds to repeated measurement data structure, more specifically, to a longitudinal data analysis (a common repeated analysis form in which measurements are recorded on individual subjects over time).

As mentioned in the previous paragraph, the main difference between both research study designs, cross-sectional and longitudinal, is that each experimental unit participating in the first one is observed only once, so for each exposure factor one has only one value per subject. In other words,  each row in the dataset is an observation. However, in longitudinal data each subject is observed  more than once.

It is also worth pointing out an increase in the complexity of the statistical approaches when moving from descriptive analysis to repeated data studies. For instance, in the first setting the statistical methods in use are the simplest ones: mean and percentage comparisons by means of classical tests, regression analysis, etc…However, in repeated measures data sets, and specifically in longitudinal data analysis, is required to  use special statistical techniques for valid analysis and inference. Thus, researchers should be aware of three important points to perform a proper statistical model, in this order:  (1) the trend of the temporal component; (2) the variance-covariance structure; (3) the mean structure. More accurately, the overall trend of the evolutive analysis should be guessed first of all. Temporal trends can follow a linear, quadratic, cubic or even a fourth grade polynomial function. Besides, as observations in the same subject are more likely to be correlated, repeated measures analysis must account for this correlation (the within and between-subject effects must be controlled). Among the possible covariance structures, compound symmetry, unstructured  and  first-order autoregressive  are the most used.  As for the mean structure, the potential exposure factors which could be related with the dependent variable should be included in the model.

Longitudinal studies play an important key role, mostly in epidemiology and clinical research. They are used to determine the change in the outcome of measurement or to evaluate the effectiveness of a new treatment in a clinical trial, among other applicable settings. Under these scenarios, due to the complexity of the statistical analyses,  longitudinal studies involve a great deal of effort, but they offer several benefits. The most importants, from my point of view, are the following: (1) The ability to measure change in outcomes and/or exposure at the individual level, so that the researcher has the opportunity to observe individual patterns of change; (2) the temporal order of the exposure factors and the outcomes is measured. Therefore, the timing of the outcome onset can be correlated with the covariates.

Finally, there is no specific statistical package to perform this kind of analyses. Nowadays, most of them include on their recent releases all the procedures to perform, at least, a basic longitudinal analysis. Now, there is no excuse for identifying a repeated/longitudinal analysis from a descriptive one, and developing them without any doubt…. Do you agree??