María Álvarez Hernández, BSc in Mathematics (University of Salamanca), is a PhD student in Statistics and Operations Research at the University of Granada, where she works with Professor Martín Andrés. Her line of research is framed within the statistical analysis of categorical data from contingency tables.
A common objective in the Health Sciences is to compare the proportions of individuals with a feature of interest in two different populations, for which purpose it is usual to take two independent samples. This is the case when comparing the proportion of cures under two different treatments, or the proportion of patients in the groups with and without a particular risk factor. In such situations the parameter of interest may be the difference between the two proportions, but in the field of Medicine it is usually their ratio R, the relative risk. Examples of this are clinical trials that evaluate the effectiveness of a new vaccine, studies comparing two binary diagnostic methods, studies comparing two different treatments, etc.
From an exact point of view, obtaining a confidence interval for R is computationally very intensive: it requires specific computer programmes and is not feasible for moderately large sample sizes (Reiczigel et al., 2008). Hence researchers have devoted great attention to obtaining approximate confidence intervals and, although many different procedures have been proposed, these have not always been compared with one another. Nowadays, there is a general consensus that the best procedure is the score method proposed by Koopman (1984) and by Miettinen and Nurminen (1985). Alternatively, other simpler methods have been proposed which work more or less well (Farrington and Manning, 1990; Dann and Koch, 2005; Zou and Donner, 2008).
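To make the idea of a score interval concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the exact procedure of any of the cited papers: the function names (`score_z`, `score_ci`, `_bisect`), the plain bisection search and the restriction to tables with no zero or full counts are choices made here, and published versions may differ in details such as a variance correction factor. The statistic compares the observed proportions with the ratio R under test, using the maximum likelihood estimates of the two proportions restricted to p1 = R p2, and the limits are the two values of R at which the statistic reaches the normal critical value.

```python
from math import sqrt
from statistics import NormalDist


def score_z(x1, n1, x2, n2, R):
    """Score-type statistic for H0: p1 / p2 = R (illustrative sketch)."""
    N = n1 + n2
    # The MLE of p2 under the constraint p1 = R * p2 solves A*p^2 + B*p + C = 0.
    A = N * R
    B = -(n1 * R + x1 + n2 + x2 * R)
    C = x1 + x2
    disc = max(B * B - 4 * A * C, 0.0)
    p2 = 2 * C / (-B + sqrt(disc))  # smaller root, written in a stable form
    p1 = R * p2
    var = p1 * (1 - p1) / n1 + R * R * p2 * (1 - p2) / n2
    return (x1 / n1 - R * x2 / n2) / sqrt(var)


def _bisect(f, a, b, iters=200):
    """Bisection on the log scale; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(iters):
        m = sqrt(a * b)  # geometric midpoint, natural for a ratio
        fm = f(m)
        if (fm > 0) == (fa > 0):
            a, fa = m, fm
        else:
            b = m
    return sqrt(a * b)


def score_ci(x1, n1, x2, n2, conf=0.95):
    """Approximate limits for R = p1 / p2 by inverting the score-type statistic."""
    if not (0 < x1 < n1 and 0 < x2 < n2):
        raise ValueError("this sketch only handles 0 < x < n in both samples")
    zc = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    rr = (x1 / n1) / (x2 / n2)  # point estimate of R
    # The statistic decreases in R, so each limit lies on one side of rr.
    lower = _bisect(lambda R: score_z(x1, n1, x2, n2, R) - zc, rr * 1e-8, rr)
    upper = _bisect(lambda R: score_z(x1, n1, x2, n2, R) + zc, rr, rr * 1e8)
    return lower, upper
```

For instance, `score_ci(15, 50, 10, 50)` returns the two limits for a hypothetical study with 15 events out of 50 in the first sample and 10 out of 50 in the second. The search has no closed form here, which gives a first idea of why such methods cost more to compute than the simpler alternatives.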
One piece of research in which I am involved aims to improve these methods and to suggest new ones that bring us closer to the exact result without losing rigor in the process (Martín and Álvarez, 2012). But although the improvement may come at a theoretical level, what happens on the computing side?
From a practical point of view, statistical packages such as SPSS20, Stata12 or StatXact10 also focus on the asymptotic case when computing confidence intervals for the relative risk, although some of them do let the researcher obtain the exact confidence interval (in some situations at the cost of a long computation time). In general, the methods used are based on the ideas of Miettinen & Nurminen (1985), who assume a standard normal distribution; Katz et al. (1978), who applied the logarithmic transformation; and Koopman (1984), with the well-known score method. Sometimes, as in the StatXact software, the Berger & Boos correction can be applied because it reduces conservatism (it results in shorter confidence intervals).
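As an illustration of the simplest of these asymptotic approaches, the logarithmic-transformation interval can be sketched in a few lines of Python. The function name `katz_log_ci` and its arguments are hypothetical, and the sketch assumes both event counts are positive; it uses the usual large-sample standard error of log R and does not reproduce the output of any particular package.

```python
from math import exp, sqrt
from statistics import NormalDist


def katz_log_ci(x1, n1, x2, n2, conf=0.95):
    """Log-transformation interval for R = p1 / p2 (illustrative sketch).

    x1, x2 are the event counts and n1, n2 the sizes of the two
    independent samples; both event counts must be positive.
    """
    if x1 <= 0 or x2 <= 0:
        raise ValueError("this sketch needs at least one event in each sample")
    rr = (x1 / n1) / (x2 / n2)  # point estimate of R
    # Large-sample standard error of log(rr)
    se = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # The interval is symmetric on the log scale, then transformed back
    return rr, rr * exp(-z * se), rr * exp(z * se)
```

A call such as `katz_log_ci(15, 50, 10, 50)` returns the estimate of R together with its lower and upper limits for a hypothetical pair of samples. The calculation is immediate, which is part of the appeal of this kind of method in practice, even if its coverage is not as good as that of the score method.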
The aim must be to obtain not only the methods that are best in theory but also those that are most feasible when the explicit calculation is carried out and that involve shorter computation times.
Therefore, although the theory evolves, the routines programmed into statistical packages for making inferences, for example about a measure of association such as the relative risk, have not kept pace as they have for other techniques, even though this is a priority for the Health sector.
In short, we should not be content with the procedures currently implemented, and we should spare no effort or research resources in finding ways to improve them quickly and easily.