Biostatistics is a subdiscipline of Statistics that studies the patterns behind biological processes (e.g., the spread of a disease). Scientists use a range of methods, from standard statistical techniques to complex models, to analyze huge data sets so that researchers can answer these biological enigmas. But… why Biostatistics? This is the question one should address when starting to work in this field. Biostatisticians are often asked to justify why they chose this area to start, or even advance, their professional career.

Data analysis has always been performed. Before the 19th century, most scientists with a basic knowledge of Statistics were able to carry out simple calculations to validate their daily scientific experiments. The starting point of modern applications of Biostatistics was set in the past two centuries, with Charles Darwin and Francis Galton, among others. The latter was also a cofounder of the well-known statistical journal Biometrika. In recent decades, the complexity of scientific research studies (design, studied sample…) and the development of technology have grown enormously. This has led to the development of complicated statistical methods, sometimes ad hoc, and consequently to the need for specific skills to apply them: apart from Statistics, knowledge of medical topics and computer programming is highly recommended.

Several papers have highlighted the importance of the biostatistician in the biomedical sciences (e.g., Bross (1974); Donald W. Marquardt (1987); Greenhouse S. (2003); María Jesús Bayarri et al. (2012)). They make clear that the role of a data analyst (we are often called this, and I have to admit I somewhat dislike the term) is not as simple as that of a shoe store clerk: we cannot sit and wait for requests from clinicians or other researchers who need multiple regression analyses (most of the time) to obtain results. A statistician must be ambitious, be adventurous with data, “play” with them and search for better statistical strategies than the current ones. There is always room for improvement. We are seen as data machines/compilers looking for statistical significance (p<0.05), and we should show other professionals that our daily work (a) is not based on “significance” and (b) may influence policy choices made by governments or other important organizations. In other words, the general public should perceive that our role is much more than pressing a button and getting the result in 5 minutes. Our function is to challenge and influence the community in order to, hopefully, make society better. Fortunately, important biomedical journals such as the Journal of the American Medical Association (JAMA) have begun to give more weight to the complexity of statistical procedures. It is a first step.

Biostatistics in Spain

Compared to other countries, Biostatistics could be considered an emerging discipline in Spain. Although there is still much work to do, there is a noticeably increasing demand for biostatisticians in the scientific community. Driven by newly rising areas such as Genomics, spatial Statistics and Functional Data Analysis, several multidisciplinary research groups have been set up with at least one biostatistician among their members. The National Biostatistics Network BIOSTATNET is proof of this. This network, created in 2010, is composed of 8 nodes from different regions of Spain and aims to coordinate and promote research in Biostatistics.

I think I have given many reasons for choosing Biostatistics as a profession. It is a field where you can connect with people from different areas, which allows you to learn about many more topics than you might expect. In a few words, Biostatistics grips you!

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write”

Herbert George Wells


Two is a crowd

When it comes to networking in Biostatistics, the well-known rule of six degrees of separation seems to shrink.

Intrigued by Michael Salter-Townshend's article in last month's Significance Big Data Special Issue, I tried the InMaps LinkedIn application for both my profile and Biostatnet's (with the permission of its main researchers).

At first glance, there are obvious differences between the two, most probably because mine includes friends and family who are not necessarily linked to the field of Biostatistics. It therefore does not show such a clear conglomerate of mutually linked connections (or small-world network); instead, it is divided into two main clusters (forming a sort of scale-free network): one that could be identified with my social life and previous studies (dark turquoise), and another (the rest of the colours) closely related to my current employment. It is also worth noting that the coloured clusters in Biostatnet's map are not necessarily associated with the nodes that constitute the network, but rather with the different areas of study (clinical, applied,…). This clearly reflects the multidisciplinary nature of an area of study that requires other fields such as Biology, Computing, Mathematics and Medicine for its successful development.

However, the importance of these maps lies not just in the identification of clusters but in the potential for inferring further information from them. In fact, it has been shown that the often-criticized social networks can not only help us when we are bored or looking for a job, but also encourage and ease interdisciplinarity, and provide researchers with essential information for the study of scientific phenomena such as the spread of epidemics, which is very often determined or affected by social interaction (see papers by Liu and Xiao and Corner et al.). This also applies to the study of the distribution of species in ecological niches, whose analysis is certainly similar to that of social networks (see papers by Johnson et al. and Coleing). It has been shown that species involved in a trophic chain with more and better connections are more likely to survive should any changes in their environment occur.
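As a rough illustration of the kind of information such a map encodes, the sketch below computes two simple quantities for each contact: the degree (number of connections) and the local clustering coefficient (how many of a contact's neighbours are also connected to each other). The contact names and edges are hypothetical toy data, not taken from either map, and this is just one simple way of summarising a network, not the method used by InMaps.

```python
from itertools import combinations


def build_graph(edges):
    """Return an undirected graph as a dict mapping node -> set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct connections of a node."""
    return len(graph[node])


def local_clustering(graph, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to compare
    links = sum(
        1 for u, v in combinations(sorted(neighbours), 2) if v in graph[u]
    )
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network (assumption, for illustration only).
edges = [
    ("me", "ana"),
    ("me", "ben"),
    ("ana", "ben"),
    ("me", "carl"),
    ("carl", "dora"),
]
graph = build_graph(edges)

for node in sorted(graph):
    print(node, degree(graph, node), round(local_clustering(graph, node), 2))
```

Contacts with a high clustering coefficient sit inside tightly knit groups, while contacts with a high degree but low clustering tend to bridge otherwise separate clusters.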

In conclusion, it seems that when networking, two highly connected contacts are already a crowd and provide much more information than we could ever imagine, so… let's network!!

Have you tried it with yours? Any surprises there? Have you used network analysis in your research? Tell us about it!!

JEDE II Review

The “II Congreso de Jóvenes Investigadores en Estadística: Diseño de Experimentos y Bioestadística (JEDE II)” was held in Tenerife (Spain) from the 18th to the 20th of July 2012. The following is a review of the event.

Biostatistics as a discipline in its own right is slowly developing in Spain. As a result, students in the area find it difficult to interact with each other, to access training in advanced skills and, ultimately, to work as biostatisticians.

JEDE II's initial aim was to enable young researchers in Biostatistics to share both knowledge and experience (as is very well explained in the conference's program), and we can confirm (this blog is proof of that!) that it definitely succeeded.

From the invited sessions and the other young researchers' presentations and posters, a highly motivated young audience was able to learn about and discuss the following topics:


Vicente Núñez and Jesús Fidalgo, two of the 8 main researchers of the National Biostatistics Network Biostatnet, gave a very interesting and encouraging presentation aimed especially at young biostatisticians, providing information both on the network and on Biostatistics in general.


Vicente Lustres's talk also came as a ray of hope for all the researchers attending the conference. The creation of academic spin-offs like Biostatech, whose scientific director is Carmen Cadarso, also a main researcher of Biostatnet, appears to be one of the alternatives to institutional research that are sure to become an option for many departments seeking to survive in the current context of budget cuts in Spanish institutions.

Bayesian Statistics

Bayesian Statistics, as one of the paradigms of Statistics, was also covered at JEDE II by Vicente Núñez, Silvia Lladosa and Hèctor Perpiñán. They talked, respectively, about a Bayesian model to address overdispersion in Poisson models, Bayesian models to assess the spatial distributions of parasite species, and Bayesian longitudinal models applied to the evolution of pediatric renal transplants.


In the context of clinical Biostatistics, Inmaculada Arostegui's talk focused on the validation of methods for obtaining optimal cut-off points to categorize continuous predictor variables. Another problem to address in this field is how to handle missing data in clinical studies. Urko Aguirre talked about the performance of different statistical approaches to impute missing Health-Related Quality of Life outcomes in longitudinal studies. Also in the broad area of medical Statistics, Isabel Martínez gave an example of the application of smoothed quantile regression in Pediatrics.

Pilar Cacheiro gave us a general overview of the field of statistical Genetics, focusing on statistical methods applied to Next Generation Sequencing data analysis and highlighting the opportunities for biostatisticians in this area of expertise.

In the area of Biology, Anabel Blasco and Altea Lorenzo provided an insight into the application of different statistical techniques in the areas of Botany and marine species reproduction, respectively.

Design of Experiments

The subject of design of experiments was widely discussed during the conference. Some speakers presented applications of optimal designs: Peter Goos in block designs, Juan Rodriguez in spatial designs, Mariano Amo in kinetic processes and Mercedes Fernandez in multifactorial models. Victor Casero talked about experimental design in simultaneous equations and Roberto Dorta presented some optimal factorial designs.

Covering clinical trials, Jesús López-Fidalgo and José Antonio Moler talked about compound designs and optimality, and Arkaitz Galbete explained the use of randomization tests in clinical trials. Finally, Licesio Rodriguez gave us a broad perspective on the use of R in design of experiments.
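For readers unfamiliar with randomization tests, here is a minimal sketch of the general idea, not a reconstruction of the talk. It assumes a two-arm trial with complete randomization and fixed arm sizes, uses the absolute difference in means as the test statistic, and enumerates every possible assignment, so it is only practical for very small samples. The outcome values are made up for illustration.

```python
from itertools import combinations


def randomization_test(treatment, control):
    """Exact two-sided randomization test for a difference in means.

    Assumes complete randomization with fixed arm sizes (an assumption of
    this sketch). Returns the proportion of possible assignments whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(treatment) + list(control)
    n_total = len(pooled)
    n_t = len(treatment)
    total = sum(pooled)

    def mean_diff(indices):
        # Mean of the "treated" units minus mean of the remaining units.
        sum_t = sum(pooled[i] for i in indices)
        return sum_t / n_t - (total - sum_t) / (n_total - n_t)

    observed = abs(mean_diff(range(n_t)))
    extreme = 0
    n_assignments = 0
    for indices in combinations(range(n_total), n_t):
        n_assignments += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean_diff(indices)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_assignments


# Made-up outcomes for a tiny hypothetical trial.
p_value = randomization_test([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
print(p_value)
```

With larger samples, full enumeration becomes infeasible and a random subset of assignments is usually drawn instead.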


Another important event was the roundtable on Advances in Design of Experiments and Biostatistics, which also gave young researchers the chance to discuss and enquire about employment matters in the sector. This could be identified as one of the hot topics of JEDE II given the current economic climate.

In conclusion, the conference was, in our opinion, a fantastic opportunity to get to know other young researchers and their lines of study, and to learn from reputed professionals who were kind enough to show interest in our research and give us an insight into their fields of expertise.

All that is left for us to say is… we cannot wait for the JEDE III Conference!!!


(Unfortunately, we can neither fit all the conference presentations in here nor give more detailed descriptions, so please feel free to comment on your experience!)