Guest blogger: Tales of R

Collaboration between statisticians and researchers

Over the last few days I have been thinking about how researchers can collaborate efficiently with their statistical experts. With the increasing complexity of science, exchanging information can be crucial to getting the best results. But what happens when a researcher doesn’t give the statistician all the information they need?

When someone asks for help in this field (as a clinician, I need it often), many biostatisticians make sure that some basic points are met in a research study before the actual analysis, and I think they should ask for them as early as possible:

  • Is the research question sensible, and supported by the literature in any way? A good research idea should be accompanied by a short review with pros, cons and all the important references, which could guide you to the most appropriate or most used statistical methods in the field. Ask for it, read it. If the study has a flaw at this point, it’s easier to correct it now than later.
  • Is the research question defined and detailed at the end of the review? Does it have a main objective? Do they plan further analyses depending on the results? Do researchers give enough information for sample size calculation? With these points you can avoid getting stuck in the middle of a study for a long time. The scientific method is always a good guide if things go wrong.
  • Data? OK, they have planned something taking into account their available knowledge. But what about the resources? Can they collect all the data they need? If not, how can the design be adapted? Often, they have to change something in the research question and start again. But even in the worst case this takes much less time than not doing things right from the beginning.
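As a rough illustration of the sample size question in the second point, here is a minimal R sketch using power.t.test from the base stats package. It assumes a simple two-group comparison of means; the expected difference, standard deviation, power and significance level are made-up planning values, not figures from any real study.

```r
# Hypothetical planning values, assumed purely for illustration.
expected_difference <- 5    # smallest difference between group means worth detecting
assumed_sd          <- 10   # standard deviation expected in each group

# Solve for n (left unspecified) given the other design parameters.
plan <- power.t.test(
  delta       = expected_difference,
  sd          = assumed_sd,
  sig.level   = 0.05,
  power       = 0.80,
  type        = "two.sample",
  alternative = "two.sided"
)

# power.t.test returns a fractional n; round up to whole participants.
n_per_group <- ceiling(plan$n)

print(plan)
cat("Participants needed per group:", n_per_group, "\n")
```

If the researchers cannot state a plausible difference and variability, that is usually a sign the review in the first point needs more work before any calculation is meaningful.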

Have they (researchers, clinicians, whoever) met all three points above? Then your chances of helping them answer the question will increase, and the time needed to reach the answer can even decrease substantially.

I hope this has been useful!

You can find my blog “Tales of R…messing around with free code” here.


Sharing statistical analysis and results through web applications

I have to admit I am not completely comfortable with the RStudio IDE yet. However, RStudio projects are of great interest, and the new package Shiny, released last November, is no exception.

Shiny allows you to build web applications in R, so anyone can access your analysis results through an interactive web browser interface. As they state on their website, “…your users choose input parameters using friendly controls like sliders, drop-downs, and text fields. Easily incorporate any number of outputs like plots, tables, and summaries..”

The application is structured in two components: a user interface script named ui.R and a server script named server.R. The first deals with the input and output format; the second contains the code to run the analysis.
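To make the two components concrete, here is a minimal sketch of what they might contain. It is not the SNPassoc app described in this post: it uses the built-in faithful dataset, the input and output names (bins, histogram) are hypothetical, and both parts are shown in one script for brevity, with comments marking what would go into ui.R and what into server.R. It assumes the shiny package is installed.

```r
library(shiny)

# --- Would live in ui.R: inputs the user can change and where outputs appear ---
ui <- fluidPage(
  titlePanel("Old Faithful waiting times"),
  sidebarLayout(
    sidebarPanel(
      # "bins" is a hypothetical input id chosen for this sketch
      sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20)
    ),
    mainPanel(
      # "histogram" is a hypothetical output id chosen for this sketch
      plotOutput("histogram")
    )
  )
)

# --- Would live in server.R: the code that runs the analysis ---
server <- function(input, output) {
  output$histogram <- renderPlot({
    waiting <- faithful$waiting
    # Recomputed automatically whenever the slider value changes
    breaks <- seq(min(waiting), max(waiting), length.out = input$bins + 1)
    hist(waiting, breaks = breaks,
         main = "Waiting time between eruptions", xlab = "Minutes")
  })
}

# Bundle both parts into an app object; runApp(app) would launch it in a browser.
app <- shinyApp(ui = ui, server = server)
```

Keeping the analysis code in the server part means the interface can be rearranged without touching the statistics, and vice versa.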

I made a quick trial following the tutorial and my (very simple) web app was ready in no time. It is based on some of the examples of the SNPassoc vignette, an R package designed to perform genetic association studies. In this application, you can check the summary for a small set of SNPs and get both a plot showing the frequencies and the results of the association study for a given SNP.

With Shiny, you can run your applications locally or share them with other users so they can run them on their own computers. There are several options for distributing your apps; you can check all of them here. In this case, the app can be accessed as a GitHub Gist. Once the Shiny package is installed (along with the SNPassoc package used in this example), you just have to run the following code:

shiny::runGist('4e5e618431a59abe692b')

In my opinion, this tool has great potential for sharing analysis results through an interactive and friendly interface. It might replace files containing a lot of graphics and tables while saving time for both the data analyst and the end user. Have you tried it? Do you want to share your apps with us?


Interview with… Guillermo Vinué Visús

Guillermo Vinué Visús completed his degree in Mathematics in 2008, granted by the Universitat de València (Spain). He also holds a Master’s degree in Biostatistics from the same university. After working at the Drug Research Center of the Santa Creu i Sant Pau Hospital in Barcelona (Spain) for a year, he is currently a PhD student in the Department of Statistics and Operations Research at the Universitat de València. His doctoral thesis focuses on Statistics applied to Anthropometry.

1. Why do you like Biostatistics?

I like Biostatistics because it allows me to apply Maths to different real life problems.

2. Could you give us some insight into your current field of research?

I am working on a research project aiming to develop statistical methodologies to deal with anthropometric data, in order to tackle some statistical problems related to Anthropometry and Ergonomics.

3. Which are, in your opinion, the main advantages of being a researcher?

The main advantage is the possibility of learning a bit more every day. Research is a continuous learning process.

4. Your whole professional experience has been within the public sector and the University. How do you see the present and future of research in the Spanish public sector?

Unfortunately, the current economic difficulties mean that the government budget for scientific research is more and more limited, so I am concerned about both the present and the future of public research in Spain.

5. What do you think of the situation of young biostatisticians in Spain?

Neither better nor worse than other young people in the country. Nowadays, I guess the best way to make progress is to move abroad.

6. What would be the 3 main characteristics or skills you would use to describe a good biostatistician?

Enthusiasm, effort and a little bit of R knowledge.

7. Which do you think are the main qualities of a good mentor?

To be attentive and available when needed.

Selected publications:

  • Ibañez M. V., Vinué G., Alemany S., Simó A., Epifanio I., Domingo J., Ayala G., “Apparel sizing using trimmed PAM and OWA operators”, Expert Systems with Applications 39, 10512-10520, 2012, http://dx.doi.org/10.1016/j.eswa.2012.02.127
  • Epifanio I., Vinué G., Alemany S., “Archetypal analysis: Contributions for estimating boundary cases in multivariate accommodation problem”, Computers & Industrial Engineering 64, 757-765, 2013, http://dx.doi.org/10.1016/j.cie.2012.12.011