Setting up your (Linux biostatistical) workstation from scratch


Facundo Muñoz holds an MSc in Mathematics and a PhD in Statistics from the University of Valencia. He is currently a postdoctoral researcher at the French Institut National de la Recherche Agronomique (INRA). His main research field is spatial (Bayesian) statistics applied to environmental and biostatistical problems. He is currently working on statistical methodologies for the analysis of forest genetic resources.

Being a busy biostatistician, I spend plenty of time glued to my computer. As an immediate consequence, every once in a while I need to set up my workstation from scratch: when I change jobs (I have done so twice this year!), when I want to upgrade my OS version (upgrades rarely go perfectly), or when I get a new laptop.

This involves, you know, installing the OS, the main programs I need for work like R and LaTeX, some related software like a good version control system, a couple of Integrated Development Environments for coding and writing, and a dozen other ancillary tools that I use every now and then.

Furthermore, I need to configure everything to run smoothly, set up my preferences, install plugins, and so on.

The last time I did this manually, I spent a week setting everything up, and in the following days I kept finding something missing. That is when I decided I should have this process scripted.

Last week I set up my working environment at my new job. In a few hours I had everything up and running exactly the way I like. I spent an additional day updating the script with new software and updated versions, and solving some pending issues.

I thought this script might be useful for others as well, hence this post. It is version-controlled in a Google Code repository, where you can download the main script.

It is not very general, as installation details change a lot from system to system. I use Linux Mint, but I believe it should work with little effort on any Ubuntu derivative, or on Ubuntu itself (that is, distros using the APT package management system). Users of other systems (Arch, Red Hat, SUSE, Mac’s Darwin) would need to make significant changes to the script, but the outline might still help. If you use Windows, well… don’t.
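
To make the APT dependency concrete, here is a minimal sketch of the kind of check one could run before trying a script like this on a given machine. It is written in Python purely for illustration and is not taken from the script itself; the function name `detect_package_manager` and the list of candidate tools are assumptions.

```python
import shutil


def detect_package_manager(which=shutil.which):
    """Return the first known package manager found on PATH, or None.

    `which` is injectable so the lookup can be exercised without
    depending on the machine the code runs on.
    """
    # Hypothetical candidate list: APT first, then a few other families.
    candidates = ["apt-get", "pacman", "dnf", "zypper"]
    for tool in candidates:
        if which(tool):
            return tool
    return None


if __name__ == "__main__":
    manager = detect_package_manager()
    if manager == "apt-get":
        print("APT found: an APT-based script should mostly apply.")
    elif manager is None:
        print("No known package manager found.")
    else:
        print(f"Found {manager}: expect to adapt the installation commands.")
```

A check like this only tells you which installation commands are available; package names can still differ between distributions.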

Of course, you will not be using the same software as I do, nor the same preferences or configurations. But it might serve as a guide to follow line by line, changing things to suit your needs.

In particular, it provides an almost-full installation (without unnecessary language packages) of the very latest LaTeX version (unlike the one in the repos), and takes care of installing it correctly. It also sets up the CRAN repository and installs the latest version of R.
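
Setting up the CRAN repository on an APT system essentially means adding one `deb` line for the right release. The Python sketch below shows how such a line could be derived from the contents of `/etc/os-release`. It is an illustration, not the code from the script: the mirror URL, the `-cran40` suffix and the function names are assumptions based on CRAN's present-day Ubuntu instructions, and the layout may have been different when the script was written. On Linux Mint the Ubuntu base release is exposed as `UBUNTU_CODENAME`, so that key is preferred.

```python
def parse_os_release(text):
    """Parse KEY=value lines (os-release format) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def cran_apt_line(os_release_text,
                  mirror="https://cloud.r-project.org",  # assumed mirror
                  suffix="-cran40"):                     # assumed suffix
    """Build the APT source line for CRAN's Ubuntu packages."""
    info = parse_os_release(os_release_text)
    # Derivatives such as Mint report the underlying Ubuntu release here.
    codename = info.get("UBUNTU_CODENAME") or info.get("VERSION_CODENAME")
    if not codename:
        raise ValueError("could not determine the release codename")
    return f"deb {mirror}/bin/linux/ubuntu {codename}{suffix}/"


if __name__ == "__main__":
    sample = 'NAME="Linux Mint"\nVERSION_CODENAME=vera\nUBUNTU_CODENAME=jammy\n'
    print(cran_apt_line(sample))
    # prints: deb https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/
```

The resulting line would typically go into a file under `/etc/apt/sources.list.d/`. Importing the repository signing key is a separate step that this sketch does not cover.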

The script also installs the right GDAL and Proj4 libraries, which are important in case you work with maps in R or a GIS.

Finally, it installs some downloadable software like Kompozer (for web authoring), the Dropbox application, and more. It scrapes the web in order to fetch the latest correct version of each program.
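
The scraping step boils down to reading a download page, collecting the links that look like release files, and keeping the one with the highest version number. The following Python sketch illustrates that idea under simple assumptions: the page links to files named `<name>-<version><extension>`, and versions are dot-separated integers. The function name `latest_download` and the sample page are hypothetical, and the real script may do this quite differently.

```python
import re


def latest_download(html, name, extension=".tar.gz"):
    """Return (url, version) of the highest-versioned matching link, or None."""
    pattern = re.compile(
        r'href="([^"]*' + re.escape(name) + r"-(\d+(?:\.\d+)*)"
        + re.escape(extension) + r')"'
    )
    best = None
    for match in pattern.finditer(html):
        url, version = match.group(1), match.group(2)
        # Compare versions numerically, so that 0.10 ranks above 0.9.
        key = tuple(int(part) for part in version.split("."))
        if best is None or key > best[0]:
            best = (key, url, version)
    if best is None:
        return None
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical download page, for illustration only.
    page = """
    <a href="/files/sometool-0.9.2.tar.gz">0.9.2</a>
    <a href="/files/sometool-0.10.1.tar.gz">0.10.1</a>
    <a href="/files/sometool-0.8.tar.gz">0.8</a>
    """
    print(latest_download(page, "sometool"))
    # prints: ('/files/sometool-0.10.1.tar.gz', '0.10.1')
```

Comparing versions as tuples of integers rather than as strings is the important detail: a plain string comparison would rank 0.9.2 above 0.10.1.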

I hope it helps someone. And if you have alternative or complementary strategies, please share!


FreshBiostats’ First Anniversary

So here we are. It has been a year since we started this venture. The idea of a blog came from one of our co-bloggers at the Jede II Conference in the summer of 2012. At first it sounded like a bit of a challenge, but who is afraid of a challenge?

No doubt about it, the balance has been highly positive. We are all for sharing knowledge and resources that might be valuable for others, and from our humble perspective we sincerely hope it has been of some use. It has certainly been so for us, both by getting insight into particular subjects when writing the different posts and by diving into new topics covered by our co-bloggers and invited posts. Twitter and Facebook have also allowed us to encourage interaction with colleagues and other bloggers, and we can now say our social and professional networks have certainly become bigger and stronger!

We have found it difficult at times to juggle our jobs and PhDs with writing our weekly posts, but as we have said on several occasions, we are passionate about our work and firmly believe that, most of the time, the line between work and fun gets blurry.

As we promised in a previous post, here is an infographic summarising this year of Fun & Biostatistics, enjoy!


We have a very international audience, with visits coming from 117 countries, and we are delighted to see that not only colleagues from our closest network are reading our entries. Since our participation in the International Year of Statistics blog (Statistics2013), and after being mentioned on other blogs such as R-bloggers, we have gained more visibility and some posts have become very popular (more than 1000 views for some!).

Posts focusing on R tips clearly take the cake, being the most visited. We guess they might be the most useful ones, as we are also big fans of other very practical blogs. However, we like to cover all aspects of our profession and sometimes even deal with more controversial or philosophical subjects…

We will keep inviting people to share their knowledge and will encourage colleagues to get involved in the blog. Our second-year resolution is to make this blog a more interactive tool. We count on you for that!

Remember you can contact us with your comments, suggestions, and enquiries at freshbiostats@gmail.com

Thank you so much for being there!


Interview with… Jorge Arribas


Jorge holds a BSc in Pharmacy from the UPV-EHU and is currently a resident trainee in Microbiology and Parasitology at the H.C.U. Lozano Blesa in Zaragoza, where he is also working on his PhD thesis.

Email: Jarribasg(at)salud(dot)aragon(dot)es

1.     Could you give us some insight into your current field of research?

My PhD thesis focuses on the diagnosis of Hepatitis C Virus (HCV) infection by means of core antigen determination, and I am also collaborating in a line of research on new treatments for H. pylori infection, funded by a FIS grant. The former aims to compare core antigen determination with the technique currently used for diagnosis. The latter analyses resistance to different flavodoxin inhibitors.

2.     Where does Biostatistics fit in your daily work?

In both areas of research I am working on. They are essentially comparative studies against established techniques, and therefore require methods to prove the significance, or lack of significance, of the improvements.

3.     Which techniques do you or researchers in your area use most often?

Statistical techniques such as sensitivity and specificity analysis, hypothesis testing (ANOVA, t-test). There is also a particular need in the area for techniques dealing with ordinal data.

4.     As a whole, do you find Biostatistics relevant for your profession?

A very important part of the speciality of Microbiology and Parasitology focuses on the research of new diagnostic methods, treatments, prevalence of antibiotic resistance, etc. Therefore, Biostatistics becomes extremely useful when comparing these novel approaches to previous ones.

5.     Finally, is there any topic you would like to see covered in the blog?

It would be great to see some published examples of statistical applications in my area of study.

Selected publications:

  • J. Arribas, R. Benito, J. Gil, M.J. Gude, S. Algarate, R. Cebollada, M. González-Domínguez, A. Garrido, F. Peiró, A. Belles, M.C. Rubio (2013). Detección del antígeno del core del VHC en el cribado de pacientes en el programa de hemodiálisis. Proceedings of the XVII Congreso de la Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica (SEIMC).
  • M. González-Domínguez, C. Seral, C. Potel, Y. Sáenz, J. Arribas, L. Constenla, M. Álvarez, C. Torres, F.J. Castillo (2013). Genotypic and phenotypic characterisation of methicillin-resistant Staphylococcus aureus (MRSA) clones with high-level mupirocin resistance in a university hospital. Proceedings of the 23rd European Congress of Clinical Microbiology and Infectious Diseases and 28th International Congress of Chemotherapy.
  • M. González-Domínguez, R. Benito, J. Gil, M.J. Gude, J. Arribas, R. Cebollada, A. Garrido, M.C. Rubio (2012). Screening of Trypanosoma cruzi infection with a chemiluminescent microparticle immunoassay in a Spanish University Hospital. Proceedings of the 22nd European Congress of Clinical Microbiology and Infectious Diseases and 27th International Congress of Chemotherapy.