How do you organise yourselves?

In this blog we usually talk about applications of statistics in different fields of knowledge, or about the software we use for them. However, this week, just after the Easter holidays, I would like to address the issue of researchers' dedication to their work.

Just a few weeks ago we read in the international press that France would limit access to work calls and emails outside the workplace for consultants and engineers, cutting down their effective daily working time in an effort to ultimately increase their productivity through proper rest. And I asked myself: do researchers take these breaks?

During my PhD years, I have had many conversations with my colleagues about periods of stress and the tools that could help us. Linking back to the news from our French neighbours, limiting working hours can certainly help those who feel that their workday never ends, whether through lack of organisation or excessive workload. This limit would free up time for other activities that relieve the mind, and would allow us to return to work the following day rested and motivated and, in the long run, to be more productive.

From my point of view, based on my experience as a researcher, the issue is more complicated in our case because we generally have no fixed working schedule. We must be able to juggle our sometimes “necessary” labour flexibility with our other daily tasks. Here is a rare case where flexibility, which a priori is seen as positive, can hinder the organisation of our lives.

My answer for some time now has been to try to implement in my life the GTD methodology, proposed by David Allen in the book “Getting Things Done: The Art of Stress-Free Productivity”. For me, GTD basically means unifying the management of my professional life, personal life, leisure, etc.

The main lesson to take from GTD in research is to be clear about what goals we pursue so we can clearly address them with direct actions, in order to avoid devoting too much time to tasks that do not really help us move forward (professionally or personally).

Are you in this situation too? Share with us your tricks and solutions for avoiding stress through better time management!

  • In France, a Move to Limit Off-the-Clock Work Emails. 12/04/2014. New York Times.
  • Allen, David (2001). Getting Things Done: The Art of Stress-Free Productivity. New York: Penguin Putnam. ISBN 978-0-14-200028-1.

More on handling data frames in R: dplyr package

I am currently taking a course on edX, which is becoming one of my favourite online platforms, along with Coursera. This time it is “Data Analysis for Genomics”, taught by Rafael Irizarry, whom you might know from his indispensable Simply Statistics blog. The course has just started, but so far the feeling is really good. It is through this course that I found out about a new package by the great Hadley Wickham: dplyr. It is meant to work exclusively with data frames and provides some improvements over his previous plyr package.

As I spend quite a bit of time working with this kind of data, and having written a post some time ago about how to handle multiple data frames with the plyr package, I think it is only fair to write an update on this one. Besides, I think it is extremely useful, and I have already incorporated some of its functions into my daily routine.

This package is meant to work with data frames and not with vectors, as the base functions do. Its functionality might replace the ddply function in the plyr package (note that, as far as I know, it is not yet possible to work with lists). Four functions are, in my opinion, reason enough to move to this package: filter( ) (instead of subset), arrange( ) (instead of sorting with order), select( ) (equivalent to using the select argument of the subset function), and mutate( ) (instead of transform). You can check here some examples on using these functions. The syntax is clearly improved and the code gets much neater, no doubt about it.
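As a rough illustration of the four functions above, here is a minimal sketch. It assumes the dplyr package is installed and uses the built-in mtcars data set, which is my own choice of example data; the column hp_per_wt is a made-up name for illustration only.

```r
library(dplyr)

# filter(): keep only the rows that meet a condition
# (base equivalent: subset(mtcars, cyl == 4))
four_cyl <- filter(mtcars, cyl == 4)

# arrange(): sort the rows, here by decreasing miles per gallon
# (base equivalent: four_cyl[order(-four_cyl$mpg), ])
by_mpg <- arrange(four_cyl, desc(mpg))

# select(): keep only some columns
# (base equivalent: subset(by_mpg, select = c(mpg, wt, hp)))
slim <- select(by_mpg, mpg, wt, hp)

# mutate(): add a new column computed from existing ones
# (base equivalent: transform(slim, hp_per_wt = hp / wt))
result <- mutate(slim, hp_per_wt = hp / wt)

head(result)
```

Each function takes a data frame as its first argument and returns a data frame, so the steps can be chained one after another without referring back to the original object with `$`.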

Two other essential functions are group_by( ), which groups the data frame by one or several variables, and summarise( ), which calculates summary statistics on the grouped data.
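A minimal sketch of how these two functions work together, again assuming dplyr is installed and using the built-in mtcars data set as example data (the names n_cars and mean_mpg are my own):

```r
library(dplyr)

# group_by(): mark the data frame as grouped by number of cylinders
by_cyl <- group_by(mtcars, cyl)

# summarise(): collapse each group into a single row of summary statistics
cyl_stats <- summarise(by_cyl,
                       n_cars   = n(),        # number of rows in each group
                       mean_mpg = mean(mpg))  # average miles per gallon

cyl_stats
```

The result has one row per value of cyl, which is roughly what ddply(mtcars, "cyl", summarise, ...) would give in plyr.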

You can find general information about the package at the RStudio blog, or at several other blogs that talk about its merits, here or here.

Not to mention its other great advantage, speed, which is no minor issue for those of us who work with extremely large data frames. Several speed tests have been performed (here and here), and it seems to clearly outperform the plyr and data.table packages***.


I am so glad to have found it… I hope you will be too!


***Additional speed test results will be published soon, since this statement might be wrong.