Review of the 3rd Biostatnet General Meeting “Facing challenges in biostatistical research with international projection”


(Photo from Biostatech website)

Following on from successful meetings back in January 2011 and 2013,  on the 20-21 January 2017 Biostatnet members gathered again in Santiago to celebrate the network’s successes and discuss future challenges. These are our highlights:

Invited speakers

The plenary talk, “Why I Do Clinical Trials: Design and Statistical Analysis Plans for the INVESTED Trial”, introduced by Guadalupe Gómez Melis, was given by Prof. KyungMann Kim, from the Biostatistics and Medical Informatics department at the University of Wisconsin. Prof. Kim discussed challenges faced when conducting clinical trials to ensure follow-up of patients, and the technological and statistical conflicts in the endeavour to make large-scale clinical trials cost-efficient.


Prof. KyungMann Kim’s presenting his work

The invited talk by Miguel Ángel Martínez Beneito from FISABIO showcased solutions to issues with excess zeros modelling in disease mapping in a very enjoyable talk that generated a fascinating discussion.


Inmaculada Arostegui introducing Miguel Ángel Beneito’s talk


We really enjoyed the two great roundtables that were held at the meeting.

Young Researchers Roundtable

Firstly, we were delighted to be given the opportunity to organise a roundtable of young researchers at the event, and although we did not have much time, we managed to squeeze in four main topics of discussion including; Biostatnet research visits, reproducibility in research, professional visibility and diversity with a focus on women in Biostatistics (very fittingly just before the celebrations of the 1st International Day of Women and Girls in Science!). The topics proved to be of great interest and raised a lively discussion. Regarding visibility, issues such as how to properly manage a professional online profile and what the potential risks of too much exposure are, were raised in the discussion. Also mentioned was the fact that, while there is no doubt that accessing data and code for replicability and reproducibility purposes is highly important, researchers might lose sight of the conclusions due to a major focus on having full access to these resources.

Some other interesting issues were prompted that we could not expand on (we would have needed more time…!) so we would like to continue here…Feel free to send us your comments either here or via social media! or to answer this brief survey (post here). We are currently preparing a paper summarising the topics covered in the roundtable and we will let you know when it is ready!


Participants in the roundtable from right to left: Danilo Alvares (with contributions from Elena Lázaro), Miguel Branco, Marta Bofill, Irantzu Barrio and María José López.

BIO Scientific Associations Roundtable

Secondly, a very exciting and lively session gathering researchers with different backgrounds, all members of a variety of BIO associations and networks, gave us their impressions about what it means to work within a multidisciplinary team. In a very constructive and lively atmosphere, promoted by the moderator Erik Cobo (UPC), Juan de Dios Luna (Biostatnet), Vitor Rodrigues (CIMAGO), Mabel Loza (REDEFAR) and Marta Pavía (SCReN) discussed the pitfalls in communication between statisticians and researchers. We all really enjoyed this session, and took home some valuable messages to improve the interactions between (bio)statisticians, clinicians and applied researchers.


In addition to a satellite course on “CGAMLSS using R” (post to come on this topic!), we had the opportunity to attend two workshops on the last morning of the meeting.

Juan Manuel Rodriguez and Licesio Rodríguez delivered the Experimental Designs workshop. In a very funny and lively way (and using paper helicopters!!!!), they reviewed key concepts of Experimental Designs. This helpful workshop gave us a great opportunity to dust off our design toolkit.

In the software workshop moderated by Klaus Langohr, Inmaculada Arostegui and Guillermo Sánchez introduced two interactive web tools for predicting adverse events in bronchitis patients (PrEveCOPD) and biokinetic modelling (BiokmodWeb). Esteban Vegas showed a Shiny implementation of kernel PCA, and David Morina and Monica Lopez presented their packages radir and GsymPoint.

Oral communications

Although there were also presentations from less young researchers, there were plenty of sessions for younger members who have been getting a great support from the network (thank you, Biostatnet!). The jury decided that the best talks were “Beta-binomial mixed model for analyzing HRQoL data over time” by Josu Najera-Zuloaga. In his talk, he introduced an R-package, HRQoL, including the methodology to perform health-related quality of life scores in a longitudinal framework based on beta-binomial mixed models, as well as an application in patients with chronic obstructive pulmonary disease. The second awarded talk was “Modelling latent trends from spatio-temporally aggregated data using smooth composite link models. ” by Diego Ayma on the use of penalised composite link models with mixed model representation to estimate trends behind aggregations of data in order to improve spatio-temporal resolutions. He illustrated his work with the analysis of spatio-temporal data obtained in the Netherlands. Our own Natalia presented joint work with researchers from the Basque country node  on the application of Multiple Correspondence Analysis for the development of a new Attention-Deficit/Hyperactivity Disorder (ADHD) score and its application in the Imaging Genetics field. This work was possible thanks to one of Biostatnet grants for research visits.


Natalia’s presentation

Posters sessions

Three poster sessions were included in the meeting. As a novelty, these sessions were preceded by a brief presentation in which all participants introduced their posters in 1-2 minutes. Although imposing at first, we thought this was a good opportunity for everyone to at least hear about the work displayed, in case they missed the chance to either see the poster or talk to the author later on. The poster “An application of Bayesian Cormack-Jolly-Seber models for survival analysis in seabirds” by Blanca Sarzo showed her exciting work in a very instructive and visually effective way and won her a well-deserved award too.

Biostatnet sessions

Particularly relevant to Biostatnet members were also the talks by by three of the IPs, Carmen Cadarso, and Guadalupe Gómez and Maria Durbán (pictured below) highlighting achievements and future plans for the network. Exciting times ahead!


Last but not least, the meeting was a great opportunity for networking while surrounded by lovely Galician food and licor cafe ;p

We look forward to the 4th meeting!


Galician delicacies!

You can check the hashtag #biostatnet2017, tell us your highlights of the meeting here or send us your questions if you missed it…

(Acknowledgements: sessions pics by Moisés Gómez Mateu and Marta Bofill Roig)


Videos: “Bioestadística para Vivir”

A set of online video tutorials recorded during the Cycle of Conferences “Bioestadística para Vivir” can be found here. These talks are aimed to the general public and try to show the impact of Biostatistics on everyday life, especially in the fields of health and the environment.

Biostatnet is one of the collaborating members of this project,“ Bioestadística para Vivir y Cómo Vivir de la Estadística”, which is promoted by the University of Santiago de Compostela, through its Unit of Biostatistics (GRIDECMB), and the Galician Institute of Statistics. This knowledge exchange initiative is funded by the Spanish Science & Technology Foundation (FECYT).

Other activities are being developed under this project; the exhibition Exploristica – Adventures in Statistics: an itinerant exhibition  teaching Statistics to secondary school students, Cycles of Conferences, Problem solving workshops, etc. You can find out more here .



‘Diss’ and tell!

Prior to our presentation at the JEDE III conference on “assessment of impact metrics and social network analysis”, we’d like to get a flavour of the research dissemination that’s going on out there so we can catalogue some of the resources used by researchers in Biostatistics. As Pilar mentioned in previous posts (read here and here), we’re active users of online tools such as PubMed and Stackoverflow, and although we’re relatively new to aggregators such as ResearchBlogging, we really believe in their value as disseminators in the current Biostatistics arena.

Still, we feel we must be missing many others…So we’re interested both in what you use to advance your research but also how you promote it.

We’d be very grateful for your answers to the following questions (we’ll show a summary of the responses at the end of the month):

We welcome your further comments on this topic below and look forward to reading your answers.  

Hopefully see you in Pamplona!


Interview with…..

  Arantzazu Arrospide


 Arantzazu Arrospide Elgarresta studied mathematics in the University of the Basque Country (UPV/EHU) and works as a biostatistician in the Research Unit of Integrated Health Organisations in Gipuzkoa. This research unit gives support to four regional hospitals (about 100 beds each one) and all the public Primary Care Health Services in Gipuzkoa.

Email: arantzazu.arrospideelgarresta@osakidetza.net


Irantzu Barrio

fotoIrantzu  Acting teacher at the Department of Applied  Mathematics, Statistics and Operational Research of the    University of the Basque Country (UPV/EHU)

 Email: irantzu.barrio@ehu.es


Both young biostatisticians are currently working on several ongoing research projects. They belong to the Health Services Research on Chronic Patients Network (REDISSEC) – among others  biostatisticians – and tell us what they think about Biostatistics.

1.    Why do you like Biostatistics?

Irantzu Barrio: On one hand I like applying statistics to real problems, data sets and experiments. On the other hand, I like developing methodology which can contribute to get better results and conclusions in each research project. In addition, I feel lucky  to work in multidisciplinary teams. This allows me to learn a lot from other areas and constantly improve on mine own, always looking for ways to provide solutions to other researchers needs.

Arantzazu Arrospide: I think Biostatistics is the link between mathematics and the real world, giving us the opportunity to feel part of advances in scientific research.

2.    Could you give us some insight in your current field of research?

AA: Our main research line is the application of mathematical modeling the evaluation of public health interventions, especially economic evaluations. Although Markov Chain models are the most common methods for this kind of evaluations we work with discrete event simulation models which permit more flexible and complex modeling.

IB: I’m currently working on my PhD thesis. One of the main objectives of this work is to propose and validate a methodology to categorize continuous predictor variables in clinical prediction model framework. Specifically we have worked on logistic regression models and Survival Models.

3.    You have been doing an internship abroad. What was the aim of your stay?

IB: I did an internship in Guimaraes at the University of Minho, Portugal. During my stay, I worked together with Luis Filipe Meira Machado and María Xosé Rodriguez-Alvarez. The aim was to learn more about survival models and extend the methodology developed so far, considering different prediction models.

AA: I did a short stay in the Public Health department of the Erasmus Medical Centre in Rotterdam (Netherlands) last November. The aim of the visit was to discuss the validation of a discrete event simulation model developed to estimate the health effects and costs of the breast cancer screening program in the Basque Country.

4.    What did allow you to do that was has not been possible in Spain?

IB: Oh! It’s amazing when you realize you have all your time to work on your research project, one and a unique priority for more than two months. Of course, all the other to do’s did not disappeared from my calendar, only were postponed until my return to Bilbao. And, in addition to that, it was also a privilege to work together with high experienced biostatisticians and to have the opportunity to learn a lot from them.

AA: The research group I visited, internationally known as the MISCAN group, is the only European member of the Cancer Intervention and Surveillance Modeling Network (CISNET) created by the National Cancer Institute in the United States. Their main objective is to include modeling to improve the understanding of the impact of cancer control interventions on population trends in incidence and mortality. These models then can project future trends and help determine optimal control strategies. Currently, Spanish screening programs evaluation is mainly based on the quality indicators recommended by the European Screening Guidelines which do not include a comparison with an hypothetical or estimated control group.

5.    Which are the most valuable aspects to highlight during your internship? What aspects do you believe that might be improved?

IB: I would say that my internship was simply perfect. When I came back to Bilbao I just thought time had gone really really fast. I’m just looking forward to go back again.

AA: This group works for and in collaboration with their institutions. They are the main responsible of evaluation of ongoing screening programs, prospective evaluation of screening strategies and leaders for new randomized trials in this topic. This is the reference group in the Netherlands for cancer screening interventions and their institutions consider their conclusions when making important decisions.

6.    What do you think of the situation of young biostatisticians in Spain?

AA: When you work in a multidisciplinary research group both methodological and disease specific knowledge are essential and it takes a long time to achieve it. Institutional support is necessary to obtain long term funds that would ensure future benefits in healthcare research based on rigorous and innovative methods.

IB: I think the situations for young biostatisticians and for young people in general is not easy right now. And at least for what I see around me, there is lot of work to do for.

7.    What would be the 3 main characteristics or skills you would use to describe a good biostatistician? And the main qualities for a good mentor?

AA: Open minded, perfectionist and enthusiastic. As for the mentor, he/she  should be strict, committed and patient.

IB: In my opinion good skills on statistics, probability and mathematics are needed. But at the same time I think it is important to be able to communicate with other researchers such as clinicians, biologists, etc, specially to understand which are their research objectives and be able to translate bio-problems to stat-problems.

For me it is very important to have good feeling and confidence with your mentor. I think that having that, everything else is much easier. On the other hand, if I had to highlight some qualities, I would say that a good mentor would: 1) Contribute with suggestions and ideas 2) Supervise the work done and 3) be a good motivator.

8.    Finally, is there any topic you would like to see covered in the blog?

IB: I think the blog is fantastic, there is nothing I missed in it. I would like to congratulate all the organizing team, you are doing such a good job!!! Congratulations!!!

AA: Although it is not considered part of statistical science operational research methods also can be of interest in our researches.

Selected publications (6):

Arrospide, A., C. Forne, M. Rue, N. Tora, J. Mar, and M. Bare. “An Assessment of Existing Models for Individualized Breast Cancer Risk Estimation in a Screening Program in Spain.”. BMC Cancer 13 (2013).

Barrio, I., Arostegui, I., & Quintana, J. M. (2013). Use of generalised additive models to categorise continuous variables in clinical prediction. BMC medical research methodology13(1), 83.

Vidal, S., González, N., Barrio, I., Rivas-Ruiz, F., Baré, M., Blasco, J. A., … & Investigación en Resultados y Servicios Sanitarios (IRYSS) COPD Group. (2013). Predictors of hospital admission in exacerbations of chronic obstructive pulmonary disease. The International Journal of Tuberculosis and Lung Disease17(12), 1632-1637.

Quintana, J. M., Esteban, C., Barrio, I., Garcia-Gutierrez, S., Gonzalez, N., Arostegui, I., Vidal, S. (2011). The IRYSS-COPD appropriateness study: objectives, methodology, and description of the prospective cohort. BMC health services research11(1), 322.

Mar, J., A. Arrospide, and M. Comas. “Budget Impact Analysis of Thrombolysis for Stroke in Spain: A Discrete Event Simulation Model.”. Value Health 13, no. 1 (2010): 69-76.

Rue, M., M. Carles, E. Vilaprinyo, R. Pla, M. Martinez-Alonso, C. Forne, A. Roso, and A. Arrospide. “How to Optimize Population Screening Programs for Breast Cancer Using Mathematical Models.”.


Have an animated 2014!

Animations can be a refreshing way to make a website more attractive, as we did exactly a year ago here.

Based on the Brownian motion simulated here, and using animation and ggplot2 R packages, we produced a fun welcome to the International year of Statistics, Statistics2013. The function saveMovie (please notice that there exists another alternative, saveGIF) allowed us to save it and finally publish it, as easy as that!

For this new year we have based our work on the demo(‘fireworks’) from the 2.0-0 version of the package animation (find  here a list of the changes in each version) and world from the package maps (speaking of maps, have a look at this great dynamic map by Bob Rudis!).

They also come very handy when trying to represent visually a dynamic process, the evolution of a time-series,etc.

As an example, in a previous post we  portrayed an optimisation model in which the values for the mean were asymptotically approaching the optimal solution, which was achieved after a few iterations. This was done with a for loop and again packages ggplot2 and animation.

There are many other examples of potential applications both in general Statistics – see this animated representation of the t distribution and these synchronised Markov chains and posterior distributions plots– and  in Biostatistics. Some examples of the latter are this genetic drift simulation by Bogumił Kamiński and this animated plots facility in Bio7, the integrated  environment for ecological modelling.

R package caTools also allows you to read and write images in gif format.

Finally, in LaTeX, \animategraphics from the animate package will do the trick; check this post by Rob J. Hyndman for further details.

It is certainly one of our new year’s resolutions to incorporate more animations in our posts, what are yours?


My impressions from the Workshop on the Future of the Statistical Sciences

An exciting workshop, organised as the capstone of the International Year of Statistics Statistics2013, was celebrated on the 11th and 12th of November at the Royal Statistical Society Headquarters in London, under the stimulating title of  “The Future of the Statistical Science”, science to which John Pullinger, president of the Royal Statistical Society, refers to as a quintessentially, multidisciplinary discipline. The workshop was indeed a good example of that.

Online attendance was welcomed to the event giving an additional opportunity to listen to highly reputed professionals in main areas of Statistics. Virtual participants were also allowed to pose their questions to the speakers, an innovation that worked very well as an excellent way to make these sorts of events available to a wider audience that would otherwise be excluded – I am already looking forward to more! -.

In case you missed it at the time, luckily the podcast is still available here.

True to its description, the event covered different areas of application, it showed the tools of the field for tackling a wide range of challenges, it portrayed potentially high impact examples of research, and in summary, was a great attention-grabbing exercise that will hopefully encourage other professionals to the area. A common vision of ubiquity and great relevance came across from the speeches and showed the field as a very attractive world to join. See my simplified summary below in form of a diagram certainly reflective of the great Statistics Views motto “Bringing Statistics Together”.


Fig 1. Some of the ideas discussed in the workshop

Particularly relevant to this blog and my particular interests were presentations on environmental Statistics and statistical Bioinformatics, as well as other health-related talks. Important lessons were taken on board from the rest of the speakers too.

The workshop started with a shared presentation on the hot topic “Statistical Bioinformatics” by Peter Bühlmann (ETH Zürich) and Martin Vingron (Free University of Berlin). In a fascinating talk, Bühlmann argued for the need of uncertainty quantification in an increasingly heterogeneous data world –examples of this are also appearing in other fields, e.g. in the study of autism disorders as Connie Kasari and Susan Murphy mentioned in “SMART Approaches to Combating Autism”- and for models to be the basis of the assignation of uncertainties – topic greatly covered by Andrew Gelman in “Living with uncertainty, yet still learning”-, while acknowledging that in the big data context, “confirmatory statistical inference is (even more) challenging”. Vingron followed by focusing on “Epigenomics as an example”, raising open questions to the audience on how to define causality in Biology where “most processes […] are feedback circular processes”, calling for models that are just complex enough so as to allow for mechanistic explanations, and for good definitions of null hypotheses.

In addition to the interesting points of the talk, I found its title particularly attractive in what it could be directly linked to a vibrant roundtable on “Genomics, Biostatistics and Bioinformatics” in the framework of the 2nd Biostatnet General Meeting, in which, as some of you might remember, the definitions of the terms Biostatistics and Bioinformatics were discussed. I wonder if the term “statistical Bioinformatics” would be indeed the solution to that dilemma? As a matter of fact, Bühlmann himself mentions at the start of his talk other options like Statistical Information Sciences, etc-

Michael Newton and David Schwartz from the University of Wisconsin-Madison also focused on the triumph of sequencing and “Millimeter-long DNA Molecules: Genomic Applications and Statistical Issues”. Breast cancer being one of the mentioned applications, this was followed by an introduction to “Statistics in Cancer Genomics” by Jason Carroll and Rory Stark (Cancer Research UK Cambridge Institute, University of Cambridge), particularly focusing on breast and prostate cancer and the process of targeting protein binding sites.  The latter, as a computational biologist, mentioned challenges on the translation of Statistics for the biologists and vice versa, and identified ”reproducibility as (the) most critical statistical challenge” in the area – also on the topic of reproducibility, it is especially worth watching “Statistics, Big Data and the Reproducibility of Published Research” by Emmanuel Candes (Stanford University) -.

Another area increasingly gaining attention in parallel with technologies improvements is Neuroimaging. Thomas Nichols (University of Warwick) lead the audience in a trip through classical examples of its application (from Phineas Gage´s accident to observational studies of the structural changes in the hippocampi of taxi drivers, and the brain´s reaction to politics) up to current exciting times, with great “Opportunities in Functional Data Analysis” which Jane-Ling Wang (University of California) promoted in her talk.

Also in the area of Health Statistics, different approaches to dietary patterns modelling were shown in “The Analysis of Dietary Patterns and its Application to Dietary Surveillance and Epidemiology” by Raymond J. Carroll (Texas A&M University), and Sue Krebs-Smith (NCI), with challenges in the area being; finding dietary patterns over the life course – e.g. special waves of data would appear in periods like pregnancy and menarchy for women-, and incorporation of technology to the studies as a new form of data collection –e.g., pictures of food connected to databases-.

Again with a focus on the challenges posed by technology, three talks on environmental Statistics outlined an evolution over time of the field. Starting from current times, Marian Scott (University of Glasgow) in “Environment – Present and Future” stated in her talk that “natural variability is more and more apparent” because we are now able to visualise it. This idea going back to the heterogeneity claimed by Bühlmann. Despite the great amounts of data being available, and initially caused by technological improvements but ultimately due to a public demand – “people are wanting what it is happening now”-, the future of the field is still subject to the basic questions: “how and where to sample.” Especially thought-provoking were Scott´s words on “the importance of people in the environmental arena” and the need for effective communication: “we have to communicate: socio-economic, hard science…, it all needs to be there because it all matters, it applies to all the world…”

Amongst other aims of this environmental focus such as living sustainably – both in urban and rural environments-, climate is a major worry/fascination which is being targeted through successful collaborations of specialists in its study and statisticians. “Climate: Past to Future” by Gabi Hegerl (University of Edinburgh) and Douglas Nychka (NCAR) covered “Regional Climate – Past, Present and Future” and showed relevant examples of these successful collaborations.

Susan Waldron (University of Glasgow), under the captivating speech title of “A Birefringent Future”, highlighted the need to address challenges in the communication of “multiple layers of information” too, Statistics appearing as the solution for doing this effectively –e.g. by creating “accessible visualisation of data” and by incorporating additional environmental variation in controversial green energy studies such as whether wind turbines create a microclimate-. Mark Hansen (Columbia University Journalism School) seconded Waldron´s arguments and called for journalists to play an important role in the challenge: “data are changing systems of power in our world and journalists. […] have to be able to think both critically as well as creatively with data”, so as to “provide much needed perspective”. Examples of this role are; terms such as “computational Journalism” already being coined and data visualisation tools being put in place -as a fascinating example, the New York Times´ Cascade tool builds a very attractive representation of the information flow in Social Media-. David Spiegelhalter (Cambridge University) also dealt with the topic in the great talk “Statistics for an Informed Public”, providing relevant examples (further explanation on the red meat example can be found here as well).

To encourage “elicitation of experts opinions (to reach a) rational consensus” and “to spend more time with regulators” came up as other useful recommendations in the presentation “Statistics, Risk and Regulation” by Michel Dacorogna (SCOR) and Paul Embrechts (ETH Zürich).

In a highly connected world inmersed in a data revolution, privacy becomes another major issue. Statistical challenges arising from networked data –e.g. historical interaction information modelling- were addressed by Steve Fienberg (Carnegie Mellon University) and Cynthia Dwork (Microsoft Research), who argued that “statisticians of the future will approach data and privacy (and its loss) in a fundamentally algorithmic fashion”, in doing so explicitly answering the quote by Sweeney: “Computer science got us into this mess, can computer science get us out of it?”. Michael I. Jordan (University of California-Berkeley) in “On the Computation/Statistics Interface and “Big Data”” also referred to “bring(ing)  algorithmic principles more fully into contact with statistical inference“.

I would like to make a final mention to one of the questions posed by the audience during the event. When Paul Embrechts was enquired about the differences between Econometrics and Statistics, a discussion  followed on how crossfertilisation between fields has happened back and forth. As a direct consequence of this contact, fields rediscover issues from other areas. For instance, Credit Risk Analysis models were mentioned as being inferred from Medical Statistics or back in Peter Bühlmann´s talk, links were also found between Behavioural Economics and Genetics. These ideas, from my point of view, bring together the essence of Statistics, i.e. its application and interaction with multiple disciplines as the foundations of its success.

Many other fascinating topics were conveyed in the workshop but unfortunately, I cannot fit here mentions to all the talks. I am sure you will all agree it was a fantastic event.


FreshBiostats birthday and September-born famous statisticians

With the occassion of the 1st birthday of FreshBiostats, we want to remember some of the great statisticians born in September and that have contributed to the “joy of (bio)stats”.

Gerolamo Cardano Pavia, 24 September 1501 – 21 September 1576 First systematic treatment of probability
Caspar Neumann Breslau, 14 September 1648 – 27 January 1715 First mortality rates table
Johann Peter Süssmilch Zehlendorf, 3 September 1707 – 22 March 1767 Demographic data and socio-economic analysis  
Georges Louis Leclerc (Buffon) Montbard, 7 September 1707 – Paris, 16 April 1788 Premier example in “geometric probability” and a body of experimental and theoretical work in demography
Adrien-Marie Legendre Paris, 18 September 1752 – 10 January 1833 Development of the least squares method
William Playfair Liff, 22 September 1759 – London, 11 February 1823 Considered the founder of graphical methods of statistics (line graph, bar chart, pie chart, and circle graph)
William StanleyJevons Liverpool, 1 September 1835 – Hastings,13 August 1882 Statistical atlas – graphical representations of time series
Anders Nicolai Kiaer Drammen, 15 September 1838 – Oslo, 16 April 1919 Representative sample
Charles Edward Spearman London, 10 September 1863 – 17 September 1945 Pioneer of factor analysis and Spearman´s Rank correlation coefficient
Anderson Gray McKendrick Edinburgh, September 8, 1876 – May 30, 1943 Several discoveries in stochastic processes and collaborator in the path-breaking work on the deterministic model for the general epidemic
Maurice Fréchet Maligny, 2 September 1878 – Paris, 4 June 1973 Contributions in econometrics and spatial statistics
Paul Lévy 15 September 1886 – 15 December 1971 Several contributions to probability theory
Frank Wilcoxon County Cork, 2 September 1892 – Tallahassee, 18 November 1965 Wilcoxon rank-sum tests, Wilcoxon signed-rank test
Mikhailo Pylypovych Kravchuk Chovnytsia, 27 September 1892- Magadan, 9 March 1942 Krawtchouk polynomials, a system of polynomials orthonormal with respect to the binomial distribution
Harald Cramér Stockholm, 25 September 1893 – 5 October 1985 Important statistical contributions to the distribution of primes and twin primes
Hilda Geiringer Vienna, 28 September 1893 – California, 22 March 1973 One of the pioneers of disciplines such as molecular genetics, genomics, bioinformatics,…
Harold Hotelling Fulda, 29 September 1895 – Chapel Hill, 26 December 1973 Hotelling´s T-squared distribution and canonical correlation
David van Dantzig Rotterdam, 23 September 1900 -Amsterdam,  22 July 1959 Focus on probability, emphasizing the applicability to hypothesis testing
Maurice Kendall
Kettering, 6 September 1907 – London, 29 March 1983 Random number generation and Kendall´s tau
Pao-Lu Hsu Peking, 1 September 1910 – Peking, 18 December 1970 Founder of the newly formed discipline of statistics and probability in China

It is certainly difficult to think of the field without their contributions. They are all a great inspiration to keep on learning and working!!

Note: you can find other interesting dates here.

Update: and Significance´s timeline of statistics here.

Any author you consider particularly relevant? Any other suggestions?