# Introduction to Bayesian statistics

After starting the year with a post about the International Year of Statistics, we present this week our first post about Bayesian statistics. This post has been made jointly by Hèctor Perpiñán and Silvia Lladosa.

Any researcher (particularly those working in the field of statistics) is aware of the two main approaches to this science: Frequentist (or Classical) statistics and Bayesian statistics. The main difference between them is a distinct interpretation of what probability means, and thus a different way of making inference.

The term Bayesian refers to Bayes' theorem, originally formulated by the Reverend Thomas Bayes (1702–1761) in one of his last papers, “An Essay towards solving a Problem in the Doctrine of Chances”, published in 1763 (note that this year marks its 250th anniversary).

In this post we focus on introducing the basic aspects that characterize the Bayesian framework. First of all, we give a brief and simple statement of the central idea of Bayesian statistics: it quantifies and combines all the uncertainty in the problem (data, parameters, etc.) in probabilistic terms. Probability is therefore understood as a degree of belief.

The basic procedure of Bayesian methodology involves:

• Assigning an initial (prior) probability distribution, $\pi(\theta)$, to the model parameters ($\theta$), which quantifies all the relevant information about them. This distribution must be chosen before seeing the data (it must not in any way be conditioned on them).

Bayesian statistics has often been criticized because the interpretation of the prior distribution in terms of ‘beliefs’ seems subjective. But this is far from the whole picture: you can choose between different kinds of priors, subjective ones (used when you have some information about the parameters) and objective ones (for situations where there is no information about them).

Although it has not been explicitly mentioned above, the fact that we can express our beliefs about the parameters by means of a probability density function is a consequence of treating the parameters as random variables. This is one of the biggest differences with respect to classical statistics, which treats parameters as fixed but unknown quantities.
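To make the subjective/objective distinction concrete, here is a minimal sketch (using `scipy.stats`, and a hypothetical coin-flip experiment where $\theta$ is the probability of heads) comparing a uniform, non-informative prior with an informative one:

```python
# Two possible priors for a coin's head probability theta (hypothetical example).
from scipy.stats import beta

objective_prior = beta(1, 1)   # Beta(1, 1) is uniform on (0, 1): no prior information
subjective_prior = beta(5, 5)  # informative: encodes a belief that theta is near 0.5

# Prior probability that theta exceeds 0.7 under each choice of prior
print(round(1 - objective_prior.cdf(0.7), 3))
print(round(1 - subjective_prior.cdf(0.7), 3))
```

Both priors are valid starting points; the informative one simply assigns much less probability to extreme values of $\theta$.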

• Choosing a probabilistic model that relates the random variables and the model parameters associated with the experiment. This allows us to express the information provided by the data, given the parameters, in probabilistic terms by using the likelihood function, $p(y|\theta)$.
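Continuing the hypothetical coin-flip example, the likelihood $p(y|\theta)$ could be a binomial model. The sketch below (numbers are invented for illustration) evaluates the likelihood of the observed data at a few values of $\theta$:

```python
# Binomial likelihood p(y | theta) for hypothetical data: y = 7 heads in n = 10 flips.
from scipy.stats import binom

y, n = 7, 10
for theta in (0.3, 0.5, 0.7):
    # the likelihood tells us how plausible each theta makes the observed data
    print(theta, round(binom.pmf(y, n, theta), 4))
```

As expected, $\theta = 0.7$ makes the observed 7 heads out of 10 most plausible among the three values tried.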

The last step in this procedure is to apply Bayes' theorem, combining the prior knowledge and the new information to obtain the posterior probability distribution, $\pi(\theta|y)$, of $\theta$:

$\pi(\theta|y)=\frac{p(y|\theta)\pi(\theta)}{\pi(y)}\propto p(y|\theta)\pi(\theta)$.

The posterior distribution is obtained by updating the prior according to the data, i.e. the prior probability is transformed into the posterior by the new evidence provided by the data. We can say that “Today's posterior is tomorrow's prior”. This final distribution allows us to calculate point estimates of the parameters, credible interval estimates, predictions, etc.
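The whole procedure can be sketched end to end for the hypothetical coin-flip example (a Beta prior with a binomial likelihood is a conjugate pair, so the posterior is available in closed form; the data are invented for illustration):

```python
# Full Bayesian update for a coin-flip experiment (hypothetical data).
from scipy.stats import beta

a, b = 1, 1    # Beta(1, 1) prior: uniform over theta
y, n = 7, 10   # observed data: 7 heads in 10 flips

# Conjugacy: Beta prior + binomial likelihood gives a Beta(a + y, b + n - y) posterior
posterior = beta(a + y, b + n - y)

print(round(posterior.mean(), 3))                        # point estimate (posterior mean)
print([round(q, 3) for q in posterior.interval(0.95)])   # 95% credible interval for theta
```

Note also how “Today's posterior is tomorrow's prior” works here: if more flips arrive later, the current `Beta(a + y, b + n - y)` simply plays the role of the prior in the next update.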

After this brief introduction to Bayesian methodology, we will continue in our next posts with: prior distributions, Bayesian hierarchical models, WinBUGS and much more.

We hope your fears are fading and that you start to use this powerful paradigm. Because, as a professor once told us: “Bayesian statistics is a way of life”.