set.seed(123)

Data

A series of \(n\) trials with \(y\) successes. We’re estimating the probability of success, \(\theta\).

n <- 10
y <- 5

Grid approximation

Using a uniform prior. Adapted from Richard McElreath, Rethinking Statistics.

grid_size <- 100
p_grid <- seq(from = 0, to = 1, length.out = grid_size)
prior <- rep(1, grid_size)
likelihood <- dbinom(x = y, size = n, prob = p_grid)
unstd.posterior <- likelihood * prior
posterior <- unstd.posterior / sum(unstd.posterior)

plot(p_grid, posterior, type = "b")

Analytic solution

Writing the prior as a beta distribution, because the beta distribution is the conjugate prior distribution for the binomial likelihood.

If \(\alpha = \beta = 1\), the beta distribution is uniform between 0 and 1.

a <- 1
b <- 1

apost <- y + 1
bpost <- n - y + 1

plot(p_grid, dbeta(p_grid, apost, bpost))

nsims <- 10000
posterior_samples <- rbeta(nsims, apost, bpost)
mean(posterior_samples)

## [1] 0.4996092

median(posterior_samples)

## [1] 0.4980506

Sampling using Stan

library("rstan")

## Loading required package: ggplot2

## Loading required package: StanHeaders

## rstan (Version 2.17.3, GitRev: 2e1f913d3ca3)

## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)

options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

binomial_model <- stan_model("binomial.stan")

binomial_model

## S4 class stanmodel 'binomial' coded as follows:
## data {
##   int n;
##   int y;
## }
## parameters {
##   real<lower=0, upper=1> theta;
## }
## model {
##   y ~ binomial(n, theta);
##   theta ~ beta(1, 1);
## }

binomial_fit <- sampling(binomial_model, data = list(y = y, n = n))

You can combine stan_model() and sampling() into one step by just running stan().

binomial_fit

## Inference for Stan model: binomial.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##        mean se_mean   sd   2.5%   25%   50%   75% 97.5% n_eff Rhat
## theta  0.50    0.00 0.14   0.23  0.40  0.50  0.60  0.77  1528    1
## lp__  -8.87    0.02 0.78 -11.01 -9.03 -8.58 -8.38 -8.32  1922    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Apr 12 00:38:17 2018.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(binomial_fit)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

For more plotting methods, we can use the bayesplot package, which we will discuss in more detail later.

library("bayesplot")

## This is bayesplot version 1.5.0

## - Plotting theme set to bayesplot::theme_default()

## - Online documentation at mc-stan.org/bayesplot

mcmc_dens(as.array(binomial_fit), pars = "theta") + xlim(0, 1)

## Scale for 'x' is already present. Adding another scale for 'x', which
## will replace the existing scale.

Estimating the posterior with a binomial likelihood

Jeffrey Arnold

Connor Gilroy

2018-04-10

Data

Grid approximation

Analytic solution

Sampling using Stan