set.seed(123)
A series of \(n\) trials with \(y\) successes. We’re estimating the probability of success, \(\theta\).
n <- 10
y <- 5
Using a uniform prior. Adapted from Richard McElreath, Rethinking Statistics.
grid_size <- 100
p_grid <- seq(from = 0, to = 1, length.out = grid_size)
prior <- rep(1, grid_size)
likelihood <- dbinom(x = y, size = n, prob = p_grid)
unstd.posterior <- likelihood * prior
posterior <- unstd.posterior / sum(unstd.posterior)
plot(p_grid, posterior, type = "b")
Writing the prior as a beta distribution, because the beta distribution is the conjugate prior distribution for the binomial likelihood.
If \(\alpha = \beta = 1\), the beta distribution is uniform between 0 and 1.
a <- 1
b <- 1
apost <- y + 1
bpost <- n - y + 1
plot(p_grid, dbeta(p_grid, apost, bpost))
nsims <- 10000
posterior_samples <- rbeta(nsims, apost, bpost)
mean(posterior_samples)
## [1] 0.4996092
median(posterior_samples)
## [1] 0.4980506
library("rstan")
## Loading required package: ggplot2
## Loading required package: StanHeaders
## rstan (Version 2.17.3, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
binomial_model <- stan_model("binomial.stan")
binomial_model
## S4 class stanmodel 'binomial' coded as follows:
## data {
## int n;
## int y;
## }
## parameters {
## real<lower=0, upper=1> theta;
## }
## model {
## y ~ binomial(n, theta);
## theta ~ beta(1, 1);
## }
binomial_fit <- sampling(binomial_model, data = list(y = y, n = n))
You can combine stan_model()
and sampling()
into one step by just running stan()
.
binomial_fit
## Inference for Stan model: binomial.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## theta 0.50 0.00 0.14 0.23 0.40 0.50 0.60 0.77 1528 1
## lp__ -8.87 0.02 0.78 -11.01 -9.03 -8.58 -8.38 -8.32 1922 1
##
## Samples were drawn using NUTS(diag_e) at Thu Apr 12 00:38:17 2018.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
plot(binomial_fit)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)
For more plotting methods, we can use the bayesplot
package, which we will discuss in more detail later.
library("bayesplot")
## This is bayesplot version 1.5.0
## - Plotting theme set to bayesplot::theme_default()
## - Online documentation at mc-stan.org/bayesplot
mcmc_dens(as.array(binomial_fit), pars = "theta") + xlim(0, 1)
## Scale for 'x' is already present. Adding another scale for 'x', which
## will replace the existing scale.