CS&SS/STAT 564

Bayesian Statistics for the Social Sciences

University of Washington, Spring 2018

Jeffrey B. Arnold
Instructor

Assistant Professor, Political Science; Core Faculty, CSSS
  • https://jrnold.com
  • jrnold@uw.edu
  • Smith 221B, Th 12–1 pm, 4–5 pm

Connor Gilroy
Teaching Assistant

Graduate Student, Sociology
  • https://soc.washington.edu/people/connor-gilroy
  • cgilroy@uw.edu
  • Savery 216B, Tu 9:30–11:30 am

Class Meetings

Tuesday 02:30–03:50 PM Savery 130
Lab Thursday 01:30–02:20 PM Savery 139
Thursday 02:30–03:50 PM Savery 130

Overview

This course covers statistical methods based on the idea of probability as a measure of uncertainty. Topics include the subjective notion of probability, Bayes’ Theorem, prior and posterior distributions, and data analysis techniques for statistical models.

In this course students will learn to

  • compare and contrast frequentist and Bayesian inferential approaches,
  • describe various methods to derive or approximate posterior distributions,
  • write and estimate Bayesian models with the Stan and R programming languages,
  • implement and estimate univariate, regression, hierarchical, and measurement models using Stan and the R programming language,
  • evaluate and compare Bayesian models,
  • diagnose problems in Bayesian sampling methods, and
  • apply Bayesian methods to their own research problem.

Subject to time constraints, I expect to cover the following topics.

  • Bayes’ Theorem
  • Conjugate posterior distributions
  • Bayesian Inference
  • Bayesian computational methods
  • Introduction to Stan
  • Regression Models
  • Diagnosing sampling issues
  • Comparing models
  • Shrinkage and sparsity in Bayesian Models
  • Hierarchical Models
  • Measurement models

Prerequisites

Students should have completed the introductory quantitative methods sequence appropriate to their programs. They should be familiar with statistical inference, regression methods, and maximum likelihood. SOC 504, SOC 505, and SOC 506, or the equivalent, will suffice.

It will be useful, but not required, for students to be familiar with the R programming language or to have the programming background to learn it quickly. This is a computationally intensive course, and some level of programming proficiency prior to taking it is recommended. To learn R, I suggest R for Data Science or DataCamp. Note that I will often use the tidyverse packages (including ggplot2 and dplyr).

Assignments

There are two main types of assignments for students:

  1. (Bi-Weekly) homework: Learning quantitative methods requires practice. As such, there will be homework assignments approximately every two weeks. See the assignments page for more information.

  2. Research project: The final assignment will ask students to use Bayesian methods to estimate a model to answer a research question of their own.

Materials

Computational Tools

Students should have a laptop that they can bring to both class and lab, as we will integrate computing with learning data analysis and statistics throughout the course.

This course will use R, which is a free and open-source programming language primarily used for statistics and data analysis. We will also use RStudio, which is an easy-to-use interface to R.

We will use git and GitHub for distributing, collecting, and commenting on assignments and projects. See the onboarding instructions for setting up git and GitHub. See the git and GitHub page for links to introductions and references on both tools.

Books

The primary text for this course is:

Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd edition. Taylor & Francis Ltd.

It is important that you get the 3rd edition and not a previous edition because it has been updated in ways relevant to this course.

This will be supplemented with other readings and notes as indicated on the schedule.

Evaluation

Students will be evaluated on the following.

  1. Assignments

    • initial submission
    • peer review
    • correction
  2. Final Project

  3. Participation. Students should contribute to the intellectual quality of the course by asking and answering questions on our Slack channel.

Communication

For questions regarding the content of the course, ask and answer them on our Slack channel. If you have a question about a topic, it is likely that someone else has the same question. Posting questions and answers publicly allows us all to learn from them.

Reserve emails to the instructors for personal matters.

Errata

Changes

A summary of changes to the syllabus and schedule is posted in the CHANGELOG.

Resources

Beyond what the teaching team can provide, there are several resources on campus that you can go to for assistance with data, computing, and statistical problems:

  • The Center for Social Science Computing and Research (CSSCR) has a drop-in statistical consulting center in Savery 119. They provide consulting on statistical software, e.g. R. Go there for software- or data-related questions.
  • CSSS Statistical Consulting provides general statistical consulting. Go there for questions about statistical methods.
  • eScience Data Science Office Hours

License

Science should be open, and this course builds on other openly licensed material, so unless otherwise noted, all materials for this class are licensed under a Creative Commons Attribution 4.0 International License.

Bugs

If you find any typos or other issues on this page, or any other page on the site, go to issues, click on the “New Issue” button to create a new issue, and describe the problem.