Class Meetings
Class | Tuesday | 02:30–03:50 PM | Savery 130
Lab | Thursday | 01:30–02:20 PM | Savery 139
Class | Thursday | 02:30–03:50 PM | Savery 130
Overview
Statistical methods based on the idea of probability as a measure of uncertainty. Topics covered include the subjective notion of probability, Bayes’ Theorem, prior and posterior distributions, and data analysis techniques for statistical models.
In this course students will learn to
- compare and contrast frequentist and Bayesian inferential approaches,
- describe various methods to derive or approximate posterior distributions,
- write and estimate Bayesian models with the Stan and R programming languages,
- implement and estimate univariate, regression, hierarchical, and measurement models using Stan and the R programming language,
- evaluate and compare Bayesian models,
- diagnose problems in Bayesian sampling methods, and
- apply Bayesian methods to their own research problem.
Subject to time constraints, I expect to cover the following topics.
- Bayes’ Theorem
- Conjugate posterior distributions
- Bayesian Inference
- Bayesian computational methods
- Introduction to Stan
- Regression Models
- Diagnosing sampling issues
- Comparing models
- Shrinkage and sparsity in Bayesian Models
- Hierarchical Models
- Measurement models
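As a small illustration of the first two topics, the sketch below applies Bayes’ Theorem in the conjugate case of a Beta prior with binomial data. It is written in Python for illustration only (the course itself uses R and Stan), and the function names and the data are hypothetical, not taken from the course materials.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability and a binomial
    likelihood, Bayes' Theorem gives a
    Beta(alpha + successes, beta + failures) posterior.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical data: 7 successes and 3 failures, with a uniform Beta(1, 1) prior.
prior = (1, 1)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)        # (8, 4)
print("prior mean:", beta_mean(*prior))          # 1/2
print("posterior mean:", beta_mean(*posterior))  # 2/3
```

The posterior mean of 2/3 sits between the prior mean of 1/2 and the observed proportion of 7/10, which is the kind of compromise between prior and data that the later topics on shrinkage and hierarchical models build on.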
Prerequisites
Students should have completed the introductory quantitative methods sequence appropriate to their programs. They should be familiar with statistical inference, regression methods, and maximum likelihood. Courses SOC 504, SOC 505, SOC 506 or the equivalent will suffice.
It will be useful, but not required, for students to have a familiarity with the R programming language, or the programming background to learn it quickly. This is a computationally intensive and focused course, and some level of proficiency in programming prior to taking it will help. To learn R, I suggest R for Data Science or DataCamp. Note that I will often use the tidyverse packages (including ggplot2 and dplyr).
Assignments
There are two main types of assignments for students:
(Bi-Weekly) homeworks: Learning quantitative methods requires practice. As such, there will be approximately bi-weekly homework assignments. See the assignments page for more information.
Research project: The final assignment will ask students to use Bayesian methods to estimate a model to answer a research question of their own.
Materials
Computational Tools
Students should have a laptop that they can bring to both class and lab as we will integrate computing with learning data analysis and statistics throughout the course.
This course will use R, which is a free and open-source programming language primarily used for statistics and data analysis. We will also use RStudio, which is an easy-to-use interface to R.
We will use git and GitHub for distributing, collecting, and commenting on assignments and projects. See the onboarding instructions for setting up git and GitHub. See the git and GitHub page for links to helpful introductions and references on both tools.
Books
The primary text for this course is:
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, and Aki Vehtari. 2013. 3rd edition. Bayesian Data Analysis. Taylor & Francis Ltd.
It is important that you get the 3rd edition and not previous editions because it has been updated in ways relevant to this course.
This will be supplemented with other readings and notes as indicated on the schedule.
Evaluation
Students will be evaluated on the following.
Assignments
- initial submission
- peer review
- correction
Final Project
Peer Review. Students should contribute to the intellectual quality of the course by asking and answering questions on our Slack channel.
Communication
For questions regarding the content of the course, ask and answer them on our Slack channel. If you have a question about the topic, it is likely that someone else had the same question. Posting questions and answers publicly allows us all to learn from each of these questions and answers.
Reserve emails to the instructors for personal matters.
Errata
Changes
A summary of changes to the syllabus and schedule is posted in the CHANGELOG.
Resources
Beyond what the teaching team can provide, there are several resources on campus that you can go to for assistance with data, computing, and statistical problems:
- Center for Social Science Computing and Research (CSSCR) has a drop-in statistical consulting center in Savery 119. They provide consulting on statistical software, e.g. R. Go there for software or data related questions.
- CSSS Statistical Consulting provides general statistical consulting. Go there for questions about statistical methods.
- eScience Data Science Office Hours
License
Science should be open, and this course builds upon other openly licensed material, so unless otherwise noted, all materials for this class are licensed under a Creative Commons Attribution 4.0 International License.
Bugs
If you find any typos or other issues on this page, or any other page on the site, go to issues, click on the “New Issue” button to create a new issue, and describe the problem.