Two models of inference: design-based vs. model-based

Introduction

The majority of studies in the social, behavioral and medical sciences use some form of survey data. Often, a survey based on a small sample is used to say something about a larger population. This step is called inference. Many of the advances in 20th-century statistics centered on the development of valid inference procedures. For example, p-values and confidence intervals are designed to reflect the uncertainty that comes with using a small sample to say something about the larger population. This model of inference is called design-based inference. Crucial to design-based inference is the process of drawing a random sample from the population in a controlled way. A second model of inference, model-based inference, does not require a neat random sample, but uses ‘found’ data and relies on statistical modeling to describe the relation between the sample and the population. This week, we will use the example of the 2016 U.S. Presidential Election to illustrate why there is a renewed 21st-century battle between the two paradigms of inference.
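To make the contrast concrete, here is a minimal simulation sketch. Everything in it is a hypothetical illustration and not taken from the course materials: the two-group population, the support rates, the assumed 5-to-1 over-representation of one group in the 'found' data, and the function names. It compares a design-based estimate from a simple random sample (with a normal-approximation 95% confidence interval) against a naive and a post-stratified estimate from a self-selected sample. Post-stratification is used here as one simple example of model-based adjustment, and it assumes the population group shares are known.

```python
import math
import random


def make_population(n=100_000, seed=1):
    """Hypothetical population: two age groups with different support rates."""
    rng = random.Random(seed)
    pop = []
    for _ in range(n):
        group = "young" if rng.random() < 0.4 else "old"
        p_yes = 0.7 if group == "young" else 0.4  # assumed support rates
        pop.append((group, 1 if rng.random() < p_yes else 0))
    return pop


def design_based(pop, n, rng):
    """Simple random sample: the estimate and its 95% CI rest on the sampling design."""
    sample = rng.sample(pop, n)
    p = sum(y for _, y in sample) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, (p - 1.96 * se, p + 1.96 * se)


def found_sample(pop, n, rng):
    """'Found' data: young people are assumed 5x more likely to end up in the data."""
    weights = [5 if g == "young" else 1 for g, _ in pop]
    return rng.choices(pop, weights=weights, k=n)


def poststratify(sample, pop_shares):
    """Model-based correction: reweight group means to known population shares."""
    est = 0.0
    for group, share in pop_shares.items():
        ys = [y for g, y in sample if g == group]
        est += share * sum(ys) / len(ys)
    return est


pop = make_population()
rng = random.Random(2)

# True population proportion and group shares (treated as known, e.g. from a census).
truth = sum(y for _, y in pop) / len(pop)
shares = {g: sum(1 for gg, _ in pop if gg == g) / len(pop) for g in ("young", "old")}

srs_est, srs_ci = design_based(pop, 1000, rng)
found = found_sample(pop, 1000, rng)
naive_est = sum(y for _, y in found) / len(found)
ps_est = poststratify(found, shares)

print(f"truth:                 {truth:.3f}")
print(f"random sample:         {srs_est:.3f}  CI=({srs_ci[0]:.3f}, {srs_ci[1]:.3f})")
print(f"found data, naive:     {naive_est:.3f}")
print(f"found data, adjusted:  {ps_est:.3f}")
```

In this sketch the adjustment works only because the model (support depends on age group alone) matches how the data were generated. If selection into the found data also depended on something the model omits, the adjusted estimate would remain biased, which is the core of the debate between the two paradigms.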

Literature

Optional:

  • Blumenthal, M., Clement, S., Clinton, J. D., Durand, C., Franklin, C., Miringoff, L., … & Witt, G. E. (2017).

Slides

Exercises

Class exercise

Take home exercise
