Tuesday, January 29, 2013

Planned Missingness



I recently gave a talk at an internal seminar on planned missingness for a group of developmental psychologists. The idea behind planned missingness is that you can shorten interview time or reduce costs if you, as a researcher, decide not to administer all your instruments to everyone in your sample. When you randomly assign people to receive a particular instrument, the resulting missing data are Missing Completely At Random (MCAR). When you assign by design on something you observe (e.g. only collecting biomarkers in an at-risk group), they are Missing At Random (MAR). Both kinds of missing data can easily be 'solved' after the data collection, so you save costs on the one hand and still obtain a 'complete' dataset for answering your research questions on the other.
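To make the random-assignment case concrete, here is a minimal Python sketch. It is only an illustration under assumed numbers: the sample size, the simulated score distribution, the 50% share and the function name `assign_planned_missingness` are all hypothetical and not taken from any real study. It shows that when the instrument is withheld at random, the respondents who did receive it still resemble the full sample.

```python
import random
import statistics


def assign_planned_missingness(scores, share_measured, seed=0):
    """Randomly choose who receives the instrument; everyone else is missing (None)."""
    rng = random.Random(seed)
    n = len(scores)
    n_measured = round(n * share_measured)
    # Random assignment is what makes the resulting missingness MCAR.
    measured_ids = set(rng.sample(range(n), n_measured))
    return [s if i in measured_ids else None for i, s in enumerate(scores)]


# Hypothetical "true" scores that a full data collection would have observed.
rng = random.Random(42)
true_scores = [rng.gauss(50, 10) for _ in range(10_000)]

# Administer the instrument to a random half of the sample only (assumed share).
observed = assign_planned_missingness(true_scores, share_measured=0.5, seed=1)
observed_only = [s for s in observed if s is not None]

full_mean = statistics.mean(true_scores)
observed_mean = statistics.mean(observed_only)

print(f"respondents measured: {len(observed_only)} of {len(true_scores)}")
print(f"mean in full sample:  {full_mean:.2f}")
print(f"mean in measured half: {observed_mean:.2f}")
```

The two means should differ only by sampling noise. Under assignment by design (the MAR case) this simple comparison would no longer hold, and the variable that drove the assignment would have to be included in the analysis or imputation model.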

[Figure: discrete data collection with fixed intervals; continuous data collection with varying intervals]

Planned missingness has been implemented in some cross-sectional surveys. I think it has great potential for panel surveys too, if you randomly assign respondents to take part in an entire wave (or not). This can reduce costs, or increase statistical power when the saved costs are invested in more longitudinal measurements. If you stretch this idea, you end up with continuous data collection, where the interval between waves is randomized across respondents. A methodological setup like this allows all kinds of new research questions to be answered, for example about the existence and size of test effects (panel conditioning). Furthermore, models that include time (change, duration, Cox regression, survival models) can be estimated more precisely. And with the advent of Internet surveys, this is not difficult to implement. I don't know of any real panel study that has ever done this, however (please let me know if I'm wrong here).
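The step from fixed waves to randomized intervals can be sketched in a few lines of Python. Again, this is a hypothetical illustration: the number of waves, the 90-day fixed interval, the 30 to 150 day range and the function names are assumptions chosen for the example, not a description of an existing panel.

```python
import random


def fixed_schedule(n_waves, interval_days):
    """Interview days for a classic panel: every wave at the same fixed interval."""
    return [wave * interval_days for wave in range(n_waves)]


def randomized_schedule(n_waves, min_days, max_days, rng):
    """Interview days when each gap between waves is drawn at random per respondent."""
    days = [0]  # first interview on day 0
    for _ in range(n_waves - 1):
        days.append(days[-1] + rng.randint(min_days, max_days))
    return days


rng = random.Random(7)
n_respondents = 500  # assumed panel size

fixed = [fixed_schedule(4, 90) for _ in range(n_respondents)]
randomized = [randomized_schedule(4, 30, 150, rng) for _ in range(n_respondents)]

# Distinct days since the first interview at which anyone is measured.
distinct_fixed = {day for schedule in fixed for day in schedule}
distinct_random = {day for schedule in randomized for day in schedule}

print(f"fixed design:      {len(distinct_fixed)} distinct measurement days")
print(f"randomized design: {len(distinct_random)} distinct measurement days")
```

With fixed intervals, the sample is observed at only a handful of time points. With randomized intervals, the same number of interviews per respondent is spread over many different elapsed times, which is what gives models of change and duration more information about time.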

The developmental psychologists did not seem too enthusiastic about my ideas last week. Too bad. I would love to collaborate on a study that actually does this, to show that it works in practice if everything is logistically well organized. Slides can be found here

P.S. (04-Feb-13): One of the readers rightly pointed out that rotating panel surveys use this principle to 'refresh' their sample in every wave. Cohort studies also do this regularly. Even in such panel survey designs, though, measurements are taken at fixed time points, and the rotation itself usually revolves around either measuring or dropping entire cohorts at once. I think that even such designs can be improved by using the randomness inherent in planned missingness designs to make data collection more efficient and cost-effective.