As a survey methodologist, I get paid to develop survey methods that generally minimize survey errors, and to advise people on how to field surveys in a specific setting. A question that has been bugging me for a long time is which survey error we should worry about most. The Total Survey Error (TSE) framework is very helpful for thinking about which types of survey error may affect survey estimates.
But which error source is generally larger? Nonresponse or measurement errors?
Thankfully, no one has ever asked me this question yet, because I would find it impossible to answer anything other than "well, that depends".
The reason we don't know which error source is larger is that we can usually assess observational errors only for the people we have actually observed. There are several ways to do this. Sometimes we know the truth, and so we can compare survey answers ("do you have a valid driver's license?") to data we know from administrative records. If we are interested in attitudes, we can use psychometric models. The people behind the computer programme SQP have summarised a huge number of question experiments and MTMM models to predict the quality of a specific survey question. By asking different forms of the same question (e.g. "how interested are you in politics?") we can gauge the reliability and validity of that question under different question wordings and answer scales.
The problem of course is that if we are indeed interested in the concept "interest in politics", we would ideally also like to know what people who we have not observed would have answered. In order to estimate errors of non-observation (nonresponse), we would need to actually observe these people!
There are of course some situations where we actually do know something about nonrespondents. Cannell and Fowler (1963) are an early example: they knew something about nonrespondents (hospital visits) and could compare different respondent and nonrespondent groups. A more recent great example is Kreuter, Muller and Trappmann (2010). They did a survey among people for whom they already knew their employment status. They showed that nonresponse and measurement error in employment status were of about equal size, and went in opposite directions.
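The logic of such a validation study can be illustrated with a toy simulation. This is not the Kreuter et al. design or data; all the numbers (employment rate, response propensities, misreporting rates) are invented purely to show how, once the true value is known for everyone, total error splits into a nonresponse and a measurement component:

```python
import random

random.seed(42)

# Toy population: true employment status is known for everyone,
# as in a register-based validation study. All rates are invented.
N = 100_000
population = [random.random() < 0.60 for _ in range(N)]  # 60% truly employed
true_rate = sum(population) / N

# Nonresponse: assume the employed respond somewhat more often.
respondents = [emp for emp in population
               if random.random() < (0.55 if emp else 0.45)]

# Nonresponse error: mean of the TRUE values among respondents
# minus the population mean (no measurement involved yet).
resp_true_rate = sum(respondents) / len(respondents)
nonresponse_error = resp_true_rate - true_rate

# Measurement: assume some employed respondents misreport as not
# employed (10%), and a few non-employed misreport as employed (2%).
def report(emp):
    if emp:
        return random.random() < 0.90
    return random.random() < 0.02

answers = [report(emp) for emp in respondents]
survey_rate = sum(answers) / len(answers)
measurement_error = survey_rate - resp_true_rate

total_error = survey_rate - true_rate
print(f"true rate         : {true_rate:.3f}")
print(f"nonresponse error : {nonresponse_error:+.3f}")
print(f"measurement error : {measurement_error:+.3f}")
print(f"total error       : {total_error:+.3f}")
```

With these made-up rates, nonresponse pushes the estimate up while misreporting pushes it down, so the two errors partly cancel in the total, which is exactly why looking at total error alone can hide two sizeable error sources.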
There are several other studies, among students or in the context of mixed-mode designs, that have looked at factual questions and estimated both measurement and nonresponse error in the same study. So, what do we learn? From my reading of the literature, there is no clear pattern in the findings. Sometimes measurement errors are larger, sometimes nonresponse is larger. And sometimes these survey errors go in the same direction, sometimes in opposite directions. A further problem is that these validation studies use factual questions, not the attitudinal questions that surveys are more often interested in. In conclusion, that means that:
1. For factual questions, it is not clear whether nonresponse or measurement errors are the larger problem. There is large variation across studies.
2. Because the measurement quality of attitudinal questions is generally lower than that of factual questions, measurement errors may pose a relatively larger problem than nonresponse in attitudinal questions.
3. BUT, we then have to assume that nonresponse bias is generally the same for attitudinal and factual questions, which may not be true. Stoop (2005) and others have shown that if you are interested in measuring "interest in politics", late and hard-to-reach respondents are very different from early and easy respondents.
So, what to do? How do we make progress, so that I can at some point give an answer to the question of which error source we should worry about most?
1. We could find studies with a very high response rate (100% ideally) and study the differences between the early, easy respondents and the late, hard-to-reach ones, like Stoop did.
2. We should do more validation studies for factual questions, which should become more feasible as more and more register data become available.
3. And, we should try to link MTMM studies and other psychometric models to nonresponse models. I recently did a study that did this for a panel study, but what is really needed is work in cross-sectional studies.