This paper studies questionnaire design as a formal decision problem, focusing on one element of the design process: skip sequencing. We propose that a survey planner use an explicit loss function to quantify the trade-off between cost and informativeness of the survey and aim to make a design choice that minimizes loss. We pose a choice among three options: ask all respondents about an item of interest; use skip sequencing, thereby asking the item only of respondents who give a certain answer to an opening question; or do not ask the item at all. The first option is most informative but also most costly. The use of skip sequencing reduces respondent burden and the cost of interviewing, but may spread data quality problems across survey items, thereby reducing informativeness. The last option has no cost but is completely uninformative about the item of interest. We show how the planner may choose among these three options in the presence of two inferential problems, item nonresponse and response error.

%B Annals of Applied Statistics %I 2 %V 2 %P 264-285 %8 2008 Mar 01 %G eng %N 1 %2 PMC2858349 %4 Survey Design/Nonresponse/Response Error %$ 25300 %R 10.1214/07-aoas134 %0 Report %D 2005 %T Identification of Probability Distributions with Misclassified Data %A Molinari, Francesca %K Methodology %X This paper addresses the problem of data errors in discrete variables. When data errors occur, the observed variable is a misclassified version of the variable of interest, whose distribution is not identified. For many years econometricians have conceptualized the problem through convolution and mixture models. This paper introduces the direct misclassification approach. The approach is based on the observation that in the presence of classification errors, the relation between the distribution of the true but unobservable variable and its misclassified representation is given by a linear system of simultaneous equations, in which the coefficient matrix is the matrix of misclassification probabilities. Formalizing the problem in these terms allows one to incorporate any prior information - e.g., validation studies, economic theory, social and cognitive psychology, or knowledge of the circumstances under which the data have been collected - into the analysis through sets of restrictions on the matrix of misclassification probabilities. Such information can have strong identifying power; the direct misclassification approach fully exploits it to derive identification regions for any real functional of the distribution of interest. The method readily extends to drawing inference on parameters of the conditional distribution of an outcome variable when the conditioning variable is misclassified. It is easy to implement and often computationally tractable. A method for estimating the identification regions is given, and illustrated with an empirical analysis of the distribution of pension plan types using data from the Health and Retirement Study.
%B CAE Working Paper %I Cornell University %C Ithaca, NY %G eng %U http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.329.7446&rep=rep1&type=pdf %4 Misclassification/Identification Regions/Direct Misclassification Approach %$ 10922