Title Investigator Characteristics and Respondent Behavior in Online Surveys
Post date 08/13/2015
C1 Background and Explanation of Rationale

A long line of research in political science has shown that the responses of interviewees in face-to-face and telephone surveys can vary with the race or gender of the interviewer (Davis, 1997; Davis & Silver, 2003; Hatchett & Schuman, 1975; Cotter et al., 1982; Reese et al., 1986; Huddy et al., 1997). This variation means that the inferences researchers draw, and the replicability of a study, can depend on who runs the study. Regardless of whether the relationship between interviewer characteristics and the observed variation in responses is perceived as a benefit or a threat, it is useful for researchers to be cognizant of the circumstances in which this relationship is most likely to occur. For example, it is often suggested that responses in online surveys are less likely to be affected by attributes of the researcher than responses collected through other survey methods.

In this paper we explore how attributes of the researcher affect responses in online surveys. In particular, we use a survey experiment that explicitly manipulates the race and gender cued by the researcher name on the informed consent page, which the Institutional Review Boards (IRBs) of research universities generally require to be displayed at the start of every internet-based survey. Manipulating the researcher name allows us to test how information about the race and gender of the researcher conveyed through the informed consent page affects survey responses. We focus on gender and race because these two factors can be clearly conveyed through names (Bertrand & Mullainathan, 2004; Milkman et al., 2012) and have been central to the existing literature on surveys and identity. The experiment is a 2x2 factorial design, where the first factor is the putative gender of the investigator (male or female) and the second is the investigator’s putative race (white or black).
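As an illustration of the assignment mechanism, the sketch below draws each factor independently with equal probability and maps the resulting cell to an investigator name. The names are hypothetical placeholders in the style of Bertrand & Mullainathan (2004); the plan does not report the names actually used.

```python
import random

# Hypothetical placeholder names, one per cell of the 2x2 design;
# the actual study names are not given in the plan.
NAMES = {
    ("white", "male"): "Greg Walsh",
    ("white", "female"): "Emily Walsh",
    ("black", "male"): "Jamal Washington",
    ("black", "female"): "Lakisha Washington",
}

def assign_investigator(rng: random.Random) -> dict:
    """Independently randomize putative race and gender with equal probability."""
    race = rng.choice(["white", "black"])
    gender = rng.choice(["male", "female"])
    return {"race": race, "gender": gender, "name": NAMES[(race, gender)]}

rng = random.Random(20150813)  # fixed seed for a reproducible illustration
assignments = [assign_investigator(rng) for _ in range(1000)]
```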
C2 What are the hypotheses to be tested?

H1: Assignment to an investigator name that is commonly perceived to be female/black will increase reported support for policies that provide for and protect the rights of women/blacks, and will decrease responses that indicate prejudice against these groups.

H2: Attention and effort will be greatest among subjects assigned to a putatively white, male investigator.
C3 How will these hypotheses be tested?

Study 1: For Hypothesis 1, we estimate two separate treatment effects. The first is the effect of assignment to a putatively female name on the probability that a respondent indicates that they believe women should have an equal role in the workforce. The second is the effect of assignment to a putatively black name on the respondent’s racial resentment scale. We expect the former effect to be positive and the latter to be negative. For Hypothesis 2, we will estimate the effect of assignment to a putatively white, male name on the probability that a respondent correctly completes both of the attention check tasks. We expect this effect estimate to be positive.

For estimation, we will fit a linear probability model of the outcome on treatment and compute standard errors via a nonparametric bootstrapping procedure. While not needed for identification, we will include respondent-level covariates (e.g., gender, income, education) in the regression model in order to increase the efficiency of our estimator. Because respondents have the option to stop taking the survey after treatment is assigned, an analysis conditional on survey completion will be biased for the average treatment effect if treatment also affects the probability that a respondent drops out. To obtain unbiased treatment effect estimates in this situation, we adopt an estimation strategy similar to that of Rotnitzky & Robins (1995) and weight each respondent observation in the outcome regression by the inverse of its estimated probability of not dropping out of the sample. We estimate this probability via a logistic regression of completion on treatment using the entire set of respondents, both those who completed the survey and those who did not. A minimal sketch of this weighted estimation appears below.
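As a concrete illustration, the sketch below implements this two-step weighted estimator under assumptions not specified in the plan: a pandas DataFrame `df` with one row per randomized respondent and hypothetical column names (`female_name` for the treatment indicator, `equal_role` for the outcome, `completed` for survey completion, plus a few baseline covariates). It is a sketch of the weighting idea, not the authors’ analysis code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def ate_ipw(df: pd.DataFrame) -> float:
    """Two-step estimator: inverse-probability-of-completion weights,
    then a weighted linear probability model of the outcome on treatment."""
    df = df.reset_index(drop=True)

    # Step 1: Pr(completion | treatment), fit on ALL randomized
    # respondents, completers and drop-outs alike.
    X_c = sm.add_constant(df[["female_name"]])
    p_complete = sm.Logit(df["completed"], X_c).fit(disp=0).predict(X_c)

    # Step 2: outcome regression on completers only, weighting each
    # observation by 1 / Pr(completion). Covariates (hypothetical names)
    # are included for efficiency, not identification.
    comp = df[df["completed"] == 1]
    X_o = sm.add_constant(comp[["female_name", "resp_female", "income", "education"]])
    w = 1.0 / p_complete[comp.index]
    return sm.WLS(comp["equal_role"], X_o, weights=w).fit().params["female_name"]

def bootstrap_se(df: pd.DataFrame, reps: int = 2000, seed: int = 1) -> float:
    """Nonparametric bootstrap over respondents for the ATE standard error."""
    rng = np.random.default_rng(seed)
    draws = [ate_ipw(df.iloc[rng.integers(0, len(df), len(df))]) for _ in range(reps)]
    return float(np.std(draws, ddof=1))
```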
Our rejection levels for two-sided hypothesis tests of whether the average treatment effects differ from zero are calibrated to correct for multiple testing. We are willing to tolerate an overall Type I error rate of α = .05. With three main hypothesis tests, we could obtain a conservative rejection threshold for each individual test of .05/3 ≈ .017 using the Bonferroni correction. This controls the familywise Type I error rate, guaranteeing that the probability of any erroneous rejection in the set of tests is no greater than .05; however, it sacrifices a significant amount of power. A less conservative but more powerful approach is to set a rejection threshold that controls the False Discovery Rate (FDR). We use the Benjamini-Hochberg procedure to set the rejection level for the hypothesis tests (Benjamini & Hochberg, 1995): we order the three p-values of the individual tests from smallest to largest, p_(1), ..., p_(3), and set our rejection level to p_(k), where k is the largest value of i satisfying p_(i) ≤ (i/3)α. This procedure controls the expected share of false rejections out of the total number of rejections to be no greater than .05. A short worked sketch of this rule follows.
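To make the step-up rule concrete, here is a minimal sketch; the p-values are purely illustrative, not study results.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule: reject the hypotheses with the
    k smallest p-values, where k is the largest i with p_(i) <= (i/m)*alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= k
    return reject

# Illustrative p-values only: the two smallest clear their step-up
# thresholds (.0167 and .0333); the third does not clear .05.
print(benjamini_hochberg([0.012, 0.030, 0.200]))  # [True, True, False]
```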
We do not specify any ex-ante interactions of the treatment effects with baseline covariates. However, because the mechanisms through which any treatment effects operate are of significant interest, we will conduct exploratory analyses of potential treatment effect heterogeneity by estimating models with interactions between treatment and respondent identity variables. Among other interactions, we are interested in whether any average treatment effect is primarily driven by behavior changes among men (in the case of the gender treatment) and among white respondents (in the case of the race treatment). We will attempt to replicate any promising results from these exploratory analyses in a follow-up experiment that explicitly registers interactive hypotheses before it is run.

In this preregistration plan, we outline an experiment that tests whether investigator characteristics affect subjects’ responses and effort, and the design permits direct tests of these hypotheses. In addition to the primary hypothesis tests, we hope to conduct exploratory analyses of heterogeneous treatment effects, which will serve as the basis for a second experiment testing the mechanisms we hope to identify here. That second experiment will be preregistered separately, given that its design depends on the results of the experiment outlined in this plan.

Study 2: For Hypotheses 1 and 2, we estimate the same treatment effects as in Study 1, using the same linear probability model specification, bootstrapped standard errors, covariate adjustment, Benjamini-Hochberg multiple-testing correction, and exploratory heterogeneity analyses. The two studies differ only in how we address attrition. Respondents have the option to stop taking the survey after treatment is assigned but before outcomes are measured, so an analysis conditional on survey completion will be biased for the average treatment effect if treatment also affects the probability that a respondent drops out. Although it is not possible to adjust for nonignorable drop-out in the absence of prior covariates on respondents, we will examine whether there appear to be systematic differences between treatment arms with respect to attrition, and we will employ sensitivity analyses in the vein of Scharfstein et al. (1999) to evaluate the robustness of our estimates to this potential source of bias. A simple version of this between-arm attrition check is sketched below.
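The between-arm attrition comparison might look like the following sketch, again assuming hypothetical columns `cell` (the four name conditions) and `completed`.

```python
import pandas as pd
from scipy.stats import chi2_contingency

def attrition_check(df: pd.DataFrame) -> float:
    """Test whether completion rates differ across the four treatment cells.

    A small p-value flags systematic between-arm differences in attrition,
    which would motivate the Scharfstein et al. (1999)-style sensitivity
    analysis described above.
    """
    table = pd.crosstab(df["cell"], df["completed"])  # 4 x 2 contingency table
    chi2, pvalue, dof, expected = chi2_contingency(table)
    return pvalue
```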
C4 Country United States
C5 Scale (# of Units) not provided by authors
C6 Was a power analysis conducted prior to data collection? Yes
C7 Has this research received Institutional Review Board (IRB) or ethics committee approval? Yes
C8 IRB Number Study 2: IRB15-2260
C9 Date of IRB Approval Study 2: 7/17/2015
C10 Will the intervention be implemented by the researcher or a third party? Researchers
C11 Did any of the research team receive remuneration from the implementing agency for taking part in this research? No
C12 If relevant, is there an advance agreement with the implementation group that all results can be published? No
C13 JEL Classification(s) not provided by authors