Due Date: April 11, 2025

Submit your lab here.

Problem 1: Analyzing Experimental Data: Difference in Means

In this lab, you will reanalyze the results of the experiment reported in Dunning and Harrison’s 2010 APSR article. The article and replication data are available.

First, use the difference in means to estimate one of the main treatment effects discussed in the paper (disregard compliance, i.e., subjects’ perception of the ethnicity of the politician and of whether a joking cousin relationship exists). Use randomization inference to find a P value for that difference. Discuss the results and any causal implications.

The R library used in class for randomization inference is ri2, and you can find its documentation and a vignette here.

One key step is to declare the research design for the randomization inference. This involves specifying the total sample size as well as the sample size per treatment group. You do this with the declare_ra() function, using the N parameter to set up the total sample size, the m parameter to specify the size of the treatment group in a two-group design, or the m_each parameter to provide a list of sample sizes for each treatment group in a multi-group design.

Then, you can use the conduct_ri() function to combine that declaration with a formula that provides a model, some data, and a null hypothesis.

Problem 2: Analyzing Experimental Data: Mediation

Estimate a mediation effect that helps clarify a hypothesis in the article, using what seems to you to be a reasonable set of control variables. Again, explain your model and assumptions using whatever form of reasoning makes sense for you. Are your results consistent with the arguments of the article?

The R library used for mediation in class is (intuitively enough) mediation, and you can find its documentation and a vignette here.

The essential steps to this process are, first, to create a data frame containing the variables necessary for your mediation analysis and to run na.omit on that data frame, because this analysis is not friendly to missing data. Second, run a regression predicting your mediator based on the treatment variable and any control variables you find appropriate. Third, run a regression predicting your outcome based on the mediator, the treatment variable, and any control variables you find appropriate. Third, run a command like the following:

mediator_output <- mediate(REGRESSION_PREDICTING_MEDIATOR, REGRESSION_PREDICTING_OUTCOME, treat="TREATMENT_VARIABLE", mediator = "MEDIATOR_VARIABLE")

You will likely want to run the summary and plot commands on the output of this command. You may also want to run the medsens command to check how sensitive the model is to omitted variables.