PSET4

NAME

2024-11-08

In this pset we’re going to continue our study of the effects of eviction on American families. This week we discussed how we can use observational data to ask causal questions. In this problem set we’re going to practice identifying and controlling for confounding variables in order to get better estimates of the average treatment effect.

# Pt 1: Reading in Data (20 pts)

In the code block below, read in our data for this problem set: the debt dataset we’ve been working with for the past week. Also be sure to change the name of this problem set to your name, both in the header above and in the filename. Finally, remove NA observations from the debt dataset.
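As a rough illustration of the read-then-clean pattern, the sketch below writes a tiny made-up CSV to a temporary file, reads it with `read.csv()`, and drops incomplete rows with `na.omit()`. The file path, the `debt_demo` name, and the columns `id`, `evicted`, and `income` are all hypothetical stand-ins, not the course dataset; swap in the real file and object names.

```r
# Hypothetical stand-in for the course file: a tiny CSV written to a temp path
# so the example runs anywhere. Replace `path` with the location of the real data.
path <- tempfile(fileext = ".csv")
writeLines(c("id,evicted,income",
             "1,0,32000",
             "2,1,",
             "3,0,41000",
             "4,,28000"), path)

debt_demo <- read.csv(path)       # empty numeric fields are read as NA
nrow(debt_demo)                   # 4 rows before cleaning

debt_demo <- na.omit(debt_demo)   # drop every row that has at least one NA
nrow(debt_demo)                   # 2 rows remain
```

Note that `na.omit()` removes a row if any column is missing, so it is worth comparing the row count before and after.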

```{r}

```

# Pt 2: Identifying and Transforming Treatment and Outcome Variables (10 pts)

In this problem set we are interested in estimating the average causal effect of eviction on focal child

Pt 1: The code below creates a new variable in our debt dataset called “unsafe_perception”. This variabl

Note: Before you can compile this document you need to write code in the section above to read in the de

```{r}

# 1 if the child strongly agrees the neighborhood is unsafe by day or by night, 0 otherwise
debt$unsafe_perception <- ifelse(
  debt$c.neigh_unsafe_day.15 == "Strongly agree" |
    debt$c.neigh_unsafe_nt.15 == "Strongly agree",
  1, 0
)

```
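To see how this kind of `ifelse()` recoding behaves, here is a small sketch on two made-up response vectors (`day` and `night` are hypothetical, not columns of the debt dataset). It also shows why removing NA observations first matters: a missing response paired with a non-matching one yields `NA` rather than 0.

```r
# Hypothetical survey responses, for illustration only
day   <- c("Strongly agree", "Disagree",       "Agree", NA)
night <- c("Disagree",       "Strongly agree", "Agree", "Disagree")

# Same pattern as the assignment code: 1 if either response is "Strongly agree"
unsafe <- ifelse(day == "Strongly agree" | night == "Strongly agree", 1, 0)
unsafe
# 1 1 0 NA
```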

Pt 2: In the bulleted list below, identify the treatment variable and the outcome variable for our research question. For each, give the name of the variable and a substantive interpretation:

•     Treatment:

•     Outcome:

# Pt 3: Difference in Means vs. LM (10 Pts)

In this question we’re going to show that the difference-in-means estimate is equivalent to the slope on the treatment variable with simple linear regression.

Pt 1: Compute the difference-in-means estimate for our research question (the average causal effect of eviction on focal child’s perception of neighborhood safety). For the focal child’s perceptions, be sure to use the new variable we created above. You should get an estimate of 0.075, and you need to write the code that produces it.

Pt 2: Use the lm() function to fit a line to the data and summarize the relationship between the outcome variable Y and the treatment variable X.
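The equivalence this part asks about can be checked on simulated data. The sketch below uses a made-up data frame (`demo`, with hypothetical columns `treated` and `outcome`), not the debt dataset, and compares the difference in group means with the slope from `lm()`.

```r
set.seed(1)
n <- 200

# Simulated binary treatment and binary outcome (illustrative only)
demo <- data.frame(treated = rbinom(n, 1, 0.3))
demo$outcome <- rbinom(n, 1, 0.2 + 0.1 * demo$treated)

# Difference in means: treated group mean minus control group mean
diff_means <- mean(demo$outcome[demo$treated == 1]) -
  mean(demo$outcome[demo$treated == 0])

# Simple linear regression of the outcome on the treatment indicator
fit   <- lm(outcome ~ treated, data = demo)
slope <- unname(coef(fit)["treated"])

# The two estimates agree up to floating-point tolerance
all.equal(diff_means, slope)
```

With a binary treatment, the intercept is the control-group mean and the slope is the gap between the two group means, which is why the two estimates coincide.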

```{r}

```

# Pt 4: Fitted Line (10 Pts)

What is the fitted line? In other words, provide the formula $\hat{Y} = \hat{\alpha} + \hat{\beta}X$ whe

Note: it’s okay if your fitted line doesn’t have “hats.” You can just write the name of the variable in

# Pt 5: Interpretation (10 Pts)

Please provide a full substantive interpretation of the estimated slope coefficient and the estimated in

# Pt 6: Identifying Confounders (10 Pts)

Identify at least one variable in the debt dataset which may **confound** the relationship between evict

# Pt 7: Controlling for Confounders (10 Pts)

Estimate the average causal effect of eviction on childhood perceptions of safety, while controlling for

Hint: to fit a multiple linear regression model in R, we can use the function lm() and specify the main

```{r}

```
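As a sketch of what controlling for a confounder does, the simulation below builds a made-up data frame in which a hypothetical binary variable `z` raises both the chance of treatment `x` and the outcome `y`. The true effect of `x` is set to 0.1 by construction. None of these names or numbers come from the debt dataset.

```r
set.seed(2)
n <- 5000

sim <- data.frame(z = rbinom(n, 1, 0.5))                    # hypothetical confounder
sim$x <- rbinom(n, 1, 0.2 + 0.4 * sim$z)                    # treatment is more likely when z = 1
sim$y <- 0.1 * sim$x + 0.3 * sim$z + rnorm(n, sd = 0.5)     # true effect of x is 0.1

naive    <- lm(y ~ x, data = sim)       # ignores the confounder
adjusted <- lm(y ~ x + z, data = sim)   # adds the confounder with `+`

b_naive <- unname(coef(naive)["x"])     # picks up part of z's effect, so it overstates 0.1
b_adj   <- unname(coef(adjusted)["x"])  # should land near the true value of 0.1
c(naive = b_naive, adjusted = b_adj)
```

In real data the true effect is unknown, so the adjusted estimate is only as good as the assumption that the included variable captures all of the confounding.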

# Pt 8: Fitted Line (10 Pts)

What is the fitted line? In other words, provide the formula $\hat{Y} = \hat{\alpha} + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2$ where you specify each term (i.e. replace $Y$ with the name of the outcome variable, $\hat{\alpha}$ with the estimated value of the intercept coefficient, both $\hat{\beta}$ terms with the estimated values of the slope coefficients, and both $X$s with the names of the treatment variable and the confounder).

Note: If you choose a non-binary categorical variable for your confounder you will get more than two coefficients in your regression line. In lecture we discussed how to write and interpret a regression line with these additional coefficients.
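To illustrate the note above, the sketch below fits a regression with a made-up three-level factor (`region`, a hypothetical variable that is not in the debt dataset). R picks the first level alphabetically as the reference category and reports one coefficient for each remaining level.

```r
set.seed(3)
n <- 300

demo_cat <- data.frame(
  treated = rbinom(n, 1, 0.4),
  region  = factor(sample(c("north", "south", "west"), n, replace = TRUE))
)
demo_cat$outcome <- rnorm(n)   # pure noise; only the coefficient names matter here

fit_cat <- lm(outcome ~ treated + region, data = demo_cat)
names(coef(fit_cat))
# "(Intercept)" "treated" "regionsouth" "regionwest"
```

Each `region` coefficient is the difference from the reference level (`north` here) with the treatment held fixed, so the fitted line has one extra term per non-reference level.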

# Pt 9: Interpretation (10 Pts)

Assuming the variable you identified is the only confounding variable, what is the average causal effect of eviction on child perceptions of neighborhood safety? Please write a full sentence answering the question, including the assumption, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect.

# Extra Credit (10 Pts)

Describe below one other variable not included in the debt dataset that might confound the relationship between eviction and child perceptions of safety. We call these variables unobserved variables. Explain why this variable may confound the relationship. Remember, for a variable to confound a causal relationship it has to both have an effect on the treatment variable and have an effect on the outcome variable. Describe how this confounding variable may do this.
