Scenario: You are a data assistant to a newsroom editor. They want a one-page brief answering:
1) How has U.S. voter turnout changed since 1980?
2) Which measure, VEP turnout or VAP turnout, better matches ANES self-reports?
3) How do presidential and midterm elections compare on VEP turnout?
The R Markdown file for this lab can be found here for Part 1 and here for Part 2. The data for this lab can be found here.
Load the data into a data frame turnout and quickly verify its structure. Keep the vector of year and the list of column names.
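A minimal sketch of this step, assuming the data arrive as a CSV. The file written here is a made-up stand-in so the code runs on its own. The column names (year, VEP, VAP, total, ANES) and the helper names path and cols are assumptions, not taken from the lab's actual file.

```r
# Hypothetical stand-in for the lab's data file; all values are made up.
toy <- data.frame(
  year  = c(1980, 1982, 1984, 1986),
  VEP   = c(1000, 1000, 1100, 1100),  # voting-eligible population (assumed column)
  VAP   = c(1200, 1250, 1375, 1375),  # voting-age population (assumed column)
  total = c(540, 400, 605, 440),      # total votes cast (assumed column)
  ANES  = c(71, 60, 74, 53)           # self-reported turnout in percent (assumed column)
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Load the data and verify its structure.
turnout <- read.csv(path)
dim(turnout)    # number of rows and columns
str(turnout)    # type of each column
head(turnout)   # first few rows

# Keep the years and the column names for later steps.
year <- turnout$year
cols <- names(turnout)
```

With the real file, replace path with the location of the downloaded data and check that the column names match what the later steps expect.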
Compute VAP-based turnout and VEP-based turnout (percent). Label both with the corresponding year so your later subsetting is self-documenting. Name your vectors VAPtr and VEPtr.
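One way to sketch the turnout calculation, assuming the data frame has columns total (votes cast), VAP, and VEP in the same units. The toy values are made up, and the real dataset may need further adjustments to the denominators that are not shown here.

```r
# Made-up data with assumed column names, for illustration only.
turnout <- data.frame(
  year  = c(1980, 1982, 1984, 1986),
  VEP   = c(1000, 1000, 1100, 1100),
  VAP   = c(1200, 1250, 1375, 1375),
  total = c(540, 400, 605, 440)
)

# Turnout as a percentage of each population measure.
VAPtr <- turnout$total / turnout$VAP * 100
VEPtr <- turnout$total / turnout$VEP * 100

# Label each element with its year so subsets carry their own labels.
names(VAPtr) <- turnout$year
names(VEPtr) <- turnout$year

VEPtr["1984"]  # look up a single year by name
```

Because the names are stored as character strings, a year is looked up with quotes, as in VEPtr["1984"].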
Create the gaps between ANES self-reported turnout and each measure: gapVAP and gapVEP. Summarize both and decide which measure aligns better.
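The comparison can be sketched as follows, assuming an ANES column that holds self-reported turnout in percent. The numbers are made up, so only the mechanics carry over to the real data.

```r
# Made-up data with assumed column names, for illustration only.
turnout <- data.frame(
  year  = c(1980, 1982, 1984, 1986),
  VEP   = c(1000, 1000, 1100, 1100),
  VAP   = c(1200, 1250, 1375, 1375),
  total = c(540, 400, 605, 440),
  ANES  = c(71, 60, 74, 53)
)
VAPtr <- turnout$total / turnout$VAP * 100
VEPtr <- turnout$total / turnout$VEP * 100
names(VAPtr) <- turnout$year
names(VEPtr) <- turnout$year

# Gap between self-reported turnout and each measure, in percentage points.
gapVAP <- turnout$ANES - VAPtr
gapVEP <- turnout$ANES - VEPtr

# Summarize both gaps; the measure with the smaller typical gap aligns better.
summary(gapVAP)
summary(gapVEP)
mean(abs(gapVAP))
mean(abs(gapVEP))
```

A positive gap means the survey figure exceeds the measured turnout. Comparing the mean absolute gaps is one reasonable criterion; the range of each gap is also worth reporting.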
Define index vectors for presidential (pres) and midterm (mids) rows (years alternate). Subset VEPtr into pVEPtr and mVEPtr. Compute average VEP turnout for each and the difference.
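A sketch of the subsetting, assuming the first row is a presidential year and that rows strictly alternate with no missing elections. The toy VEPtr values are made up and n_obs is a hypothetical helper name.

```r
# Made-up VEP turnout values, labeled by year, for illustration only.
VEPtr <- c(54, 40, 55, 40)
names(VEPtr) <- c(1980, 1982, 1984, 1986)

# Odd positions are presidential years, even positions are midterms
# (an assumption about row order; verify it against the year labels).
n_obs <- length(VEPtr)
pres <- seq(from = 1, to = n_obs, by = 2)
mids <- seq(from = 2, to = n_obs, by = 2)

pVEPtr <- VEPtr[pres]
mVEPtr <- VEPtr[mids]

# Average turnout by election type and the difference between them.
mean(pVEPtr)
mean(mVEPtr)
mean(pVEPtr) - mean(mVEPtr)
```

Printing names(pVEPtr) and names(mVEPtr) is a quick check that the index vectors picked the intended years. If the real data skip an election, the alternating pattern breaks and the indices need to be adjusted by hand.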
Let’s apply some of the syntax we’ve learned this week to some problems. This should be similar to Problem Set 1, but be warned that the data is different.
Your problem set is based on resources produced by Elena Llaudet and Kosuke Imai. It draws on the data and research in the following publication:
Raghabendra Chattopadhyay and Esther Duflo. 2004. “Women as Policy Makers: Evidence from a Randomized Policy Experiment in India.” Econometrica, 72 (5): 1409–43.
Today our data will be completely made up. The scenario is as follows:
I hypothesize that the ingestion of spiders while asleep causes individuals’ propensity to buy blind boxes to increase. To test this, I survey some number of individuals in my neighborhood, collecting data on:
adult: whether they are an adult. =1 if yes and =0 if no.
spiders: how many spiders they ingested in their sleep the previous week.
blindbox: how many blind boxes they bought this week.
Read the CSV file “lab2data.csv” into an object called df. Read the first few observations of the dataset.
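A minimal sketch of reading the file and looking at the first rows. So that it runs on its own, it first writes a made-up stand-in named lab2data.csv to a temporary folder. The values and the helper names toy and path are hypothetical.

```r
# Made-up stand-in for lab2data.csv, using the three variables described above.
toy <- data.frame(
  adult    = c(1, 0, 1, 0, 1, 0),
  spiders  = c(0, 2, 1, 0, 3, 0),
  blindbox = c(1, 4, 2, 0, 5, 1)
)
path <- file.path(tempdir(), "lab2data.csv")
write.csv(toy, path, row.names = FALSE)

# Read the CSV into df and look at the first few observations.
df <- read.csv(path)
head(df)
```

With the real file in the working directory, read.csv("lab2data.csv") is enough and the stand-in lines can be dropped.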
Please provide your code for the following statistics:
We want to estimate the average causal effect of ingesting spiders on buying blind boxes.
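One possible sketch uses a difference in means, assuming the treatment is defined as having ingested at least one spider. That definition, the variable name any_spiders, and the toy values are assumptions and may differ from what the lab intends.

```r
# Made-up data, for illustration only.
df <- data.frame(
  adult    = c(1, 0, 1, 0, 1, 0),
  spiders  = c(0, 2, 1, 0, 3, 0),
  blindbox = c(1, 4, 2, 0, 5, 1)
)

# Assumed treatment definition: ingested at least one spider last week.
any_spiders <- df$spiders > 0

# Difference in mean blind boxes bought: treated minus untreated.
mean(df$blindbox[any_spiders]) - mean(df$blindbox[!any_spiders])
```

This difference reads as an average causal effect only if the two groups are comparable in everything other than spider ingestion, which a neighborhood survey does not guarantee.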
We also want to estimate the average causal effect of being an adult on the propensity to buy blind boxes:
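Since adult is coded 0 or 1, a difference in means is one way to sketch this estimate. The toy values are made up and the name diff_adult is hypothetical.

```r
# Made-up data, for illustration only.
df <- data.frame(
  adult    = c(1, 0, 1, 0, 1, 0),
  spiders  = c(0, 2, 1, 0, 3, 0),
  blindbox = c(1, 4, 2, 0, 5, 1)
)

# Mean blind boxes bought by adults minus the mean for non-adults.
diff_adult <- mean(df$blindbox[df$adult == 1]) - mean(df$blindbox[df$adult == 0])
diff_adult
```

As with any comparison of observed groups, this is a causal estimate only under the assumption that adults and non-adults are otherwise comparable.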