Part 1

Scenario: You are a data assistant to a newsroom editor. They want a one-page brief answering:
1) How has U.S. voter turnout changed since 1980?
2) Which measure, turnout based on the voting-eligible population (VEP) or on the voting-age population (VAP), better matches ANES self-reports?
3) How do presidential and midterm elections compare on VEP turnout?


The R Markdown file for this lab can be found here for Part 1 and here for Part 2. The data for this lab can be found here.

Part A — Loading Data

Load the data into a data frame called turnout and quickly verify its structure. Save the vector of years and the vector of column names for later use.
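
As a starting point, here is a minimal sketch of the loading-and-checking workflow. It writes a tiny made-up CSV to a temporary file so that it runs on its own; the column names (year, total), the values, and the object name cols are assumptions for illustration only. In your own work, point read.csv() at the lab's data file instead.

```r
# Toy stand-in for the lab's data file: the values are made up.
toy <- data.frame(
  year  = c(1980, 1982, 1984, 1986),
  total = c(50, 40, 55, 42)
)
path <- tempfile(fileext = ".csv")  # hypothetical path; use the lab's file in practice
write.csv(toy, path, row.names = FALSE)

# Load the CSV into a data frame.
turnout <- read.csv(path)

# Quick structure checks.
dim(turnout)   # number of rows and columns
str(turnout)   # type of each column
head(turnout)  # first few rows

# Keep the years and the column names for later use.
year <- turnout$year    # vector of election years
cols <- names(turnout)  # character vector of column names
```

Note that names() returns a character vector rather than an R list; either is fine for looking up column names later.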


Part B — Construct the two turnout measures

Compute VAP-based turnout and VEP-based turnout, each as a percentage. Name the elements of both vectors with the corresponding years so that your later subsetting is self-documenting.

Name your vectors VAPtr, VEPtr.
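
The sketch below shows one way the computation and labeling could look. It assumes the data frame has columns named total (votes cast), VAP, VEP, and year, and it uses made-up numbers. Check the actual column names in your data, and check whether the VAP denominator needs any adjustment in this dataset.

```r
# Toy data: made-up numbers, and the column names are assumptions.
turnout <- data.frame(
  year  = c(1980, 1982, 1984, 1986),
  total = c(50, 40, 55, 42),      # votes cast
  VAP   = c(100, 100, 110, 105),  # voting-age population
  VEP   = c(90, 92, 100, 96)      # voting-eligible population
)

# Turnout as a percentage of each population.
VAPtr <- turnout$total / turnout$VAP * 100
VEPtr <- turnout$total / turnout$VEP * 100

# Attach the year to each element as a name.
names(VAPtr) <- turnout$year
names(VEPtr) <- turnout$year

# Named vectors can be subset by label (note the quotes).
VEPtr["1984"]
```

Because names are stored as character strings, subset with "1984" rather than the number 1984, which R would treat as a position.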


Part C — Reliability vs. ANES

Create the gaps between ANES self-reported turnout and each measure: gapVAP and gapVEP. Summarize both and decide which measure aligns better.
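
A minimal sketch of the gap calculation follows, using made-up vectors in place of the real ones. The sign convention (ANES minus the measure) and the use of mean absolute gap as a comparison are assumptions; other summaries would also be reasonable.

```r
# Toy inputs: made-up numbers standing in for the real vectors.
year  <- c(1980, 1984, 1988)
ANES  <- c(71, 74, 70)  # self-reported turnout (percent)
VAPtr <- c(52, 53, 50)  # VAP-based turnout (percent)
VEPtr <- c(54, 55, 53)  # VEP-based turnout (percent)
names(VAPtr) <- year
names(VEPtr) <- year

# Gap between the survey self-report and each measure.
gapVAP <- ANES - VAPtr
gapVEP <- ANES - VEPtr

# Summarize both gaps.
summary(gapVAP)
summary(gapVEP)

# One possible comparison: the smaller average absolute gap aligns better.
mean(abs(gapVAP))
mean(abs(gapVEP))
```

The gap vectors inherit the year names from VAPtr and VEPtr, so each gap stays labeled with its election year.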


Part D — Presidential vs. Midterms using indices

Define index vectors for the presidential (pres) and midterm (mids) rows; presidential and midterm years alternate in the data. Use them to subset VEPtr into pVEPtr and mVEPtr. Then compute the average VEP turnout for each group and the difference between the two averages.
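
Index-based subsetting could be sketched as follows. The example assumes that rows strictly alternate and that the first row is a presidential year; the numbers are made up.

```r
# Toy named vector: made-up turnout values, strictly alternating years.
VEPtr <- c("1980" = 54, "1982" = 42, "1984" = 55, "1986" = 40)
n <- length(VEPtr)

# Odd positions are presidential years, even positions are midterms
# (an assumption that holds only if the first row is a presidential year).
pres <- seq(from = 1, to = n, by = 2)
mids <- seq(from = 2, to = n, by = 2)

# Subset by position; the year names carry over.
pVEPtr <- VEPtr[pres]
mVEPtr <- VEPtr[mids]

# Average turnout by election type, and the difference.
mean(pVEPtr)
mean(mVEPtr)
mean(pVEPtr) - mean(mVEPtr)
```

Print names(pVEPtr) and names(mVEPtr) to confirm that each subset contains the years you expect. If the real data skips an election, the index vectors will need adjusting by hand.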


Part 2

Let’s apply some of the syntax we’ve learned this week to some problems. This should be similar to Problem Set 1, but be warned that the data is different.


Your problem set is based on resources produced by Elena Llaudet and Kosuke Imai. It draws on the data and research in the following publication:

Raghabendra Chattopadhyay and Esther Duflo. 2004. “Women as Policy Makers: Evidence from a Randomized Policy Experiment in India.” Econometrica, 72(5): 1409–43.

Today our data will be completely made up. The scenario is as follows:

I hypothesize that ingesting spiders while asleep increases individuals’ propensity to buy blind boxes. To test this, I survey some number of individuals in my neighborhood, collecting data on:

  1. adult: whether they are an adult. =1 if yes and =0 if no.
  2. spiders: how many spiders they ingested in their sleep the previous week.
  3. blindbox: how many blind boxes they bought this week.

Getting Started

Read the CSV file “lab2data.csv” into an object called df. Display the first few observations of the dataset.


Exploratory Data Analysis

Please provide your code for the following statistics:


Methodology

We want to estimate the average causal effect of ingesting spiders on buying blind boxes.

  1. What would be the treatment variable?
  2. What would be the outcome variable?
  3. What would be the treatment group?
  4. What would be the control group?

We also want to estimate the average causal effect of being an adult on the propensity to buy blind boxes:

  1. What would be the treatment variable?
  2. What would be the outcome variable?
  3. What would be the treatment group?
  4. What would be the control group?