Part 4: Randomized experiments and t-tests

1. Setup

Once again, we start by setting the working directory and loading the LaLonde (1986) data:

. cd "~/git_repos/metricsinstata/docs/part4"
/Users/jack/git_repos/metricsinstata/docs/part4

. use "nsw.dta", clear

Recall that this is data from a randomized trial in which some individuals are given employment training.

We would like to test whether the training affected the earnings of participants.

2. Checking for effective randomization

When confronted with data from a randomized experiment, one of the first checks should be to verify that the randomization has been effective.

A simple way to do that is to test whether pre-determined characteristics differ between the treatment and control groups. In the code below, I test whether average pre-treatment (1975) earnings differ by treatment status, using a two-sample t-test:

. ttest re75, by(treat)

Two-sample t test with equal variances
─────────┬────────────────────────────────────────────────────────────────────
   Group │     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
─────────┼────────────────────────────────────────────────────────────────────
       0 │     425    3026.683    252.2977     5201.25    2530.773    3522.593
       1 │     297    3066.098    282.8697    4874.889    2509.407    3622.789
─────────┼────────────────────────────────────────────────────────────────────
combined │     722    3042.897    188.5423    5066.143    2672.739    3413.054
─────────┼────────────────────────────────────────────────────────────────────
    diff │           -39.41544    383.4172               -792.1647    713.3338
─────────┴────────────────────────────────────────────────────────────────────
    diff = mean(0) - mean(1)                                      t =  -0.1028
Ho: diff = 0                                     degrees of freedom =      720

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4591         Pr(|T| > |t|) = 0.9182          Pr(T > t) = 0.5409

The ttest command returns a lot of information. We see, for example, that mean 1975 earnings in the treated group were about $39 higher than in the control group (3,066 versus 3,027). However, this difference is small relative to its standard error of 383.

The bottom of the output contains the p-values for three alternative hypotheses. For the two-sided alternative (Ha: diff != 0), the p-value is 0.9182, so we cannot reject the null hypothesis that the two means are equal.
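
As a quick check on how the numbers in the table fit together: the t-statistic is the difference in means divided by its standard error, and the two-sided p-value is the mass in both tails of a t distribution with 720 degrees of freedom. A minimal sketch, using the results that ttest leaves behind in r() (the display lines must be run directly after the ttest command):

. ttest re75, by(treat)
. * r(mu_1) is the mean of the first group (treat == 0), r(mu_2) that of the second
. display (r(mu_1) - r(mu_2)) / r(se)
. * ttail(df, t) returns the upper-tail probability of the t distribution
. display 2 * ttail(r(df_t), abs(r(t)))

The first display line should reproduce the t-statistic of -0.1028 and the second the two-sided p-value of 0.9182 reported in the table.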

At least based on this variable, the data are consistent with treatment having been effectively randomized. One could perform equivalent tests for all available pre-determined variables. To be thorough, one ought to adjust for multiple hypothesis testing, but I will not address that here.
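
One way to check several pre-determined variables at once is to regress the treatment indicator on them and test whether the coefficients are jointly zero. The following is only a sketch: it assumes the dataset also contains variables named age and education (these names are an assumption; substitute whichever pre-determined variables the file holds).

. * age and education are assumed variable names, not confirmed above
. regress treat re75 age education, vce(robust)
. * joint test that all listed coefficients are zero
. testparm re75 age education

A large p-value from testparm would be consistent with effective randomization. Because it is a single joint test, it is also one way to avoid running many separate t-tests.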

3. Estimating a treatment effect

Given the above evidence that randomization has been effective, I now compare outcomes between the treatment and control groups. I run a t-test of whether post-experiment (1978) earnings differ between the two groups:

. ttest re78, by(treat)

Two-sample t test with equal variances
─────────┬────────────────────────────────────────────────────────────────────
   Group │     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
─────────┼────────────────────────────────────────────────────────────────────
       0 │     425    5090.048     277.368    5718.089    4544.861    5635.236
       1 │     297    5976.352    401.7594    6923.796    5185.685    6767.019
─────────┼────────────────────────────────────────────────────────────────────
combined │     722    5454.636    232.7105    6252.943    4997.765    5911.507
─────────┼────────────────────────────────────────────────────────────────────
    diff │           -886.3037    472.0863               -1813.134    40.52635
─────────┴────────────────────────────────────────────────────────────────────
    diff = mean(0) - mean(1)                                      t =  -1.8774
Ho: diff = 0                                     degrees of freedom =      720

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0304         Pr(|T| > |t|) = 0.0609          Pr(T > t) = 0.9696

We can reject the null hypothesis of equal means at the 10% level but not at the 5% level (the two-sided p-value is 0.0609). Such a difference is usually described as marginally significant.
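
The test above assumes equal variances in the two groups, yet the standard deviations in the table differ somewhat (5,718 for the control group and 6,924 for the treated group). As a robustness check, one could drop that assumption with the unequal option of ttest, which uses Satterthwaite's approximation for the degrees of freedom:

. * same comparison without assuming equal variances
. ttest re78, by(treat) unequal

Adding the welch option as well would use Welch's formula for the degrees of freedom instead. I do not show the output here; with a p-value this close to the conventional thresholds, it is worth checking whether the conclusion changes.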

The assumptions required for a causal interpretation are:

Randomization
Stable unit treatment value (Rubin, 1978)

For more details, see Athey and Imbens (2016).

Under these assumptions, we can interpret the difference as an estimate of the average treatment effect on the treated (ATET). The treatment group's 1978 earnings are on average $886 higher than the control group's (5,976 versus 5,090), so we estimate an ATET of $886, which is marginally statistically significant.

Based on this simple analysis, we conclude that, on average, the training programme increased the earnings of participants by about $886.
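
The same comparison can also be written as a regression of the outcome on the treatment indicator, which is a convenient starting point if one later wants to add covariates. A minimal sketch, relying on treat being coded 0/1 as in the tables above:

. * coefficient on treat equals mean(1) - mean(0)
. regress re78 treat
. * heteroskedasticity-robust standard errors, allowing the group variances to differ
. regress re78 treat, vce(robust)

In the first regression, the coefficient on treat is the difference in group means, so it should match the t-test's estimate and t-statistic with the sign flipped, because ttest reports mean(0) - mean(1). The second line is an optional variant that does not impose equal variances.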