library(rstan) # observe startup messages
## Loading required package: StanHeaders
## Loading required package: ggplot2
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
# Run the MCMC chains in parallel, one per core. NOTE(review): this option is
# read by rstan (and by other packages that consult options("mc.cores")); it is
# not a universal setting for all R packages.
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE) # cache compiled Stan programs on disk so unchanged models are not recompiled
Sys.setenv(TZ = "America/Toronto") # works around a time-zone bug that was causing an error when running the model
library(dplyr) # data manipulation: %>%, filter()
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
OK Here’s another HW (I know, I really gotta give these to you on Thursdays…). Like always, just do what you can, and let me know if you have any questions. Save all of your fitted models in .RData files so we can look at them on Thursday without re-fitting them.
We’re going to introduce something non-identifiable into the model you fit, and see what happens. The new model will be Y = b0 + (b1a + b1b)*pretest + b2*FH2T + b3*Dragon + b4*ASSISTments + epsilon
We’ll be building off the file you wrote last time, HW3stanScript2.stan. First, make the following changes: In the parameters section, replace real b1; with real b1a; real b1b; DONE
between the “parameters” and the “model” sections, add in a new section called “transformed parameters” that looks like this: transformed parameters { real b1; b1=b1a+b1b; } DONE
finally, in the “model” section, replace + b1*pretest_MC with + (b1a+b1b)*pretest_MC DONE
You might want to comment out the “generated quantities” section, too. We won’t use it for this (I don’t think that leaving it in will hurt though).
// Stan program for the deliberately NON-identified model:
// Y = b0 + (b1a + b1b)*pretest_MC + b2*FH2T + b3*Dragon + b4*ASSISTments + eps
// Only the sum b1a + b1b enters the likelihood, so b1a and b1b are not
// separately identified.
data {
int<lower=0> N; // number of students
vector[N] y; // posttest percentage math score
vector[N] pretest_MC; // mean-centered pretest percentage math score
vector[N] FH2T; // treatment dummy: 1 if FH2T condition
vector[N] Dragon; // treatment dummy: 1 if Dragon condition
vector[N] ASSISTments; // treatment dummy: 1 if ASSISTments condition
}
parameters {
real<lower=0> sigma; // residual standard deviation
real b0; // intercept (reference condition at average pretest)
real b1a; // one piece of the pretest slope -- not identified alone
real b1b; // other piece of the pretest slope -- not identified alone
real b2; // FH2T effect vs. reference condition
real b3; // Dragon effect vs. reference condition
real b4; // ASSISTments effect vs. reference condition
}
transformed parameters {
real b1; // the identified quantity: total pretest slope b1a + b1b
b1=b1a+b1b;
}
//Y=b0+(b1a+b1b)*pretest+b2*FH2T + b3*Dragon+ b4*ASSISTments+epsilon
model {
// No priors are declared, so all parameters get improper flat priors; the
// likelihood constrains only b1a + b1b, leaving the posterior improper
// along the b1a - b1b direction (the source of the sampling problems).
y ~ normal(b0 + (b1a+b1b)*pretest_MC + b2*FH2T + b3*Dragon + b4*ASSISTments
, sigma);
}
Now fit the new model. This might take a while–maybe start it before you go to sleep at night and check it in the morning. Also, save the fitted model, like save(fit3, file="unidentifiedFit.RData") or whatever.
# Read the merged state-assessment data; treat empty strings as missing.
assess <- read.csv("Assessment_merged_2021_07_16_state_assessment_N=4321 - Sheet1.csv", na.strings = c(""))
# Clean data: keep students with both test scores and a randomized condition;
# drop the Resource conditions and schools S03/S07.
assess <- assess %>%
  filter(!is.na(post.percentage_math_score),
         !is.na(pre.percentage_math_score),
         # remove students who are not in original random assignment
         !is.na(rdm_condition),
         # remove students who are in resource
         rdm_condition != "Dragon-Resource" & rdm_condition != "FH2T-Resource",
         # S03 drop schools
         final_school_id != "S03" & initial_school_id != "S03"
         & final_school_id != "S07" & initial_school_id != "S07"
  )
# Mean-center the pretest so b0 is the expected posttest at the average pretest.
assess$pre.percentage_math_score_MC <- assess$pre.percentage_math_score - mean(assess$pre.percentage_math_score)
# Dummy Code (BAU is the omitted reference condition)
table(assess$rdm_condition)
##
## ASSISTments BAU Dragon Dragon-Resource FH2T
## 380 366 349 0 753
## FH2T-Resource
## 0
assess$FH2T <- ifelse(assess$rdm_condition == "FH2T", 1, 0)
assess$ASSISTments <- ifelse(assess$rdm_condition == "ASSISTments", 1, 0)
assess$Dragon <- ifelse(assess$rdm_condition == "Dragon", 1, 0)
# Bundle everything the Stan program's data block expects.
stan_data <- list(N = nrow(assess),
                  y = assess$post.percentage_math_score,
                  pretest_MC = assess$pre.percentage_math_score_MC,
                  FH2T = assess$FH2T,
                  ASSISTments = assess$ASSISTments,
                  Dragon = assess$Dragon)
#fit1 <- stan(file = 'HW4stanScript1.stan', data = stan_data)
Notes: * R-hat 2.57, * chains are not mixed -> this means that the individual MCMC chains did not converge
#saveRDS(fit1,file="unidentifiedFit.rds")
# Reload the previously fitted (non-identified) model instead of re-fitting it.
fit1<-readRDS("unidentifiedFit.rds")
Once the model has fit, look at the estimates, Rhats, and traceplots for the coefficients and sigma. What do you notice? Use the extract() function to look at the MCMC samples. make a scatterplot of the samples for b1a against those for b1b–why do you think you get that pattern?
# Posterior summary of the non-identified model. Note in the output below:
# b1a and b1b have huge posterior SDs (~208), n_eff = 2, and Rhat = 4.14
# (no convergence), while their SUM b1 has Rhat = 1.00, n_eff = 4384, and a
# tight interval around 0.69 -- only the sum is identified by the likelihood.
print(fit1)
## Inference for Stan model: HW4stanScript1.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5%
## sigma 22.44 0.09 0.32 21.80 22.22 22.44 22.69 23.01
## b0 43.67 0.26 1.11 41.66 42.96 43.66 44.36 46.20
## b1a 20.78 138.51 208.09 -362.05 -140.25 93.03 174.26 305.75
## b1b -20.09 138.51 208.08 -305.05 -173.57 -92.38 140.93 362.75
## b2 2.50 0.32 1.32 -0.36 1.76 2.52 3.37 5.04
## b3 3.59 0.40 1.47 -0.27 2.96 3.65 4.51 6.23
## b4 1.08 0.31 1.40 -1.96 0.25 1.03 1.97 3.81
## b1 0.69 0.00 0.02 0.65 0.68 0.69 0.70 0.73
## lp__ -6664.23 0.36 1.63 -6668.25 -6665.13 -6663.94 -6663.02 -6661.95
## n_eff Rhat
## sigma 13 1.20
## b0 18 1.16
## b1a 2 4.14
## b1b 2 4.14
## b2 18 1.24
## b3 14 1.23
## b4 21 1.14
## b1 4384 1.00
## lp__ 20 1.12
##
## Samples were drawn using NUTS(diag_e) at Wed Oct 6 20:50:55 2021.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
How do I evaluate the coefficients? For example: b2’s credible interval (maybe that is not what it is called) overlaps zero. Does this mean that we can’t say that it is 95% probable that the effect of FH2T is not zero? Or am I thinking about this incorrectly, since we are not doing null hypothesis testing?
# Trace plots of the post-warmup draws; use TRUE/FALSE rather than T/F
# (T and F are ordinary variables that can be reassigned).
traceplot(fit1, inc_warmup = FALSE)
Part 2: the magic of a prior Take the unidentified model, and add two lines to the beginning of the “model” section: b1a~std_normal(); b1b~std_normal(); These put a N(0,1) prior on b1a and b1b. Now refit the model (it should go much faster), and repeat the diagnostics from last time. What happened?
// Same model as HW4stanScript1.stan, but with N(0,1) priors on b1a and b1b.
// The priors make the posterior proper, so the sampler converges even though
// the likelihood still only constrains the sum b1a + b1b.
data {
int<lower=0> N; // number of students
vector[N] y; // posttest percentage math score
vector[N] pretest_MC; // mean-centered pretest percentage math score
vector[N] FH2T; // treatment dummy: 1 if FH2T condition
vector[N] Dragon; // treatment dummy: 1 if Dragon condition
vector[N] ASSISTments; // treatment dummy: 1 if ASSISTments condition
}
parameters {
real<lower=0> sigma; // residual standard deviation
real b1a; // one piece of the pretest slope, regularized by its prior
real b0; // intercept (reference condition at average pretest)
real b1b; // other piece of the pretest slope, regularized by its prior
real b2; // FH2T effect vs. reference condition
real b3; // Dragon effect vs. reference condition
real b4; // ASSISTments effect vs. reference condition
}
transformed parameters {
real b1; // the identified quantity: total pretest slope b1a + b1b
b1=b1a+b1b;
}
//Y=b0+(b1a+b1b)*pretest+b2*FH2T + b3*Dragon+ b4*ASSISTments+epsilon
model {
// N(0,1) priors: the likelihood pins down b1a + b1b, and the priors break
// the flat ridge along b1a - b1b, making both parameters (weakly) identified.
b1a~std_normal();
b1b~std_normal();
y ~ normal(b0 + (b1a+b1b)*pretest_MC + b2*FH2T + b3*Dragon + b4*ASSISTments
, sigma);
}
# Fit the prior-regularized model (4 chains, default iterations) and cache the
# fitted object so it can be reloaded without re-fitting.
fit2 <- stan(file = 'HW4stanScript2.stan',
data = stan_data)
saveRDS(fit2,file="Fit.rds")
Once the model has fit, look at the estimates, Rhats, and traceplots for the coefficients and sigma. What do you notice?
# Posterior summary of the regularized model. Note in the output below: every
# parameter now has Rhat = 1 and healthy n_eff; b1a and b1b are shrunk toward 0
# (means 0.32 and 0.37) by their N(0,1) priors, while their sum b1 is unchanged
# from the first fit (0.69, interval 0.65-0.73).
print(fit2)
## Inference for Stan model: HW4stanScript2.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff
## sigma 22.39 0.01 0.37 21.67 22.14 22.38 22.63 23.15 1992
## b0 43.82 0.03 1.14 41.60 43.03 43.85 44.60 46.06 1252
## b1a 0.32 0.02 0.69 -1.02 -0.14 0.32 0.78 1.64 1868
## b1b 0.37 0.02 0.69 -0.95 -0.08 0.37 0.84 1.70 1869
## b2 2.24 0.03 1.38 -0.36 1.28 2.21 3.19 4.96 1571
## b3 3.39 0.04 1.64 0.33 2.25 3.42 4.52 6.50 1638
## b4 1.23 0.04 1.62 -1.90 0.17 1.20 2.34 4.35 1840
## b1 0.69 0.00 0.02 0.65 0.68 0.69 0.70 0.73 4274
## lp__ -6665.10 0.05 1.86 -6669.51 -6666.12 -6664.81 -6663.70 -6662.46 1355
## Rhat
## sigma 1
## b0 1
## b1a 1
## b1b 1
## b2 1
## b3 1
## b4 1
## b1 1
## lp__ 1
##
## Samples were drawn using NUTS(diag_e) at Thu Oct 7 06:24:22 2021.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
Use the extract() function to look at the MCMC samples. make a scatterplot of the samples for b1a against those for b1b–why do you think you get that pattern?
# Pull the posterior draws as a named list of vectors, one per parameter.
# NOTE(review): pars="yhat" with include=FALSE asks to EXCLUDE yhat draws, but
# the generated-quantities block was commented out, so yhat may not exist in
# this fit -- confirm this call runs as intended.
samp <- extract(fit2, pars="yhat",include=FALSE) # b1a and b1b draws move in opposite directions
plot(samp$b1a, samp$b1b)
# Near-perfect negative correlation: the likelihood pins down b1a + b1b, so a
# draw with larger b1a must have a correspondingly smaller b1b.
cor.test(samp$b1a, samp$b1b)
##
## Pearson's product-moment correlation
##
## data: samp$b1a and samp$b1b
## t = -2224.2, df = 3998, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9996204 -0.9995703
## sample estimates:
## cor
## -0.9995962
# NOTE(review): exact equality (==) on doubles is fragile in general; it holds
# here presumably because Stan computed b1 as exactly b1a + b1b, but
# all.equal()/dplyr::near() would be the safer comparison.
table(samp$b1a + samp$b1b == samp$b1)
##
## TRUE
## 4000
# Duplicate of the scatterplot above -- looks accidentally left in.
plot(samp$b1a, samp$b1b)
# Trace plots for the regularized fit; use TRUE/FALSE rather than T/F
# (T and F are ordinary variables that can be reassigned).
traceplot(fit2, inc_warmup = FALSE)
(I mean, I’m not actually grading this with points but whatever) Use the MCMC samples from extract() in the 2nd model you just fit (the one with the normal priors) to answer these questions:
(if you don’t know how to do this immediately, try to figure it out, just by thinking–you don’t need to look stuff up. If the answer doesn’t come to you, we’ll work it out together on Thursday)
# Posterior probability that BAU beats FH2T: b0 > b0 + b2 is equivalent to b2 < 0.
table(samp$b0 > samp$b0 + samp$b2)
##
## FALSE TRUE
## 3804 196
# NOTE(review): the hand-typed counts (204/3796) disagree with the table above
# (196/3804) and with mean() below (0.049) -- probably stale numbers from an
# earlier run; prefer computing the proportion directly as done with mean().
204/(3796 + 204)
## [1] 0.051
mean(samp$b0 > samp$b0 + samp$b2)
## [1] 0.049
# Posterior probability that BAU beats all three treatments simultaneously,
# i.e. b2 < 0 AND b3 < 0 AND b4 < 0 in the same draw.
table(samp$b0 > samp$b0 + samp$b2
& samp$b0 > samp$b0 + samp$b3
& samp$b0 > samp$b0 + samp$b4)
##
## FALSE TRUE
## 3971 29
# NOTE(review): same issue -- 18/3982 matches neither the table (29/3971) nor
# mean() below (0.00725); stale hand-typed counts from a different run.
18/(3982 + 18 )
## [1] 0.0045
mean(samp$b0 > samp$b0 + samp$b2
& samp$b0 > samp$b0 + samp$b3
& samp$b0 > samp$b0 + samp$b4)
## [1] 0.00725