Simulation-based approach
- R allows for the Monte-Carlo method, repeated simulations based on
assumptions to find solutions
- Here is a Monte-Carlo solution to t-test power, where I assume each
group follows a normal distribution
MC.ttest <-function(Params) {
TrueD<-Params[[1]]; alpha<-Params[[2]]
N1<-Params[[3]]; N2<-Params[[4]]
# Set distrobutions
Control<-rnorm(N1, mean = 0, sd = 1)
Exp<-rnorm(N2, mean = TrueD, sd = 1)
# Run t-est
calcP<-t.test(Exp,Control,
alternative = c("two.sided"),
paired = FALSE, var.equal = TRUE,
conf.level = 1-alpha)$p.value
result<-calcP
return(result)
}
# Paramaters
d = .8; alpha= .05; N1=20; N2=20; Sim=1000
# Run simulation
Pvalues<-replicate(Sim,MC.ttest(c(d,alpha,N1,N2)))
Sig<-ifelse(Pvalues <= alpha, 1, 0)
PowerMC<-sum(Sig)/Sim
Simulated Power = 0.718 which should be very close to a formula
solution 0.6934042
- We would have to build simulations like this for each and every
design we wanted to test and for mixed models, this is a lot of
programming!
Simr package
- Simr package makes it easier to do this kind of simulations
- The package is not well documented and its sort of buggy (but so far
its easiest to use)
- You can use it construct a simulation based on parameters (like
Westfall et al.), but it simulates the power or you can base it on a
previous model
Simulation from Parameters
- Let’s redo our experiment from above
- This is basically going to give an observed power
- Start by specifying the number of subjects (n = 30) and
items (n = 10)
library(simr)
Item <- as.factor(rep(1:10))
Subject <- as.factor(rep(1:30))
Condition1<-rep(-.5:.5)
# creates "frame" for our data
X <- expand.grid(Subject=Subject,Item=Item, Condition1=Condition1)
Subject
|
Item
|
Condition1
|
1
|
1
|
-0.5
|
2
|
1
|
-0.5
|
3
|
1
|
-0.5
|
4
|
1
|
-0.5
|
5
|
1
|
-0.5
|
6
|
1
|
-0.5
|
- Next, we need to specify (we will set the random correlation to be
zero for now)
- Fixed terms
- Intercept & Condition [intercept = 10.19, Fixed slope of C1 =
4.77] -Random terms (C1 is a slope [effects coded] -.5,.5)
- If we did this (1+C1|Subjects) our COV matrix would look like
this:
\[
\mathbf{COV: Subj} = \left[\begin{array}
{rrr}
Intercept & cov(IxSlope) \\
cov(IxSlope) & Slope \\
\end{array}\right]
\]
- For Subjects we will do (assume 0 correlation between
intercept/slope)
\[
\mathbf{Subject} = \left[\begin{array}
{rrr}
23.47 & 0 \\
0 & 3.98 \\
\end{array}\right]
\]
- For Items we will do (assume 0 correlation between
intercept/slope)
\[
\mathbf{Items} = \left[\begin{array}
{rrr}
2.197 & 0 \\
0 & 5.009 \\
\end{array}\right]
\]
- Setup our simulations to let the correlations vary around zero:
# fixed intercept and slope
b <- c(10.1912, 4.7773)
# random intercept and slope variance-covariance matrix
# For Subject
SubVC <-matrix(c(23.471,0,0,3.98), 2)
# For Items
ItemVC <- matrix(c(2.197,0,0,5.009), 2)
# Exrtact the residual sd
s <- 7.13^.5
- Next, we need to make a lmer object
- we need to feed in the fixed effects, random effects, residual, and
“frame” of the data
SimModel <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item),
fixef=b, VarCorr=list(SubVC,ItemVC), sigma=s, data=X)
summary(SimModel)
## Linear mixed model fit by REML ['lmerMod']
## Formula: DV ~ Condition1 + (Condition1 | Subject) + (Condition1 | Item)
## Data: X
##
## REML criterion at convergence: 3516.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.1451 -0.6925 0.0210 0.7147 3.9010
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 23.471 4.845
## Condition1 3.980 1.995 0.00
## Item (Intercept) 2.197 1.482
## Condition1 5.009 2.238 0.00
## Residual 7.130 2.670
## Number of obs: 600, groups: Subject, 30; Item, 10
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 10.1912 1.0070 10.121
## Condition1 4.7773 0.8253 5.789
##
## Correlation of Fixed Effects:
## (Intr)
## Condition1 0.000
- You will notice the SimModel looks like our real data
- This is the model it is going to parametrically simulate
- Here is an example of 1 dataset it will simulate given these
parameters
kable(head(getData(SimModel))) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Subject
|
Item
|
Condition1
|
DV
|
1
|
1
|
-0.5
|
-12.437575
|
2
|
1
|
-0.5
|
-4.208957
|
3
|
1
|
-0.5
|
-12.357406
|
4
|
1
|
-0.5
|
16.627996
|
5
|
1
|
-0.5
|
-6.136707
|
6
|
1
|
-0.5
|
16.740027
|
Plot
- One simulation from simr package based on our parameters
Monte-Carlo Simulation
- What we want to do is simulate this 100 times to keep the
computational time short (but 1000 is the default)
- But we need to figure out what we want to test about our model and
how
- We want to test fixed or random effects? Let’s just worry about
fixed
- Which pvalues do we want to use?
- z, t, F scores (z,t,f), Likelihood ratio test (lr), Kenward-Roger
(kr), or Parametric bootstrap (pb)?
- I will tell it to give me the observed power for the fixed effect of
Condition using lr method (to be faster, but with an alpha .045, since
we have seen in the last few weeks that Likelihood ratio testing tends
to yield type I error slighly above .05)
SimPower1<-powerSim(SimModel,fixed("Condition1", "lr"),
nsim=100, alpha=.045, progress=FALSE)
SimPower1
## Power for predictor 'Condition1', (95% confidence interval):
## 100.0% (96.38, 100.0)
##
## Test: Likelihood ratio
## Effect size for Condition1 is 4.8
##
## Based on 100 simulations, (8 warnings, 0 errors)
## alpha = 0.045, nrow = 600
##
## Time elapsed: 0 h 0 m 22 s
- Obviously, we have high power, because this is Observed
Power
Using VPC over raw scores
- We can use our VPC measures we calculated and get the same
answer
b2 <- c(0, d.mixed) # fixed intercept and slope
SubVC2 <-matrix(c(.5616,0,0,.09527), 2)
ItemVC2 <- matrix(c(.05256,0,0,.11984), 2)
s2 <- (.1707)^.5 # residual sd
SimVPC <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item),
fixef=b2, VarCorr=list(SubVC2,ItemVC2), sigma=s2, data=X)
SimPower.VPC<-powerSim(SimVPC,fixed("Condition1", "lr"),
nsim=100, alpha=.045, progress=FALSE)
SimPower.VPC
## Power for predictor 'Condition1', (95% confidence interval):
## 100.0% (96.38, 100.0)
##
## Test: Likelihood ratio
## Effect size for Condition1 is 0.74
##
## Based on 100 simulations, (1 warning, 0 errors)
## alpha = 0.045, nrow = 600
##
## Time elapsed: 0 h 0 m 22 s
Block correlations in Model Fitting
- To block he correlations we need to write them out differently
b2 <- c(0, d.mixed) # fixed intercept and slope
SubVC2a <-.5616
SubVC2b <-.09527
ItemVC2a <-.05256
ItemVC2b <-.11984
s2 <- (.1707)^.5 # residual sd
SimVPCa <- makeLmer(DV ~ Condition1 + (Condition1||Subject)+(Condition1||Item),
fixef=b2, VarCorr=list(SubVC2a,SubVC2b,ItemVC2a,ItemVC2b), sigma=s2, data=X)
SimPower.VPCa<-powerSim(SimVPCa,fixed("Condition1", "lr"),
nsim=100, alpha=.045, progress=FALSE)
SimPower.VPCa
## Power for predictor 'Condition1', (95% confidence interval):
## 100.0% (96.38, 100.0)
##
## Test: Likelihood ratio
## Effect size for Condition1 is 0.74
##
## Based on 100 simulations, (1 warning, 0 errors)
## alpha = 0.045, nrow = 600
##
## Time elapsed: 0 h 0 m 17 s
Simulate Observed Power Directly
- You can simply call the model we analyzed and it will work
SimPower.Direct<-powerSim(MaxModel,fixed("Condition1", "lr"),
nsim=100, alpha=.045, progress=FALSE)
SimPower.Direct
## Power for predictor 'Condition1', (95% confidence interval):
## 100.0% (96.38, 100.0)
##
## Test: Likelihood ratio
## Effect size for Condition1 is 4.8
##
## Based on 100 simulations, (3 warnings, 0 errors)
## alpha = 0.045, nrow = 1200
##
## Time elapsed: 0 h 0 m 15 s
##
## nb: result might be an observed power calculation
- Since the program notices you are calling from real data its warning
you that this may be observed power.
Power Curve
- Lets say we need to figure out how many subjects or items we would
want to collect for a future study based on our pilot study?
- What we need to do is resimulate our experiments, but expand out the
number of subjects or items
- Basically we have to change the sample size (of items or subjects)
and run a monte-carlo simulation each time. So based on the number of
sample sizes you want to test
- This is very computationaly expensive (Slow)
- Take the second example from Brysbaert & Stevens, 2018
- Let’s first just do subjects
- n = 20, 40, 60, 80 [so our simulated data set must be built
with N = 80]
- 20 items
- alpha =.045 (lr testing)
- d = .112
Item <- as.factor(rep(1:20))
Subject <- as.factor(rep(1:80))
Condition1<-rep(-.5:.5)
X <- expand.grid(Subject=Subject,Item=Item, Condition1=Condition1)
b2 <- c(0, .112)
SubVC2 <-matrix(c(.368,0,0,.004), 2)
ItemVC2 <- matrix(c(.068,0,0,.001), 2)
s2 <- (.559)^.5 # residual sd
SimCurve <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item),
fixef=b2, VarCorr=list(SubVC2,ItemVC2), sigma=s2, data=X)
summary(SimCurve)
## Linear mixed model fit by REML ['lmerMod']
## Formula: DV ~ Condition1 + (Condition1 | Subject) + (Condition1 | Item)
## Data: X
##
## REML criterion at convergence: 7542.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0709 -0.6601 0.0064 0.6785 3.1029
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.368 0.60663
## Condition1 0.004 0.06325 0.00
## Item (Intercept) 0.068 0.26077
## Condition1 0.001 0.03162 0.00
## Residual 0.559 0.74766
## Number of obs: 3200, groups: Subject, 80; Item, 20
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.00000 0.09041 0.000
## Condition1 0.11200 0.02826 3.963
##
## Correlation of Fixed Effects:
## (Intr)
## Condition1 0.000
- Using the powerCurve function we can test across
subjects
SCurve1<-powerCurve(SimCurve, fixed("Condition1", "lr"),
along = "Subject",
breaks = c(20,40,60,80),
nsim=100,alpha=.045, progress=FALSE)
plot(SCurve1)
- Test across items, which will calculate for 80 subects, so examine
less items
SCurve2<-powerCurve(SimCurve, fixed("Condition1", "lr"),
along = "Item",
breaks = c(5,10,15),
nsim=100,alpha=.045, progress=FALSE)
plot(SCurve2)
- Using these same parameters (N = 80, Items = 20) Westfall website
yielded a power of .894. The simulation suggested power of 95%, with a
CI = [88.72 - 98.36]. So they seem to agree in this case (but they did
not agree with our first example, but that effect size was huge)
Simulate higher order interactions
- You can simulate more complex models and more complex terms, but you
must map out the matrix more carefully
- You have to estimate fixed slopes (C1 + C2 + C1:C1)
- You have to predefine all your random effects
- if we assume
- For simplicity we can assume zero random correlations using
diag
command.
- subject diag(intercept, C1 slope, C2 slope, C1:C2 slope)
Item <- as.factor(rep(1:20))
Subject <- as.factor(rep(1:20))
C1<-rep(-.5:.5)
C2<-rep(-.5:.5)
# creates "frame" for our data
X <- expand.grid(Subject=Subject,Item=Item, C1=C1,C2=C2)
b3 <- c(0, .05,-.05,.1) # fixed intercept and slope
SubVC3 <-diag(c(.35,.005,.005,.005))
ItemVC3 <-diag(c(.1,.005,.005,.005)) # random intercept and slope variance-covariance matrix
s3 <- (1-(sum(SubVC3)+sum(ItemVC3)))^.5 # residual sd
SimInter <- makeLmer(DV ~ C1*C2 + (C1*C2|Subject)+(C1*C2|Item),
fixef=b3, VarCorr=list(SubVC3,ItemVC3), sigma=s3, data=X)
summary(SimInter)
SimPower.Inter<-powerSim(SimInter,fixed("C1:C2", "lr"),
nsim=100, alpha=.045, progress=FALSE)
SimPower.Inter
References
Brysbaert, M., & Stevens, M. (2018). Power analysis and effect
size in mixed effects models: A tutorial. Journal of Cognition,
1(1).
Fischer, P., Krueger, J. I., Greitemeyer, T., Vogrincic, C.,
Kastenm?ller, A., Frey, D., … & Kainbacher, M. (2011). The
bystander-effect: a meta-analytic review on bystander intervention in
dangerous and non-dangerous emergencies. Psychological
bulletin, 137(4), 517.
Green, P., & MacLeod, C. J. (2016). SIMR: an R package for power
analysis of generalized linear mixed models by simulation. Methods
in Ecology and Evolution, 7(4), 493-498.
Hoenig, J.M. & Heisey, D.M. (2001) The abuse of power: the
pervasive fallacy of power calculations for data analysis. The
American Statistician, 55, 19-24.
Judd, C. M., Westfall, J., & Kenny, D. A. (2017). Experiments
with more than one random factor: Designs, analytic models, and
statistical power. Annual Review of Psychology, 68,
601-625.
Westfall, J., Kenny, D. A., & Judd, C. M. (2014). Statistical
power and optimal design in experiments in which samples of participants
respond to samples of stimuli. Journal of Experimental Psychology:
General, 143(5), 2020-2045.
---
title: 'Power Analysis'
output:
  html_document:
    code_download: yes
    fontsize: 8pt
    highlight: textmate
    number_sections: no
    theme: flatly
    toc: yes
    toc_float:
      collapsed: no
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(cache = TRUE)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning =  FALSE)
knitr::opts_chunk$set(fig.width=3.25)
knitr::opts_chunk$set(fig.height=2.75)
knitr::opts_chunk$set(fig.align='center') 
knitr::opts_chunk$set(results='hold') 
```

```{r, echo=FALSE, warning=FALSE}
library(knitr)
library(kableExtra)
```

\pagebreak

# A Priori Power
- Power = 1 - Type II error (missing an effect when the effect is present). Basically, the probability that our test will reject a false null hypothesis.
- Cohen (1962/1988) said we need to know the **effect size** *in the population* and we need to set an **alpha level** to set a criteria for significance to estimate the sample size needed to achieve a specific power level
    - Effect size can be defined as the distance between our experimental and control distributions: Cohen's $d = \frac{M-\mu}{\sigma}$ [or the percentage of variance explained, $\eta^2 = \frac{d^2}{d^2+4}$] 



![](Mixed/PowerPlot.png){ width=50% }
\


- Power increases as a function of sample size and effect size and lowering the alpha (.05 to .1)
- Below is power for *between-subject* t-test across different sample sizes for different effect sizes

```{r, , echo=FALSE, out.width='.49\\linewidth', fig.width=2.25, fig.height=2.5,fig.show='hold',fig.align='center'}
library(pwr)
library(ggplot2) 
theme_set(theme_bw())
Ind.Find.Power<-function(N,dset){
    pwr.t.test(n = N, d = dset, sig.level = 0.05, power = NULL, 
    type = c("two.sample"))$power
}

N<-seq(5,200,1)
PowerRange.d.2<-mapply(Ind.Find.Power,N,.2)
PowerRange.d.4<-mapply(Ind.Find.Power,N,.4)
PowerRange.d.8<-mapply(Ind.Find.Power,N,.8)
qplot(N,PowerRange.d.2, geom=c("line"),
      main="d = .2, a = .05, BS", xlab="Sample Size Per Group", ylab="Power", ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')

qplot(N,PowerRange.d.4, geom=c("line"),
      main="d = .4, a = .05, BS", xlab="Sample Size Per Group", ylab="Power", ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')

qplot(N,PowerRange.d.8, geom=c("line"),
      main="d = .8, a = .05, BS", xlab="Sample Size Per Group", ylab="Power", ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')
```

- Below is power for *within-subject* t-test across different sample sizes for different effect sizes

```{r, echo=FALSE,out.width='.49\\linewidth', fig.width=2.25, fig.height=2.50,fig.show='hold',fig.align='center'}
Paired.Find.Power<-function(N,dset){
    pwr.t.test(n = N, d = dset, sig.level = 0.05, power = NULL, 
    type = c("paired"))$power
}

N<-seq(5,200,1)
PowerRange.d.2.p<-mapply(Paired.Find.Power,N,.2)
PowerRange.d.4.p<-mapply(Paired.Find.Power,N,.4)
PowerRange.d.8.p<-mapply(Paired.Find.Power,N,.8)
qplot(N,PowerRange.d.2.p, geom=c("line"),
      main="d = .2, a = .05, WS", xlab="Sample Size", ylab="Power", ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')
qplot(N,PowerRange.d.4.p, geom=c("line"),
      main="d = .4, a = .05, WS", xlab="Sample Size", ylab="Power",ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')
qplot(N,PowerRange.d.8.p, geom=c("line"),
      main="d = .8, a = .05, WS", xlab="Sample Size", ylab="Power", ylim=c(0,1))+
  geom_hline(yintercept=.8, color='red')
```

- Below is we solve for the sample size needed if we want to achieve an a priori  power of .80 for a *within-subject* & *between-subject* t-test across different effect sizes

```{r, echo=FALSE, out.width='.49\\linewidth', fig.width=3.00, fig.height=2.5,fig.show='hold',fig.align='center'}
Ind.Find.N<-function(dset){
    pwr.t.test(n = NULL, d = dset, sig.level = 0.05, power = .8, 
    type = c("two.sample"))$n*2
}

Paired.Find.N<-function(dset){
    pwr.t.test(n = NULL, d = dset, sig.level = 0.05, power = .8, 
    type = c("paired"))$n
}

drange<-seq(.2,1,.01)
SampleNeeded<-sapply(drange,Ind.Find.N)
SampleNeededPaired<-sapply(drange,Paired.Find.N)

qplot(drange,SampleNeeded, geom=c("line"),
      main="Power = .8, a = .05, BS", xlab="Cohen's d", ylab="Total Sample Size Needed")
qplot(drange,SampleNeededPaired, geom=c("line"),
      main="Power = .8, a = .05, WS", xlab="Cohen's d", ylab="Total Sample Size Needed")

```


## A Priori Power Problems
- Is it possible to know the population effect size?
    - Might it all really depend on contextual factors (where you collect your data, who collected data, how questions are asked, etc.)?
    - In other words, population effect size will vary as a function of how to conduct your study
- Often we estimated a priori power two ways
    - A) Guess that was medium or small
    - B) Guess it from prior papers or pilot studies (Observed effect sizes)
        - Pilot studies are designed to be **under-powered**  

# Observed Power/Effect Sizes
- This is the process of taking someone else study (or a small sample pilot study you conduct), using the effect size that is found using it for your power analysis
    - A) Using published literature has the problem of only publishing significant effects (which inflates the effect size estimate of the population)
    - B) Using pilot studies with small samples can inflate the effect sizes because in small samples you get a deflated estimate of the standard deviation, which inflates your effect size (see Anderson, Kelley, & Maxwell, 2017 for review on both these issues)

## Observed Power Formulas
- You go to an old published paper on the bystander effect (Campbell, 1974) and you find they have an effect size of hedge's g = 1.798 (which is a corrected Cohen's d) with n = 12 per group (between-subject t-test with an a = .05)

```{r}
ObPower<-pwr.t.test(n = 12, d = 1.798, sig.level = 0.05, 
           power = NULL, type = c("two.sample"))$power
```

- The power you would estimate from their study given these parameters would be *observed power* =  `r round(ObPower,3)`. This is *after the fact* (**post-hoc**).  It is terrible estimate of what their *a priori* power was (Hoenig & Heisey, 2001). It turns out the bystander effect has an effect size of *g* = .35 (Meta-Analysis; Fischer et al, 2011). Given this value what was their a priori power? 

```{r}
aprioriPower<-pwr.t.test(n = 12, d = .35, sig.level = 0.05, 
           power = NULL, type = c("two.sample"))$power
```

- Given N = 12 per group they would have an *a priori power* =  `r round(aprioriPower,3)`.
- How did they get such a large number? 

## Observed Power Simulations
- Using Monte-Carlo simulation methods, here are simulations results of what happens to your effect sizes if you use either pilot (under-powered studies: say Power of .12) or from published literature (which often publishes inflated results, see Ioannidis, 2005)
- The line represented the *median observed effect size - pop effect size / pop effect size* [1K simulations]

![Inflation Rate](Mixed/InflateRate.png){ width=45% }
\

- This results in the problem that you are estimating the sample size too small for your population effect size and thus your replicability rate is lowered


![Replicability rate](Mixed/R.Plot.G.png){ width=40% }
\

- So pilot studies don't look that bad, right? You see the median of 1000 simulations. The problem is look at the range of possible effect sizes. 

```{r, echo=FALSE}
Dsim<-function(Params) {
  TrueD<-Params[[1]]; alpha<-Params[[2]]
  N1<-Params[[3]]; N2<-Params[[4]]
  # Set distrobutions
  Control<-rnorm(N1, mean = 0, sd = 1)
  Exp<-rnorm(N2, mean = TrueD, sd = 1)
  # Run t-est
 Observed.d<-abs((mean(Exp)-mean(Control))/sd(Control))
  return(Observed.d)
}
# Paramaters
d = .35; alpha= .05; N1=12; N2=12; Sim=10000
# Run simulation
Observed.d<-replicate(Sim,Dsim(c(d,alpha,N1,N2)))
qplot(Observed.d, geom = 'density', fill= I("blue"),
      main="Range of Observed d", xlab="Observed Cohen's d", ylab="Density")
```
There is a long tail there which say you can easily think you have a big effect.

```{r, echo=FALSE}
library(caTools)
Dconvert <- density(Observed.d, kernel='cosine') # returns the density data 
aboveG<-round(trapz(Dconvert$x[Dconvert$x >= .35],Dconvert$y[Dconvert$x >= .35])*100,3)
aboveL<-round(trapz(Dconvert$x[Dconvert$x >= .8],Dconvert$y[Dconvert$x >= .8])*100,3)
aboveObd<-round(trapz(Dconvert$x[Dconvert$x >= 1.798],Dconvert$y[Dconvert$x >= 1.798])*100,3)
```

- Given N = 12 per group and assuming no hacking
    - they would had about `r aboveG`% to find a *d* >= .35 (Population).
    - they would had about `r aboveL`% to find a *d* >= .8 (Large).
    - they would had about `r aboveObd`% to find a *d* >= 1.798 (Observed).

- However, as you saw with the replicability rate, you are far better piloting than assuming an effect size from published papers (unless it's a meta-analysis). 

- Take away message is it is hard to know what is real and what is not (repeated replications are needed or meta-analysis)
- In practice what to do?
    - Formula-based approach using estimated effect size (keeping in mind publication bias)
    - Simulation-based approaches 
    
# Formula-based approach 
-  Shiny App: Westfall, Judd & Kenny, 2014 & Judd, Westfall & Kenny, 2017 

1) Estimating a Cohen's d 
$$d = \frac{M_d}{\sqrt(\sigma_{Intercept_{Subj}}^2+\sigma_{Intercept_{item}}^2+\sigma_{Slope_{Subj}}^2+\sigma_{Slope_{Item}}^2+\sigma_{Resid}^2)}   $$

2) Converting your variance components into Variance Partitioning Coefficients (VPC), which just the relative the percentage of component (Variance Term/ Sum Random Variance) 

3) Use his shiny app (or download his code and extract the functions)
- One Crossed Factor: https://jakewestfall.shinyapps.io/crossedpower/ 
- Two Crossed/Nested Factors: https://jakewestfall.shinyapps.io/two_factor_power/


## Example
We will use the simulated data from last week (a full crossed study with 30 subjects and 10 items), but look at only one of the factors on a maximal model. 
- We will use the simulated data from last week
: Simplified model for now: `(1+C1||Subject) + (1+C1||Item)`
- Conditions were effects coded $(-.5, .5)$
- [Download Data](/Mixed/Sim8.csv)

```{r, ech=FALSE}
DataSim8<-read.csv("Mixed/Sim8.csv")
DataSim8$C1<-factor(DataSim8$Condition1)
DataSim8$C2<-as.factor(DataSim8$Condition2)
DataSim8$Item<-as.factor(DataSim8$Item)
DataSim8$Subject<-as.factor(DataSim8$Subject)
```

- Fix the maximal model (but I will block the correlations) for a full crossed design [again we will just ignore Condition 2 for now]

```{r}
library(lme4)
MaxModel<-lmer(DV_SS_RSlope_SSNoise_Items ~ Condition1
              +(1+Condition1||Subject)
              +(1+Condition1||Item),
              data=DataSim8, REML=FALSE)
summary(MaxModel, correlation=FALSE)
```

- We will use the broom package to extract the random effects we need

```{r}
library(broom)
RandomEffect<-tidy(MaxModel, effects = c("ran_pars"), scale="vcov")
```

```{r, echo=FALSE}
kable(RandomEffect) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive", full_width = F))
```

### Generate the estimated *d*

```{r}
Md<-summary(MaxModel)$coef[2]
SDPart<-sqrt(sum(RandomEffect$estimate))
d.mixed = Md / SDPart
```

- We get a *d* = `r round(d.mixed,3)`

### Generate the VPC
```{r}
RandomEffect$VPC<-RandomEffect$estimate/sum(RandomEffect$estimate)
```

```{r, echo=FALSE}
kable(RandomEffect) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

- Using the website, https://jakewestfall.shinyapps.io/crossedpower/ 
    - we find a power of .788 based on these parameters
    - assuming our *d* is correct, the website lets us solve for sample size, so at power .80, it says we need 36 subjects and 10 items  

## Limitations
- Only works for studies where factors have 2 levels, and we meet all the other assumptions (see Brysbaert & Stevens, 2018)
- We are not sure how this method works when you follow parsimonious modeling (as we did last week). We saw last 2 weeks that over specified models produce uninterpretable random effects and also deflate power

# Simulation-based approach
- R allows for the Monte-Carlo method, repeated simulations based on assumptions to find solutions
- Here is a Monte-Carlo solution to t-test power, where I assume each group follows a normal distribution

```{r}
MC.ttest <-function(Params) {
  TrueD<-Params[[1]]; alpha<-Params[[2]]
  N1<-Params[[3]]; N2<-Params[[4]]
  # Set distrobutions
  Control<-rnorm(N1, mean = 0, sd = 1)
  Exp<-rnorm(N2, mean = TrueD, sd = 1)
  # Run t-est
  calcP<-t.test(Exp,Control,
                alternative = c("two.sided"),
                paired = FALSE, var.equal = TRUE,
                conf.level = 1-alpha)$p.value
   result<-calcP
  return(result)
}
# Paramaters
d = .8; alpha= .05; N1=20; N2=20; Sim=1000
# Run simulation
Pvalues<-replicate(Sim,MC.ttest(c(d,alpha,N1,N2)))
Sig<-ifelse(Pvalues <= alpha, 1, 0)
PowerMC<-sum(Sig)/Sim            
```

```{r, echo=FALSE}
PowerForm<-pwr.t.test(n = N1, d = d, sig.level = 0.05, 
           power = NULL, type = c("two.sample"))$power
```

Simulated Power = `r PowerMC` which should be very close to a formula solution `r PowerForm` 

- We would have to build simulations like this for each and every design we wanted to test and for mixed models, this is a lot of programming! 
 
# Simr package
- Simr package makes it easier to do this kind of simulations 
    - The package is not well documented and its sort of buggy (but so far its easiest to use)
- You can use it construct a simulation based on parameters (like Westfall et al.), but it simulates the power or you can base it on a previous model

## Simulation from Parameters
- Let's redo our experiment from above
- This is basically going to give an *observed power*
- Start by specifying the number of subjects (*n* = 30) and items (*n* = 10)

```{r}
library(simr)
Item <- as.factor(rep(1:10))
Subject <- as.factor(rep(1:30))
Condition1<-rep(-.5:.5)
# creates "frame" for our data
X <- expand.grid(Subject=Subject,Item=Item, Condition1=Condition1)
```

```{r, echo=FALSE}
kable(head(X)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```


- Next, we need to specify (we will set the random correlation to be zero for now)
    - Fixed terms
        - Intercept & Condition  [intercept = 10.19, Fixed slope of C1 = 4.77]
    -Random terms (C1 is a slope [effects coded] -.5,.5)    
        - If we did this (1+C1|Subjects) our COV matrix would look like this: 

$$
\mathbf{COV: Subj} = \left[\begin{array}
          {rrr}
          Intercept & cov(IxSlope) \\
          cov(IxSlope) & Slope  \\
          \end{array}\right]
$$

- For Subjects we will do (assume 0 correlation between intercept/slope)

$$
\mathbf{Subject} = \left[\begin{array}
          {rrr}
          23.47 & 0 \\
          0 & 3.98  \\
          \end{array}\right]
$$

- For Items we will do (assume 0 correlation between intercept/slope)

$$
\mathbf{Items} = \left[\begin{array}
          {rrr}
          2.197 & 0 \\
          0 & 5.009  \\
          \end{array}\right]
$$    
        
- Setup our simulations to let the correlations vary around zero: 

```{r}
# fixed intercept and slope
b <- c(10.1912, 4.7773) 
# random intercept and slope variance-covariance matrix
# For Subject
SubVC   <-matrix(c(23.471,0,0,3.98), 2)
# For Items
ItemVC <- matrix(c(2.197,0,0,5.009), 2) 
# Exrtact the residual sd
s <- 7.13^.5 
```

- Next, we need to make a lmer object
- we need to feed in the fixed effects, random effects, residual, and "frame" of the data

```{r}
SimModel <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item), 
                   fixef=b, VarCorr=list(SubVC,ItemVC), sigma=s, data=X)
summary(SimModel)
```

- You will notice the SimModel looks like our real data
- This is the model it is going to parametrically simulate
- Here is an example of 1 dataset it will simulate given these parameters

```{r}
kable(head(getData(SimModel))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

### Plot
- One simulation from simr package based on our parameters

```{r, echo=FALSE, fig.width=8.0, fig.height=3.75,fig.show='hold',fig.align='center'}
library(ggplot2)

Sim1<-getData(SimModel)
Sim1$Subject<-as.factor(Sim1$Subject)
Sim1$Item<-as.factor(Sim1$Item)
Sim1$Condition1<-as.factor(Sim1$Condition1)

theme_set(theme_bw(base_size = 12, base_family = "")) 
bySS.Box <-ggplot(data = Sim1, aes(x = Subject, y=DV))+
  facet_grid(Condition1~.)+
  geom_violin(aes(fill=Subject, color=Subject))+
  geom_boxplot(width=.1)+
  ylab("Response")+xlab("Subject")+
  ggtitle("By Subject") +
  theme(legend.position = "none")
bySS.Box

byItem.Box <-ggplot(data = Sim1, aes(x = Item, y=DV))+
  facet_grid(Condition1~.)+
  geom_violin(aes(fill=Item))+
  geom_boxplot(width=.1)+
  ylab("Response")+xlab("Item")+
  ggtitle("By Item") +
  theme(legend.position = "none")
byItem.Box
```

## Monte-Carlo Simulation 
- What we want to do is simulate this 100 times to keep the computational time short (but 1000 is the default)
- But we need to figure out what we want to test about our model and how
    - We want to test fixed or random effects? Let's just worry about fixed
    - Which pvalues do we want to use?  
        - z, t, F scores (z,t,f), Likelihood ratio test (lr), Kenward-Roger (kr), or Parametric bootstrap (pb)?
- I will tell it to give me the observed power for the fixed effect of Condition using lr method (to be faster, but with an alpha .045, since we have seen in the last few weeks that Likelihood ratio testing tends to yield type I error slighly above .05)  

```{r}
SimPower1<-powerSim(SimModel,fixed("Condition1", "lr"),
                    nsim=100, alpha=.045, progress=FALSE)
SimPower1
```

- Obviously, we have high power, because this is *Observed Power*

## Using VPC over raw scores
- We can use our VPC measures we calculated and get the same answer

```{r}
b2 <- c(0, d.mixed) # fixed intercept and slope
SubVC2   <-matrix(c(.5616,0,0,.09527), 2)
ItemVC2 <- matrix(c(.05256,0,0,.11984), 2) 
s2 <- (.1707)^.5 # residual sd 
SimVPC <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item), 
                   fixef=b2, VarCorr=list(SubVC2,ItemVC2), sigma=s2, data=X)
SimPower.VPC<-powerSim(SimVPC,fixed("Condition1", "lr"),
                       nsim=100, alpha=.045, progress=FALSE)
SimPower.VPC
```

### Block correlations in Model Fitting
- To block he correlations we need to write them out differently

```{r}
b2 <- c(0, d.mixed) # fixed intercept and slope
SubVC2a   <-.5616
SubVC2b   <-.09527
ItemVC2a  <-.05256
ItemVC2b  <-.11984
s2 <- (.1707)^.5 # residual sd 
SimVPCa <- makeLmer(DV ~ Condition1 + (Condition1||Subject)+(Condition1||Item), 
                   fixef=b2, VarCorr=list(SubVC2a,SubVC2b,ItemVC2a,ItemVC2b), sigma=s2, data=X)
SimPower.VPCa<-powerSim(SimVPCa,fixed("Condition1", "lr"),
                       nsim=100, alpha=.045, progress=FALSE)
SimPower.VPCa
```

## Simulate Observed Power Directly
- You can simply call the model we analyzed and it will work

```{r}
SimPower.Direct<-powerSim(MaxModel,fixed("Condition1", "lr"), 
                          nsim=100, alpha=.045, progress=FALSE)
SimPower.Direct
```

- Since the program notices you are calling from real data its warning you that this may be observed power.

## Power Curve
- Lets say we need to figure out how many subjects or items we would want to collect for a future study based on our pilot study?
    - What we need to do is resimulate our experiments, but expand out the number of subjects or items
    - Basically we have to change the sample size (of items or subjects) and run a monte-carlo simulation each time. So based on the number of sample sizes you want to test 
        - This is very computationaly expensive (Slow)

- Take the second example from Brysbaert & Stevens, 2018      
- Let's first just do subjects 
    - *n* =  20, 40, 60, 80  [so our simulated data set must be built with N = 80]
    - 20 items
    - alpha =.045 (lr testing) 
    - *d* = .112

```{r}
Item <- as.factor(rep(1:20))
Subject <- as.factor(rep(1:80))
Condition1<-rep(-.5:.5)
X <- expand.grid(Subject=Subject,Item=Item, Condition1=Condition1)
b2 <- c(0, .112) 
SubVC2   <-matrix(c(.368,0,0,.004), 2)
ItemVC2 <- matrix(c(.068,0,0,.001), 2) 
s2 <- (.559)^.5 # residual sd 
SimCurve <- makeLmer(DV ~ Condition1 + (Condition1|Subject)+(Condition1|Item), 
                   fixef=b2, VarCorr=list(SubVC2,ItemVC2), sigma=s2, data=X)
summary(SimCurve)
```

- Using the *powerCurve* function we can test across subjects 

```{r}
SCurve1<-powerCurve(SimCurve, fixed("Condition1", "lr"),
                    along = "Subject",
                    breaks = c(20,40,60,80),
                    nsim=100,alpha=.045, progress=FALSE)
plot(SCurve1)
```

- Test across items, which will calculate for 80 subects, so examine less items

```{r}
SCurve2<-powerCurve(SimCurve, fixed("Condition1", "lr"),
                    along = "Item",
                    breaks = c(5,10,15),
                    nsim=100,alpha=.045, progress=FALSE)
plot(SCurve2)
```


- Using these same parameters (N = 80, Items = 20) Westfall website yielded a power of .894. The simulation suggested power of 95%, with a CI = [88.72 - 98.36]. So they seem to agree in this case (but they did not agree with our first example, but that effect size was huge)

## Simulate higher order interactions
- You can simulate more complex models and more complex terms, but you must map out the matrix more carefully
- You have to estimate fixed slopes (C1 + C2 + C1:C1)
- You have to predefine all your random effects
- if we assume 
    - For simplicity we can assume zero random correlations using `diag` command. 
        - subject diag(intercept, C1 slope, C2 slope, C1:C2 slope) 

```{r, eval=FALSE}
Item <- as.factor(rep(1:20))
Subject <- as.factor(rep(1:20))
C1<-rep(-.5:.5)
C2<-rep(-.5:.5)
# creates "frame" for our data
X <- expand.grid(Subject=Subject,Item=Item, C1=C1,C2=C2)

b3 <- c(0, .05,-.05,.1) # fixed intercept and slope
SubVC3  <-diag(c(.35,.005,.005,.005))
ItemVC3 <-diag(c(.1,.005,.005,.005)) # random intercept and slope variance-covariance matrix
s3 <- (1-(sum(SubVC3)+sum(ItemVC3)))^.5 # residual sd 
SimInter <- makeLmer(DV ~ C1*C2 + (C1*C2|Subject)+(C1*C2|Item), 
                   fixef=b3, VarCorr=list(SubVC3,ItemVC3), sigma=s3, data=X)
summary(SimInter)

SimPower.Inter<-powerSim(SimInter,fixed("C1:C2", "lr"),
                       nsim=100, alpha=.045, progress=FALSE)
SimPower.Inter
```

# Notes
- Learning the simr package is much easier than learning to construct your own function to conduct power analysis for mixed models
- There are many parameters to estimate and Brysbaert & Stevens (2018) suggest working from pilot data to estimate these parameters before moving forward. Otherwise we have know way to guess what is going on.  There is currently no work that I know of that shows how pilot studies using mixed models will predict future effect size rates/power. 
- Merging Westfalls et al.'s method to get VPC and estimate d with this monte-carlo approach seems useful. 


# References

Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. *Journal of Cognition*, 1(1).

Fischer, P., Krueger, J. I., Greitemeyer, T., Vogrincic, C., Kastenm?ller, A., Frey, D., ... & Kainbacher, M. (2011). The bystander-effect: a meta-analytic review on bystander intervention in dangerous and non-dangerous emergencies. *Psychological bulletin*, 137(4), 517.

Green, P., & MacLeod, C. J. (2016). SIMR: an R package for power analysis of generalized linear mixed models by simulation. *Methods in Ecology and Evolution*, 7(4), 493-498.

Hoenig, J.M. & Heisey, D.M. (2001) The abuse of power: the pervasive fallacy of power calculations for data analysis. *The American Statistician*, 55, 19-24.

Judd, C. M., Westfall, J., & Kenny, D. A. (2017). Experiments with more than one random factor: Designs,
analytic models, and statistical power. *Annual Review of Psychology*, 68, 601-625. 

Westfall, J., Kenny, D. A., & Judd, C. M. (2014). Statistical power and optimal design in experiments in
which samples of participants respond to samples of stimuli. *Journal of Experimental Psychology: General*,
143(5), 2020-2045.

<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-90415160-1', 'auto');
  ga('send', 'pageview');

</script>
