'Re: [R-sig-ME] Understanding variance components'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       r-sig-mixed-models
Subject:    Re: [R-sig-ME] Understanding variance components
From:       "Chen, Gang (NIH/NIMH) [C]" <gangchen () mail ! nih ! gov>
Date:       2017-01-22 21:48:31
Message-ID: 2DF23039AABC714B865A63F67141C0945A49EB6E () msgb06 ! nih ! gov
[Download RAW message or body]

Thanks René for the comment. I'm still puzzled by the fact that the variance \
decomposition cannot seem to be directly reconciled through the two models \
themselves, and I hope that someone can offer a better way to interpret this.

Gang
________________________________
From: René [bimonosom@gmail.com]
Sent: Thursday, January 19, 2017 10:55 AM
To: Chen, Gang (NIH/NIMH) [C]
Cc: Henrik Singmann; r-sig-mixed-models@r-project.org
Subject: Re: [R-sig-ME] Understanding variance components

Hey Gang, hey Henrik (!) :)

very insightful query, hope it is okay that I briefly join it, on the last question.
Variance is a result of division of squared deviations by sample size. In the \
complete sample, this means divided by 240; and in the gender samples by 120 each. \
Have a look on the reversed equations for obtaining the overall variance by combining \
gender groups, weighted by sample size:

(120 * 0.9944589   +   4.137316 * 120)/240 = overall variance (should be equal to)  = \
(0.9944589 + 4.137316)/2   acording to your equation gang left hand can be written \
as: 120 * (0.9944589   +   4.137316) / 240
which is:
(0.9944589   +   4.137316)/2

so nothing to worry, I think :)

Best wishes,
René

2017-01-19 16:14 GMT+01:00 Chen, Gang (NIH/NIMH) [C] \
<gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov>>: Nice, Henrik!

One thing we need to resolve, though, is this.

====================
The variances for model m1:

(1) intercept

summary(m1)$varcor$id[1]
[1] 2.565887

(2) residuals

attr(summary(m1)$varcor, "sc")^2
[1] 2.196212

====================
And the variances for model m2:

(1) female

summary(m2)$varcor$id[1]
[1] 0.9944589

(2) male

summary(m2)$varcor$id.1[1]
[1] 4.137316

(3) residuals

attr(summary(m2)$varcor, "sc")^2
[1] 2.196212
====================

It’s great to see that the residual variance matches between the two models m1 and \
m2. However, the intercept variance in m1, 2.565887, is not equal to the sum of \
female and male variances, 0.9944589 + 4.137316 = 5.131775. However, if we divide the \
total variance of female and male by 2, we have (0.9944589 + 4.137316)/2 = 2.565887. \
Why is that?

If we code the two groups as

obk.long$gender_F <- sqrt(2)*as.numeric(obk.long$gender == "F")
obk.long$gender_M <- sqrt(2)*as.numeric(obk.long$gender == "M”)

then we have the desired result,

m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long)
summary(m2)$varcor$id[1]+summary(m2)$varcor$id.1[1]
[1] 2.565888

Even though the variance part is reconciled, I cannot come up with a good explanation \
as to why this coding strategy is required. Any thought?

Thanks,
Gang

On Jan 19, 2017, at 6:13 AM, Henrik Singmann \
<singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch><mailto:singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch>>> \
wrote:

Hi Gang,

I have an idea which is based on the last example given on ?lmer:
## Fit sex-specific variances by constructing numeric dummy variables
...

I am not sure if this is entirely correct, but it looks good to me. If not, hopefully \
someone more knowledgeable will jump in.

## Original model without gender specific variance:
data(obk.long, package = "afex")

m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long)
summary(m1)$varcor
## Groups   Name        Std.Dev.
## id       (Intercept) 1.6018
## Residual             1.4820
REMLcrit(m1)
## [1] 911.1599

## to get gender specific vari8ances, we construct two dummy variables:
obk.long$gender_F <- as.numeric(obk.long$gender == "F")
obk.long$gender_M <- as.numeric(obk.long$gender == "M")
m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long)
summary(m2)$varcor
## Groups   Name     Std.Dev.
## id       gender_F 0.99723
## id.1     gender_M 2.03404
## Residual          1.48196
REMLcrit(m2)
## [1] 908.297

So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left \
two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), \
ranef(m1)$id) ##      gender_F   gender_M rep(NA, 16) (Intercept)
## 1   0.0000000 -3.0986753          NA  -3.0351426
## 2   0.0000000 -2.1328544          NA  -2.0891242
## 3   0.0000000  0.1207276          NA   0.1182523
## 4  -0.9806229  0.0000000          NA  -1.0642708
## 5  -0.3995130  0.0000000          NA  -0.4335918
## 6   0.0000000  2.6962499          NA   2.6409683
## 7   0.0000000  1.4084888          NA   1.3796103
## 8  -0.3995130  0.0000000          NA  -0.4335918
## 9  -0.6900680  0.0000000          NA  -0.7489313
## 10  0.0000000  0.4426679          NA   0.4335918
## 11  0.0000000 -1.1670336          NA  -1.1431057
## 12  0.0000000  1.7304291          NA   1.6949498
## 13  1.3438166  0.0000000          NA   1.4584452
## 14 -0.6900680  0.0000000          NA  -0.7489313
## 15  0.4721518  0.0000000          NA   0.5124267
## 16  1.3438166  0.0000000          NA   1.4584452

And, finally, the same for the fixed effects:
fixef(m1)
##       (Intercept)           genderM         phasepost
##      6.000000e+00      7.500000e-01     -6.250000e-01
##          phasepre genderM:phasepost  genderM:phasepre
##     -2.000000e+00     -1.243450e-15     -1.687539e-15
fixef(m2)
##      (Intercept)           genderM         phasepost
##     6.000000e+00      7.500000e-01     -6.250000e-01
##         phasepre genderM:phasepost  genderM:phasepre
##    -2.000000e+00      1.829648e-14      1.820766e-14

Hope that helps,
Henrik

On Jan 18, 2017, at 5:18 PM, Chen, Gang (NIH/NIMH) [C] \
<gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov><mailto:gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov>>> \
wrote:

Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I \
posted the question, I realized that my question was silly! Once I laid out the LME \
model equation, my original confusion was resolved.

Actually I meant to ask a slightly different question. Let me use the dataset \
embedded in your ‘afex’ package as an example:

data(obk.long, package = "afex”)

Suppose that my base model is

lmer(value ~ gender*phase+(1|id), data=obk.long)

Is there a way to specify a different variance for each gender in one model?

Thanks,
Gang

On Jan 13, 2017, at 12:06 PM, Henrik Singmann \
<singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch><mailto:singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch>>> \
wrote:

Hi Gang,

Sorry that I so am late to the party, but in case you are still interested I will \
reply (and, of course, for the archive).

The answer is basically given in the old faq:
http://glmm.wikidot.com/faq#toc27

(1|site/block) = (1|site)+(1|site:block)

Which is exactly what is given in your output. A random intercept for Worker and a \
random intercept for each worker:Machine interaction.

To answer your questions. The random intercepts do not have base or reference levels. \
They are increments or decrements to the overall intercept for each level of Worker \
or the Machine:Worker combination. The reported variance is the estimated variance of \
these increments, which is most likely unequal to the actual variance you would \
obtain by calculating it from the estimated increments, which are sometimes called \
BLUPs (I wonder if a better term for those exist).

Hope that helps,
Henrik

PS: Belated Happy New Year to everyone.

Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R:

library(lme4)
data(Machines,package="nlme")
mydata <- Machines[Machines$Machine!='C’,]

With the following model:

print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var")

I have the variance components as shown below:

Random effects:
Groups         Name        Variance
Machine:Worker (Intercept) 46.00
Worker         (Intercept) 13.84
Residual                    1.16

I have trouble understanding exactly what the first two components are: \
Machine:Worker and Worker? Specifically,

1) What is the variance for Worker: corresponding to the base (or reference) level of \
the factor Machine? If so, what is the base level: the first level in the dataset or \
alphabetically the first level (it happens to be the same in this particular \
dataset)?

2) What is the variance for Machine:Worker? Is it the variance for the second level \
of the factor Machine, or the extra variance relative to the variance for Worker?

Furthermore, for the model:

print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var")

what is the variance for Machine:Worker in the following result since there are 3 \
levels involved in the factor Machine?

Random effects:
Groups         Name        Variance
Machine:Worker (Intercept) 60.2972
Worker         (Intercept)  7.3959
Residual                    0.9246

Thanks,
Gang
_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org><mailto:R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org>> \
mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

	[[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

[prev in list] [next in list] [prev in thread] [next in thread]