[prev in list] [next in list] [prev in thread] [next in thread]
List: r-sig-mixed-models
Subject: Re: [R-sig-ME] Understanding variance components
From: "Chen, Gang (NIH/NIMH) [C]" <gangchen () mail ! nih ! gov>
Date: 2017-01-22 21:48:31
Message-ID: 2DF23039AABC714B865A63F67141C0945A49EB6E () msgb06 ! nih ! gov
[Download RAW message or body]
Thanks René for the comment. I'm still puzzled by the fact that the variance \
decomposition cannot seem to be directly reconciled through the two models \
themselves, and I hope that someone can offer a better way to interpret this.
Gang
________________________________
From: René [bimonosom@gmail.com]
Sent: Thursday, January 19, 2017 10:55 AM
To: Chen, Gang (NIH/NIMH) [C]
Cc: Henrik Singmann; r-sig-mixed-models@r-project.org
Subject: Re: [R-sig-ME] Understanding variance components
Hey Gang, hey Henrik (!) :)
very insightful query, hope it is okay that I briefly join it, on the last question.
Variance is a result of division of squared deviations by sample size. In the \
complete sample, this means divided by 240; and in the gender samples by 120 each. \
Have a look on the reversed equations for obtaining the overall variance by combining \
gender groups, weighted by sample size:
(120 * 0.9944589 + 4.137316 * 120)/240 = overall variance (should be equal to) = \
(0.9944589 + 4.137316)/2 acording to your equation gang left hand can be written \
as: 120 * (0.9944589 + 4.137316) / 240
which is:
(0.9944589 + 4.137316)/2
so nothing to worry, I think :)
Best wishes,
René
2017-01-19 16:14 GMT+01:00 Chen, Gang (NIH/NIMH) [C] \
<gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov>>: Nice, Henrik!
One thing we need to resolve, though, is this.
====================
The variances for model m1:
(1) intercept
summary(m1)$varcor$id[1]
[1] 2.565887
(2) residuals
attr(summary(m1)$varcor, "sc")^2
[1] 2.196212
====================
And the variances for model m2:
(1) female
summary(m2)$varcor$id[1]
[1] 0.9944589
(2) male
summary(m2)$varcor$id.1[1]
[1] 4.137316
(3) residuals
attr(summary(m2)$varcor, "sc")^2
[1] 2.196212
====================
It’s great to see that the residual variance matches between the two models m1 and \
m2. However, the intercept variance in m1, 2.565887, is not equal to the sum of \
female and male variances, 0.9944589 + 4.137316 = 5.131775. However, if we divide the \
total variance of female and male by 2, we have (0.9944589 + 4.137316)/2 = 2.565887. \
Why is that?
If we code the two groups as
obk.long$gender_F <- sqrt(2)*as.numeric(obk.long$gender == "F")
obk.long$gender_M <- sqrt(2)*as.numeric(obk.long$gender == "M”)
then we have the desired result,
m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long)
summary(m2)$varcor$id[1]+summary(m2)$varcor$id.1[1]
[1] 2.565888
Even though the variance part is reconciled, I cannot come up with a good explanation \
as to why this coding strategy is required. Any thought?
Thanks,
Gang
On Jan 19, 2017, at 6:13 AM, Henrik Singmann \
<singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch><mailto:singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch>>> \
wrote:
Hi Gang,
I have an idea which is based on the last example given on ?lmer:
## Fit sex-specific variances by constructing numeric dummy variables
...
I am not sure if this is entirely correct, but it looks good to me. If not, hopefully \
someone more knowledgeable will jump in.
## Original model without gender specific variance:
data(obk.long, package = "afex")
m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long)
summary(m1)$varcor
## Groups Name Std.Dev.
## id (Intercept) 1.6018
## Residual 1.4820
REMLcrit(m1)
## [1] 911.1599
## to get gender specific vari8ances, we construct two dummy variables:
obk.long$gender_F <- as.numeric(obk.long$gender == "F")
obk.long$gender_M <- as.numeric(obk.long$gender == "M")
m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long)
summary(m2)$varcor
## Groups Name Std.Dev.
## id gender_F 0.99723
## id.1 gender_M 2.03404
## Residual 1.48196
REMLcrit(m2)
## [1] 908.297
So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left \
two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), \
ranef(m1)$id) ## gender_F gender_M rep(NA, 16) (Intercept)
## 1 0.0000000 -3.0986753 NA -3.0351426
## 2 0.0000000 -2.1328544 NA -2.0891242
## 3 0.0000000 0.1207276 NA 0.1182523
## 4 -0.9806229 0.0000000 NA -1.0642708
## 5 -0.3995130 0.0000000 NA -0.4335918
## 6 0.0000000 2.6962499 NA 2.6409683
## 7 0.0000000 1.4084888 NA 1.3796103
## 8 -0.3995130 0.0000000 NA -0.4335918
## 9 -0.6900680 0.0000000 NA -0.7489313
## 10 0.0000000 0.4426679 NA 0.4335918
## 11 0.0000000 -1.1670336 NA -1.1431057
## 12 0.0000000 1.7304291 NA 1.6949498
## 13 1.3438166 0.0000000 NA 1.4584452
## 14 -0.6900680 0.0000000 NA -0.7489313
## 15 0.4721518 0.0000000 NA 0.5124267
## 16 1.3438166 0.0000000 NA 1.4584452
And, finally, the same for the fixed effects:
fixef(m1)
## (Intercept) genderM phasepost
## 6.000000e+00 7.500000e-01 -6.250000e-01
## phasepre genderM:phasepost genderM:phasepre
## -2.000000e+00 -1.243450e-15 -1.687539e-15
fixef(m2)
## (Intercept) genderM phasepost
## 6.000000e+00 7.500000e-01 -6.250000e-01
## phasepre genderM:phasepost genderM:phasepre
## -2.000000e+00 1.829648e-14 1.820766e-14
Hope that helps,
Henrik
On Jan 18, 2017, at 5:18 PM, Chen, Gang (NIH/NIMH) [C] \
<gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov><mailto:gangchen@mail.nih.gov<mailto:gangchen@mail.nih.gov>>> \
wrote:
Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I \
posted the question, I realized that my question was silly! Once I laid out the LME \
model equation, my original confusion was resolved.
Actually I meant to ask a slightly different question. Let me use the dataset \
embedded in your ‘afex’ package as an example:
data(obk.long, package = "afex”)
Suppose that my base model is
lmer(value ~ gender*phase+(1|id), data=obk.long)
Is there a way to specify a different variance for each gender in one model?
Thanks,
Gang
On Jan 13, 2017, at 12:06 PM, Henrik Singmann \
<singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch><mailto:singmann@psychologie.uzh.ch<mailto:singmann@psychologie.uzh.ch>>> \
wrote:
Hi Gang,
Sorry that I so am late to the party, but in case you are still interested I will \
reply (and, of course, for the archive).
The answer is basically given in the old faq:
http://glmm.wikidot.com/faq#toc27
(1|site/block) = (1|site)+(1|site:block)
Which is exactly what is given in your output. A random intercept for Worker and a \
random intercept for each worker:Machine interaction.
To answer your questions. The random intercepts do not have base or reference levels. \
They are increments or decrements to the overall intercept for each level of Worker \
or the Machine:Worker combination. The reported variance is the estimated variance of \
these increments, which is most likely unequal to the actual variance you would \
obtain by calculating it from the estimated increments, which are sometimes called \
BLUPs (I wonder if a better term for those exist).
Hope that helps,
Henrik
PS: Belated Happy New Year to everyone.
Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R:
library(lme4)
data(Machines,package="nlme")
mydata <- Machines[Machines$Machine!='C’,]
With the following model:
print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var")
I have the variance components as shown below:
Random effects:
Groups Name Variance
Machine:Worker (Intercept) 46.00
Worker (Intercept) 13.84
Residual 1.16
I have trouble understanding exactly what the first two components are: \
Machine:Worker and Worker? Specifically,
1) What is the variance for Worker: corresponding to the base (or reference) level of \
the factor Machine? If so, what is the base level: the first level in the dataset or \
alphabetically the first level (it happens to be the same in this particular \
dataset)?
2) What is the variance for Machine:Worker? Is it the variance for the second level \
of the factor Machine, or the extra variance relative to the variance for Worker?
Furthermore, for the model:
print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var")
what is the variance for Machine:Worker in the following result since there are 3 \
levels involved in the factor Machine?
Random effects:
Groups Name Variance
Machine:Worker (Intercept) 60.2972
Worker (Intercept) 7.3959
Residual 0.9246
Thanks,
Gang
_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org><mailto:R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org>> \
mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models@r-project.org<mailto:R-sig-mixed-models@r-project.org> mailing \
list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic