List: r-sig-mixed-models
Subject: [R-sig-ME] Interpretation of lmer output in R
From: Julia Sommerfeld <Julia.Sommerfeld () utas ! edu ! au>
Date: 2011-02-21 19:44:34
Message-ID: AANLkTin4j3qKxt_5HyCX0EKh0d0etNnHEFyKdySfs7ZU () mail ! gmail ! com
not sure if my email came through...
-------
Dear Douglas and list member,
Thank you heaps for your answers. The interpretation of the summary output
(lmer) is becoming much clearer now. I have to admit I had a slightly (not
to say HUGE) different idea of the summary output.
But a few questions still remain...
I have tried out the suggestions with the following results:
*1. I tested whether "Sex" is an important factor in the model:*
fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
fm1<-lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
anova(fm1,fm)
Data:
Models:
fm1: SameSite ~ BreedSuc1 + (1 | Bird)
fm: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm1  3 75.518 81.485 -34.759
fm   4 77.379 85.335 -34.690 0.1387      1     0.7096
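The Chisq and Pr(>Chisq) columns can be reproduced by hand from the logLik and Df columns. The following is only a sketch using the rounded values printed above (the variable names are made up for illustration), so the results agree with the table only to about two or three decimals:

```r
# Values copied from the anova(fm1, fm) table above (rounded as printed)
loglik_fm1 <- -34.759; df_fm1 <- 3   # model without Sex
loglik_fm  <- -34.690; df_fm  <- 4   # model with Sex

# Likelihood ratio statistic: twice the difference in log-likelihood
lrt_stat <- 2 * (loglik_fm - loglik_fm1)

# Degrees of freedom: difference in the number of parameters
lrt_df <- df_fm - df_fm1

# Upper tail of the chi-squared distribution gives the p-value
p_value <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

print(c(Chisq = lrt_stat, Df = lrt_df, p = p_value))
```

A large p-value here means the data give little evidence that adding Sex improves the fit.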
*2. Since Sex is not "important", I compared fm1 with fm2 to test whether
BreedSuc1 is an important factor:*
fm2<-lmer(SameSite ~ 1 + (1|Bird), family="binomial")
anova(fm2,fm1)
Data:
Models:
fm2: SameSite ~ 1 + (1 | Bird)
fm1: SameSite ~ BreedSuc1 + (1 | Bird)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm2  2 77.617 81.595 -36.808
fm1  3 75.518 81.485 -34.759 4.0991      1    0.04291 *
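For reference, the AIC and BIC columns follow from the log-likelihood, the number of parameters (Df) and the number of observations (54, as reported in the model summaries). A small sketch with the rounded values from the table; the helper names are hypothetical and chosen so they do not clash with R's built-in AIC() and BIC():

```r
n_obs <- 54  # number of observations reported by summary()

# Rounded values copied from the anova(fm2, fm1) table above
loglik_fm2 <- -36.808; k_fm2 <- 2
loglik_fm1 <- -34.759; k_fm1 <- 3

# AIC penalises each parameter by 2, BIC by log(n)
aic_from_loglik <- function(loglik, k) -2 * loglik + 2 * k
bic_from_loglik <- function(loglik, k, n) -2 * loglik + k * log(n)

aic_values <- c(fm2 = aic_from_loglik(loglik_fm2, k_fm2),
                fm1 = aic_from_loglik(loglik_fm1, k_fm1))
bic_values <- c(fm2 = bic_from_loglik(loglik_fm2, k_fm2, n_obs),
                fm1 = bic_from_loglik(loglik_fm1, k_fm1, n_obs))

print(rbind(AIC = aic_values, BIC = bic_values))
```

Smaller values are preferred under both criteria. Here AIC favours fm1 by about 2 units, while the two BIC values are almost identical.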
BUT: Ben (thanks for the input) disagreed here:
*"I'm afraid I have to disagree with Doug here. This kind of model
reduction, while seemingly sensible (and not as abusive as large-scale,
automated stepwise regression), is a mild form of data snooping. Don't do it
..." *
If I don't do it, what could be an alternative? Simply rely on the output of
fm1 without looking at fm2?
*3. From the output above, I conclude that BreedSuc1 has an effect on
SameSite:*
summary(fm1)
Generalized linear mixed model fit by the Laplace approximation
Formula: SameSite ~ BreedSuc1 + (1 | Bird)
   AIC   BIC logLik deviance
 75.52 81.48 -34.76    69.52
Random effects:
 Groups Name        Variance Std.Dev.
 Bird   (Intercept) 0.12332  0.35117
Number of obs: 54, groups: Bird, 46
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.2135     0.3794  -0.563   0.5736
BreedSuc11    1.1831     0.5921   1.998   0.0457 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
           (Intr)
BreedSuc11 -0.638
*Now, again the interpretation of the Fixed effects:*
BreedSuc1 has a significant effect on SameSite.
Site-fidelity (SameSite=0) of a "bird" (bird(!), since I dropped "Sex"?)
that was unsuccessful in breeding in the previous season (BreedSuc1=0)
corresponds to a probability of *44%:*
plogis(-0.2135)
[1] 0.4468268
Further, site-fidelity (SameSite=0) of "bird" who was successful in
breeding in the previous season (BreedSuc1=1), corresponds to a probability
of *72%:*
plogis(-0.2135 + 1.1831)
[1] 0.7250398
Is my interpretation correct? So this output is always based on SameSite=0?
(SameSite=0 means that birds did NOT change nest site, i.e. they show high
site-fidelity). But what if SameSite=1 would mean high site-fidelity (no
change of nests?)?
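On the coding question: for a 0/1 response, a binomial model in R describes the probability that the response equals 1. So, assuming SameSite was entered as 0/1 with the coding given in the original message quoted further down (SameSite=1 means the bird used the same site), the 44% and 72% are probabilities of SameSite=1, and the probability of SameSite=0 is one minus those values. The sketch below uses only the fm1 estimates printed above; the variable names are made up for illustration:

```r
# Fixed-effect estimates from summary(fm1), on the log-odds scale
b0 <- -0.2135   # (Intercept): unsuccessful birds
b1 <-  1.1831   # BreedSuc11: shift for successful birds

# Probability that SameSite = 1 in each group
p_one_unsuccessful <- plogis(b0)
p_one_successful   <- plogis(b0 + b1)

# Probability that SameSite = 0 is the complement, which is the same
# as flipping the sign of the log-odds
p_zero_unsuccessful <- 1 - p_one_unsuccessful
p_zero_successful   <- 1 - p_one_successful

# exp(b1) is the odds ratio of SameSite = 1, successful vs unsuccessful
odds_ratio <- exp(b1)

print(c(p_one_unsuccessful, p_one_successful,
        p_zero_unsuccessful, p_zero_successful, odds_ratio))
```

If the response were recoded so that 0 and 1 swap meaning, every coefficient would change sign and the fitted probabilities would become these complements, so the conclusions would be the same.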
*4. If I don't drop the term "Sex":*
summary(fm)
Generalized linear mixed model fit by the Laplace approximation
Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
   AIC   BIC logLik deviance
 77.38 85.34 -34.69    69.38
Random effects:
 Groups Name        Variance Std.Dev.
 Bird   (Intercept) 0.14080  0.37524
Number of obs: 54, groups: Bird, 46
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.3294     0.4890  -0.674   0.5006
BreedSuc11    1.1988     0.5957   2.012   0.0442 *
SexM          0.2215     0.5877   0.377   0.7062
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
           (Intr) BrdS11
BreedSuc11 -0.536
SexM       -0.628  0.065
*The log-odds of site fidelity for a female bird (i.e. SexM is 0) who was
unsuccessful in breeding the previous season (i.e. BreedSuc1 is 0) is
-0.3294, corresponding to a probability of about 42%
> plogis(-0.3294)
[1] 0.4183866
The log-odds of site fidelity for a female bird who was successful in
breeding is -0.3294 + 1.1988, corresponding to a probability of about 70%
> plogis(-0.3294 + 1.1988)
[1] 0.7046208*
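The same calculation can be extended to all four combinations of Sex and BreedSuc1. This is a sketch based only on the fm estimates printed above; the probabilities are for a "typical" bird whose random effect is zero, and the object names are hypothetical:

```r
# Fixed-effect estimates from summary(fm), on the log-odds scale
coefs <- c(intercept = -0.3294, BreedSuc11 = 1.1988, SexM = 0.2215)

# All combinations of the two indicator variables (0 = reference level)
groups <- expand.grid(BreedSuc1 = c(0, 1), SexM = c(0, 1))

# Linear predictor: intercept plus the coefficients that are "switched on"
groups$log_odds <- coefs[["intercept"]] +
  coefs[["BreedSuc11"]] * groups$BreedSuc1 +
  coefs[["SexM"]] * groups$SexM

# Inverse logit turns log-odds into probabilities
groups$probability <- plogis(groups$log_odds)

print(groups)
```

The first two rows reproduce the 42% and 70% for females. The last two rows give the corresponding values for males, which differ only slightly because the SexM estimate is small.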
*5. The z-value: Sorry, I still have some trouble with this value...*
In model fm1 (without "Sex") the z-value of BreedSuc1 is 1.998.
In model fm ("Sex" included) the z-value is 2.012.
Nearly the same value in both models. But what can someone conclude from
"p < 0.05, z = 1.998"? This is what many people write in their results
section (I was told to do the same...).
*"You should see a LRT test statistic close to, but not exactly the same as,
the square of the z value when you compare the models with and without that
term.
This is the sense in which the z-value is an approximation".*
I don't really understand how the z-value can be seen as an approximation?
Am I missing some background knowledge here?
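The quoted statement can be checked numerically with the values already shown above for BreedSuc1 in fm1: the z value from summary(fm1) and the Chisq value from anova(fm2, fm1). This is only a sketch with illustrative variable names:

```r
# Wald test: z value for BreedSuc11 from summary(fm1)
z_value <- 1.998
wald_chisq <- z_value^2                       # square of a standard normal
wald_p <- pchisq(wald_chisq, df = 1, lower.tail = FALSE)

# The same p-value computed directly from the normal distribution
wald_p_normal <- 2 * pnorm(-abs(z_value))

# Likelihood ratio test: Chisq from anova(fm2, fm1)
lrt_chisq <- 4.0991
lrt_p <- pchisq(lrt_chisq, df = 1, lower.tail = FALSE)

print(c(wald_chisq = wald_chisq, lrt_chisq = lrt_chisq,
        wald_p = wald_p, lrt_p = lrt_p))
```

The squared z value (about 3.99) is close to, but not equal to, the LRT statistic (4.0991), and the two p-values (about 0.046 and 0.043) differ accordingly. Both tests address the same null hypothesis; the z test uses only the single fitted model, while the LRT refits the model without the term.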
Again, thanks heaps for your help!!!!
Julia
2011/2/19 Douglas Bates <bates@stat.wisc.edu>
> Thank you for your questions and for transferring the discussion to
> the R-SIG-Mixed-Models mailing list, as we had discussed. I have also
> copied the mailing list for a class on mixed-effects models that I am
> teaching.
>
> I particularly appreciate your desire to learn about the model instead
> of just quoting a p-value. I often lament to my classes that
> statisticians have been far too successful in propagating the idea of
> p-values, to the extent that some researchers believe that is all that
> is needed to learn about an analysis.
>
> On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
> <Julia.Sommerfeld@utas.edu.au> wrote:
> > Dear Douglas and list members,
> >
> > Apologies in advance if you might consider my questions as too simple to
> > be asking the godfather of lme4 for an answer...thus, please feel free to
> > ignore my email or to forward it to someone else.
> >
> > I'm a PhD student (Australia/Germany) working on tropical seabirds. Like
> > many of my PhD colleagues, I'm having some difficulties with the analysis
> > of my data using lmer (family=binomial). While some say: What do you care
> > about all the other values as long as you've got a p-value... I do believe
> > that it is essential to understand WHAT I'm doing here and WHAT all these
> > numbers/values mean.
> >
> > I've read the Chapters (lme4 Book Chapters) and publications about the use
> > of lmer and searched the forums - but I haven't found a satisfying answer.
> > And I have the feeling that 1. the statistics lecture at my university was
> > a joke (sad to say this) and 2. that I need a huge statistical/mathematical
> > background to fully understand GLMMs.
> >
> >
> > One of the questions I would like to answer is:
> > Does previous breeding success influence nest site fidelity?
> >
> > I have binomial data:
> > SameSite=1 means birds use the same site
> >
> > SameSite=0 means birds change nest site
> >
> > BreedSuc1=1 Birds were successful in previous breeding season
> > BreedSuc1=0 Birds were not successful " " "
> >
> > Sex= male, female
> > Bird= Bird ID
> >
> > This is my model:
>
> > fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>
> > where Bird is my random factor (same birds were sampled more than once)
>
> One thing to note is that there are 46 different birds in the 54
> observations. Most birds will have just one observation so a random
> effect for bird may not be necessary.
>
> > summary(fm)
> >
> > Generalized linear mixed model fit by the Laplace approximation
> >
> > Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
> >    AIC   BIC logLik deviance
> >  77.38 85.34 -34.69    69.38
> > Random effects:
> >  Groups Name        Variance Std.Dev.
> >  Bird   (Intercept) 0.14080  0.37524
> > Number of obs: 54, groups: Bird, 46
> >
> > Fixed effects:
> >             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
> > SexM          0.2215     0.5877   0.377   0.7062
>
> This suggests that sex is not an important factor in the model. The
> (Intercept) term is close to zero, relative to its standard error, but
> we would retain it in the model as explained below.
>
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Correlation of Fixed Effects:
> >            (Intr) BrdS11
> > BreedSuc11 -0.536
> > SexM       -0.628  0.065
> >
> >
> > From this summary output I do understand that the Breeding Success has a
> > significant effect on nest-site fidelity (p<0.05).
>
> Yes, but ... this p-value should be used as a guide only. As
> described below, a p-value must be viewed in context. It is not a
> property of the Breeding Success factor; it comes from a comparison of
> two models, and we should bear in mind what these models are before
> interpreting this number.
>
> The interpretation of a p-value for a particular coefficient is that
> it is an approximation to the p-value we would get from comparing the
> model that has been fit to the model fit without this particular
> coefficient. In this case the coefficient corresponds to one of the
> terms in the model and I would advocate performing a likelihood ratio
> test comparing the two models
>
> fm <- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
> # the null hypothesis model
> fm0 <- glmer(SameSite~Sex+(1|Bird), family="binomial")
> anova(fm0, fm)
>
> Even though the function is called anova it will, in this case,
> perform a likelihood ratio test (LRT). It also prints the values of
> AIC and BIC if you prefer to compare models according to one of those
> criteria but I prefer using the likelihood ratio for nested models.
>
> However, before doing that comparison you should ask yourself whether
> you want to compare models that have the apparently unnecessary term
> for Sex in them. The way I would approach the model building is first
> to reduce the model to
>
> fm1 <- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>
> You could then compare
>
> anova(fm1, fm)
>
> which I presume will give a large p-value for the LRT, so we prefer
> the simpler model, fm1. After that, I would compare
>
> fm2 <- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
> anova(fm2, fm1)
>
> to see if the BreedSuc1 factor is an important predictor in its own right.
>
> Note that we don't drop the implicit "(Intercept)" term, even though
> it has a high p-value in the coefficient table. The reason is that
> the interpretation of the (Intercept) coefficient depends on the
> coding of BreedSuc1.
>
> In model fm, the log-odds of site fidelity for a female bird (i.e.
> SexM is 0) who was unsuccessful in breeding the previous season (i.e.
> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
> 42%
>
> > plogis(-0.3294)
> [1] 0.4183866
>
> The log-odds of site fidelity for a female bird who was successful in
> breeding is -0.3294 + 1.1988, corresponding to a probability of about
> 70%
>
> > plogis(-0.3294 + 1.1988)
> [1] 0.7046208
>
> If you had reversed the meaning of BreedSuc to BreedFail, where 0
> indicates no failure at breeding and 1 indicates failure, then the
> coefficient would change sign (i.e. the coefficient for BreedFail
> would be -1.1988) and the intercept would change to
>
> > -0.3294 + 1.1988
> [1] 0.8694
>
> because the reference level would now be a female bird who was
> successful in breeding.
>
> Because the interpretation of the intercept depends upon the coding of
> other factors, we retain it in the model whenever other terms are
> retained.
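The effect of recoding can be illustrated with simulated data. This is only a sketch: the data are made up (not the seabird data), the variable names are hypothetical, and a plain glm() without the random effect for Bird is used so that the example runs in base R:

```r
# Simulated illustration only: 100 unsuccessful and 100 successful birds
set.seed(1)
breed_suc <- rep(c(0, 1), each = 100)
same_site <- rbinom(200, size = 1,
                    prob = plogis(-0.3294 + 1.1988 * breed_suc))

# The reversed coding: 1 = failed, 0 = did not fail
breed_fail <- 1 - breed_suc

# Fit the same logistic regression under both codings
fit_suc  <- glm(same_site ~ breed_suc,  family = binomial)
fit_fail <- glm(same_site ~ breed_fail, family = binomial)

coef_suc  <- unname(coef(fit_suc))    # intercept, slope
coef_fail <- unname(coef(fit_fail))

print(rbind(success_coding = coef_suc, failure_coding = coef_fail))
```

Under the failure coding the slope has the opposite sign and the intercept equals the sum of the intercept and slope from the success coding. The fitted probabilities for the two groups are unchanged, which is why the intercept is kept regardless of its p-value.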
>
>
>
> > But what else can I conclude from this model?
> >
> > Questions:
> >
> > 1. Random effects: What does the Random Effect table - the Variance, Std.
> > Dev. and Intercept - tell me: Is there a random effect that my model has
> > to account for?
>
> First I would remove the apparently unnecessary Sex term then,
> ideally, I would check by comparing the fit of the reduced model to
> that of a GLM without the random effect for Bird. Unfortunately, I
> don't think the definition of deviance for a glm fit is compatible
> with that for a model fit by glmer. This is something we will need to
> fix. For the time being I would instead examine the "caterpillar
> plot" obtained with
>
> dotplot(ranef(fm1, postVar=TRUE))
>
> which represents the 95% prediction intervals for each of the birds.
> If these all overlap zero comfortably I would conclude that the random
> effect is not needed and fit a glm without a random effect for bird.
>
> > Random effects:
> >  Groups Name        Variance Std.Dev.
> >  Bird   (Intercept) 0.14080  0.37524
> > Number of obs: 54, groups: Bird, 46
>
> That estimated standard deviation is fairly large. We would expect a
> range of contributions on the log-odds scale of about +/- 2 sd which,
> at this point of the logistic curve, corresponds to considerable
> variability in predicted probabilities for birds with the same
> characteristics.
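To put numbers on that variability, here is a rough sketch for the reference group in model fm (female, unsuccessful), using the printed intercept and the estimated standard deviation for Bird. The variable names are illustrative:

```r
# Estimates from summary(fm)
b0      <- -0.3294    # log-odds for the reference group
sd_bird <-  0.37524   # standard deviation of the Bird random effect

# Bird-specific shifts of -2 sd, 0 and +2 sd on the log-odds scale
offsets <- c(minus_2sd = -2, typical = 0, plus_2sd = 2) * sd_bird

# Corresponding probabilities for birds with the same characteristics
prob_range <- plogis(b0 + offsets)

print(round(prob_range, 3))
```

With these estimates the probability for otherwise identical birds ranges from roughly 25% to 60%, around the 42% for a typical bird.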
>
> > 2. Fixed Effects: Again the Intercept? Not sure if I understand the
> > meaning of it (sorry, explanation in Chapter I also doesn't help much)
>
> Actually in this model it is a bit different from the models described
> in chapter 1. I hope the explanation above makes sense. Think of it
> as the log-odds of site fidelity for a bird in the "reference group"
> where reference group means that all the other variables are at the
> zero level.
>
> > Fixed effects:
> >             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
> > SexM          0.2215     0.5877   0.377   0.7062
> >
> > 3. Meaning of the z-value? Why should I mention it in the results section?
>
> I would regard the z-value as an approximation. The quantity of
> interest is the likelihood ratio test statistic which has a
> chi-squared distribution under the null hypothesis (i.e. the term can
> be deleted from the model without getting a significantly worse fit).
> It happens that this would be a chi-squared distribution with 1 degree
> of freedom, which corresponds to the square of a standard normal
> distribution. You should see a LRT test statistic close to, but not
> exactly the same as, the square of the z value when you compare the
> models with and without that term. This is the sense in which the
> z-value is an approximation. To me the LRT statistic is more reliable
> because it is based upon actually refitting the model.
>
> > 4. Estimate and Std. Error of the fixed effects? How can I tell from these
> > values WHAT kind of effect (positive, negative?) these parameters have on
> > nest-site fidelity? Do birds that were successful during the previous
> > breeding season show a higher nest-site fidelity? Remember, I have
> > binomial data...
>
> That is described above. If you want the estimate of the site
> fidelity for a bird with certain characteristics, you evaluate the
> corresponding combination of coefficients and apply plogis to the
> result.
>
> > I would highly appreciate your feedback and/or suggestions of
> > papers/chapters I could read for a better understanding of the output.
> >
> > Best regards,
> >
> >
> > Julia
>
> I hope this helps.
>
--
Julia Sommerfeld - PhD Candidate
Institute for Marine and Antarctic Studies
University of Tasmania
Private Bag 129, Hobart
TAS 7001
Phone: +61 458 247 348
Email: julia.somma@gmx.de
Julia.Sommerfeld@utas.edu.au