'[R-sig-ME] Interpretation of lmer output in R'

List:       r-sig-mixed-models
Subject:    [R-sig-ME]  Interpretation of lmer output in R
From:       Julia Sommerfeld <Julia.Sommerfeld () utas ! edu ! au>
Date:       2011-02-21 19:44:34
Message-ID: AANLkTin4j3qKxt_5HyCX0EKh0d0etNnHEFyKdySfs7ZU () mail ! gmail ! com

not sure if my email came through...

-------

Dear Douglas and list members,

Thank you heaps for your answers. The interpretation of the summary output
(lmer) is becoming much clearer now. I have to admit I had a slightly (not
to say HUGE) different idea of the summary output.

But a few questions still remain...

I have tried out the suggestions with the following results:

*1. I tested if "Sex" is an important factor in the model:
*
fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
fm1<-lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
anova(fm1,fm)

Data:
Models:
fm1: SameSite ~ BreedSuc1 + (1 | Bird)
fm: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm1  3 75.518 81.485 -34.759
fm   4 77.379 85.335 -34.690 0.1387      1     0.7096
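
For reference, a minimal sketch of where the Chisq and Pr(>Chisq) columns come from, assuming the usual likelihood ratio test (twice the difference in log-likelihood, compared to a chi-squared distribution with as many degrees of freedom as extra parameters). The variable names are my own and the numbers are copied from the table above, so rounding causes small discrepancies:

```r
# Log-likelihoods and parameter counts copied from the anova(fm1, fm) table
loglik_fm1 <- -34.759   # SameSite ~ BreedSuc1 + (1 | Bird), Df = 3
loglik_fm  <- -34.690   # adds Sex, Df = 4

# LRT statistic: twice the gain in log-likelihood from adding Sex
lrt <- 2 * (loglik_fm - loglik_fm1)

# Upper-tail chi-squared probability with 4 - 3 = 1 degree of freedom
p_from_rounded <- pchisq(lrt, df = 4 - 3, lower.tail = FALSE)

# Same calculation using the Chisq value printed in the table
p_from_table <- pchisq(0.1387, df = 1, lower.tail = FALSE)

print(c(lrt = lrt, p_from_rounded = p_from_rounded, p_from_table = p_from_table))
```

The statistic rebuilt from the rounded log-likelihoods (0.138) differs slightly from the printed 0.1387 because anova() works with unrounded values.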

*2. Since Sex is not "important",  I compared fm1 with fm2 to test if
BreedSuc1 is an important factor:
*
fm2<-lmer(SameSite ~ 1 + (1|Bird), family="binomial")
anova(fm2,fm1)

Data:
Models:
fm2: SameSite ~ 1 + (1 | Bird)
fm1: SameSite ~ BreedSuc1 + (1 | Bird)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm2  2 77.617 81.595 -36.808
fm1  3 75.518 81.485 -34.759 4.0991      1    0.04291 *
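
As a check on the AIC and BIC columns, here is a small sketch assuming the standard definitions AIC = -2*logLik + 2*k and BIC = -2*logLik + k*log(n), with k the Df column and n = 54 the number of observations reported in the model summaries. The helper names are hypothetical, and the inputs are the rounded values from the table:

```r
# Rounded log-likelihoods and parameter counts from the anova(fm2, fm1) table
loglik <- c(fm2 = -36.808, fm1 = -34.759)
k      <- c(fm2 = 2, fm1 = 3)
n_obs  <- 54   # "Number of obs: 54" in the summaries

aic <- -2 * loglik + 2 * k            # penalty of 2 per parameter
bic <- -2 * loglik + k * log(n_obs)   # penalty of log(n) per parameter

print(round(rbind(AIC = aic, BIC = bic), 3))
```

Both criteria are "smaller is better". Here AIC prefers fm1 by about 2 units, while the two BIC values are almost identical because BIC penalises the extra parameter more heavily.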

BUT: Ben (thanks for the input) disagreed here:
*"I'm afraid I have to disagree with Doug here. This kind of model
reduction, while seemingly sensible (and not as abusive as large-scale,
automated stepwise regression), is a mild form of data snooping. Don't do it
..." *

If I don't do it, what could be an alternative? Simply rely on the output of
fm1 without looking at fm2?

*3.  From the output above, I conclude that BreedSuc1 has an effect on
SameSite:*

summary(fm1)

Generalized linear mixed model fit by the Laplace approximation
Formula: SameSite ~ BreedSuc1 + (1 | Bird)
   AIC       BIC         logLik     deviance
 75.52     81.48     -34.76    69.52
Random effects:
 Groups Name            Variance     Std.Dev.
 Bird   (Intercept)         0.12332      0.35117
Number of obs: 54, groups: Bird, 46

Fixed effects:
                    Estimate     Std. Error     z value     Pr(>|z|)
(Intercept)      -0.2135       0.3794          -0.563        0.5736
BreedSuc11    1.1831         0.5921           1.998        0.0457 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
           (Intr)
BreedSuc11 -0.638

*Now, again the interpretation of the Fixed effects:
*
BreedSuc1 has a significant effect on SameSite.

The site fidelity (SameSite=0) of a "bird" (bird(!), since I dropped "Sex"?)
that was unsuccessful in breeding in the previous season (BreedSuc1=0)
corresponds to a probability of about *45%:*
plogis(-0.2135)
[1] 0.4468268

Further, the site fidelity (SameSite=0) of a "bird" that was successful in
breeding in the previous season (BreedSuc1=1) corresponds to a probability
of about *73%:*
plogis(-0.2135 + 1.1831)
[1] 0.7250398

Is my interpretation correct? So this output is always based on SameSite=0?
(SameSite=0 means that birds did NOT change nest site, i.e. they show high
site-fidelity). But what if SameSite=1 would mean high site-fidelity (no
change of nests?)?
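
On the coding question: for a numeric 0/1 response, a binomial fit in R models the probability that the response equals 1, and the original message further down defines SameSite=1 as "birds use the same site". The sketch below uses simulated data (NOT the seabird data) and plain glm() without the random effect for Bird, purely to show what happens when the response coding is reversed. All variable names are hypothetical:

```r
# Simulated illustration only; names and effect sizes are made up
set.seed(1)
n <- 200
breed_suc    <- rbinom(n, 1, 0.5)                              # 1 = successful
same_site    <- rbinom(n, 1, plogis(-0.2 + 1.2 * breed_suc))   # 1 = same site
changed_site <- 1 - same_site                                  # reversed coding

fit_same    <- glm(same_site ~ breed_suc, family = binomial)
fit_changed <- glm(changed_site ~ breed_suc, family = binomial)

# Reversing the response coding flips the sign of every coefficient
print(rbind(same = coef(fit_same), changed = coef(fit_changed)))

# The fitted probabilities for the reference group are complementary
p_same    <- plogis(coef(fit_same)[["(Intercept)"]])
p_changed <- plogis(coef(fit_changed)[["(Intercept)"]])
print(p_same + p_changed)
```

So the numbers from plogis() describe whichever outcome is coded 1. Recoding does not change the fit, only the sign of the coefficients and which outcome the probability refers to.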

*4. if I don't drop the term "Sex":*

summary(fm)

Generalized linear mixed model fit by the Laplace approximation
Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
   AIC   BIC logLik deviance
 77.38 85.34 -34.69    69.38
Random effects:
 Groups Name        Variance     Std.Dev.
 Bird   (Intercept)     0.14080       0.37524
Number of obs: 54, groups: Bird, 46

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.3294     0.4890  -0.674   0.5006
BreedSuc11    1.1988     0.5957   2.012   0.0442 *
SexM          0.2215     0.5877   0.377   0.7062
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
           (Intr) BrdS11
BreedSuc11 -0.536
SexM       -0.628  0.065

*In model fm, the log-odds of site fidelity for a female bird (i.e. SexM is
0) who was unsuccessful in breeding the previous season (i.e. BreedSuc1 is
0) is -0.3294, corresponding to a probability of about 42%
> plogis(-0.3294)
[1] 0.4183866

The log-odds of site fidelity for a female bird who was successful in
breeding is -0.3294 + 1.1988, corresponding to a probability of about 70%
> plogis(-0.3294 + 1.1988)
[1] 0.7046208
*
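
The same recipe extends to the other two combinations (male birds). A minimal sketch, with the coefficients copied from summary(fm) and a hypothetical helper predict_prob(); the probabilities are for a bird whose random effect is zero:

```r
# Fixed-effect estimates copied from summary(fm)
coefs <- c("(Intercept)" = -0.3294, BreedSuc11 = 1.1988, SexM = 0.2215)

# Hypothetical helper: add up the coefficients that apply, then back-transform
predict_prob <- function(breed_suc, male) {
  eta <- coefs[["(Intercept)"]] +
    coefs[["BreedSuc11"]] * breed_suc +
    coefs[["SexM"]] * male
  plogis(eta)   # inverse logit: log-odds -> probability
}

# All four combinations of breeding success (0/1) and sex (0 = F, 1 = M)
grid <- expand.grid(breed_suc = 0:1, male = 0:1)
grid$prob <- predict_prob(grid$breed_suc, grid$male)
print(grid)
```

The first two rows reproduce the 42% and 70% quoted above. The male rows add the SexM coefficient to the log-odds before applying plogis().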

*5. The z-value: Sorry, I still have some trouble with this value....*

In model fm1 (without "Sex") the z-value of BreedSuc1 is 1.998.

In model fm ("Sex" included) the z-value is 2.012.

Nearly the same value in both models. But what can one conclude from
"p < 0.05, z = 1.998"? This is what many people write in their results
sections (I was told to do the same...).

*"You should see a LRT test statistic close to, but not exactly the same as,
the square of the z value when you compare the models with and without that
term.
This is the sense in which the z-value is an approximation".*

I don't really understand how the z-value can be seen as an approximation?
Am I missing some background knowledge here?
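
To put numbers on the quoted statement, here is a small sketch using only values already printed above for the model without Sex: the BreedSuc11 row of summary(fm1) and the Chisq from anova(fm2, fm1). The variable names are my own:

```r
# Wald z: estimate divided by its standard error (BreedSuc11 row of fm1)
estimate <- 1.1831
std_err  <- 0.5921
z_wald   <- estimate / std_err              # about 1.998
p_wald   <- 2 * pnorm(-abs(z_wald))         # two-sided normal p-value

# Likelihood ratio statistic from anova(fm2, fm1)
lrt   <- 4.0991
p_lrt <- pchisq(lrt, df = 1, lower.tail = FALSE)

# A chi-squared variable with 1 df is a squared standard normal, so
# z^2 and the LRT statistic are on the same scale and can be compared
print(c(z_squared = z_wald^2, lrt = lrt, p_wald = p_wald, p_lrt = p_lrt))
```

The squared z-value (about 3.99) and the LRT statistic (4.10) are close but not identical, and so are the two p-values (0.0457 and 0.0429). Both test the same null hypothesis, that the BreedSuc1 coefficient is zero. The z-value does it from the single fitted model, the LRT by actually refitting the model without the term.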

Again, thanks heaps for your help!!!!

Julia

2011/2/19 Douglas Bates <bates@stat.wisc.edu>

> Thank you for your questions and for transferring the discussion to
> the R-SIG-Mixed-Models mailing list, as we had discussed.  I have also
> copied the mailing list for a class on mixed-effects models that I am
> teaching.
>
> I particularly appreciate your desire to learn about the model instead
> of just quoting a p-value.  I often lament to my classes that
> statisticians have been far too successful in propagating the idea of
> p-values, to the extent that some researchers believe that is all that
> is needed to learn about an analysis.
>
> On Sat, Feb 19, 2011 at 3:05 AM, Julia Sommerfeld
> <Julia.Sommerfeld@utas.edu.au> wrote:
> > Dear Douglas and list members,
> >
> > Apologies in advance if you might consider my questions as too simple to
> > be asking the godfather of lme4 for an answer...thus, please feel free to
> > ignore my email or to forward it to someone else.
> >
> > I'm a PhD student (Australia/Germany) working on tropical seabirds. Like
> > many of my PhD colleagues, I'm having some difficulties with the analysis
> > of my data using lmer (family=binomial). While some say: What do you care
> > about all the other values as long as you've got a p-value... I do believe
> > that it is essential to understand WHAT I'm doing here and WHAT all these
> > numbers/values mean.
> >
> > I've read the Chapters (lme4 Book Chapters) and publications about the use
> > of lmer and searched the forums - but I don't find a satisfying answer.
> > And I have the feeling that 1. the statistic lecture at my university was
> > a joke (sad to say this) 2. that I need a huge statistical/mathematical
> > background to fully understand GLMMs.
> >
> >
> > One of the questions I would like to answer is:
> > Does the previous breeding success influence nest site fidelity?
> >
> > I have binomial data:
> > SameSite=1 means birds use the same site
> >
> > SameSite=0 means birds change nest site
> >
> > BreedSuc1=1 Birds were successful in previous breeding season
> > BreedSuc1=0 Birds were not successful "     "        "
> >
> > Sex= male, female
> > Bird= Bird ID
> >
> > This is my model:
>
> > fm<-lmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
>
> > where Bird is my random factor (same birds were sampled more than once)
>
> One thing to note is that there are 46 different birds in the 54
> observations.  Most birds will have just one observation so a random
> effect for bird may not be necessary.
>
> > summary(fm)
> >
> > Generalized linear mixed model fit by the Laplace approximation
> >
> > Formula: SameSite ~ BreedSuc1 + Sex + (1 | Bird)
> >   AIC   BIC logLik deviance
> >  77.38 85.34 -34.69    69.38
> > Random effects:
> >  Groups Name        Variance Std.Dev.
> >  Bird   (Intercept) 0.14080  0.37524
> > Number of obs: 54, groups: Bird, 46
> >
> > Fixed effects:
> >            Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
> > SexM          0.2215     0.5877   0.377   0.7062
>
> this suggests that sex is not an important factor in the model.  The
> (Intercept) term is close to zero, relative to its standard error, but
> we would retain it in the model as explained below.
>
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Correlation of Fixed Effects:
> >           (Intr) BrdS11
> > BreedSuc11 -0.536
> > SexM       -0.628  0.065
> >
> >
> > From this summary output I do understand that the Breeding Success has a
> > significant effect on nest-site fidelity (p<0.05).
>
> Yes, but ... this p-value should be used as a guide only.  As
> described below a p-value must be viewed in context.  It is not a
> property of the Breeding Success factor; it comes from a comparison of
> two models and we should bear in mind what these models are before
> interpreting this number.
>
> The interpretation of a p-value for a particular coefficient is that
> it is an approximation to the p-value we would get from comparing the
> model that has been fit to the model fit without this particular
> coefficient.  In this case the coefficient corresponds to one of the
> terms in the model and I would advocate performing a likelihood ratio
> test comparing the two models
>
> fm <- glmer(SameSite~BreedSuc1+Sex+(1|Bird), family="binomial")
> # fm0 is the null hypothesis model (no BreedSuc1 term)
> fm0 <- glmer(SameSite~Sex+(1|Bird), family="binomial")
> anova(fm0, fm)
>
> Even though the function is called anova it will, in this case,
> perform a likelihood ratio test (LRT).  It also prints the values of
> AIC and BIC if you prefer to compare models according to one of those
> criteria but I prefer using the likelihood ratio for nested models.
>
> However, before doing that comparison you should ask yourself whether
> you want to compare models that have the apparently unnecessary term
> for Sex in them.  The way I would approach the model building is first
> to reduce the model to
>
> fm1 <- lmer(SameSite~BreedSuc1+(1|Bird), family="binomial")
>
> You could then compare
>
> anova(fm1, fm)
>
> which I presume will give a large p-value for the LRT, so we prefer
> the simpler model, fm1.  After that, I would compare
>
> fm2 <- lmer(SameSite ~ 1 + (1|Bird), family="binomial")
> anova(fm2, fm1)
>
> to see if the BreedSuc1 factor is an important predictor in its own right.
>
> Note that we don't drop the implicit "(Intercept)" term, even though
> it has a high p-value in the coefficient table.  The reason is that
> the interpretation of the (Intercept) coefficient depends on the
> coding of BreedSuc1.
>
> In model fm, the log-odds of site fidelity for a female bird (i.e.
> SexM is 0) who was unsuccessful in breeding the previous season (i.e.
> BreedSuc1 is 0) is -0.3294, corresponding to a probability of about
> 42%
>
> > plogis(-0.3294)
> [1] 0.4183866
>
> The log-odds of site fidelity for a female bird who was successful in
> breeding is -0.3294 + 1.1988, corresponding to a probability of about
> 70%
>
> > plogis(-0.3294 + 1.1988)
> [1] 0.7046208
>
> If you had reversed the meaning of BreedSuc to BreedFail, where 0
> indicates no failure at breeding and 1 indicates failure, then the
> coefficient would change sign (i.e. the coefficient for BreedFail
> would be -1.1988) and the intercept would change to
>
> > -0.3294 + 1.1988
> [1] 0.8694
>
> because the reference level would now be a female bird who was
> successful in breeding.
>
> Because the interpretation of the intercept depends upon the coding of
> other factors, we retain it in the model whenever other terms are
> retained.
>
>
>
> > But what else can I conclude from this model?
> >
> > Questions:
> >
> > 1. Random effects: What does the Random Effect table - the Variance, Std.
> > Dev. and Intercept - tell me: Is there a random effect that my model has
> > to account for?
>
> First I would remove the apparently unnecessary Sex term then,
> ideally, I would check by comparing the fit of the reduced model to
> that of a GLM without the random effect for Bird.  Unfortunately, I
> don't think the definition of deviance for a glm fit is compatible
> with that for a model fit by glmer.  This is something we will need to
> fix.  For the time being I would instead examine the "caterpillar
> plot" obtained with
>
> dotplot(ranef(fm1, postVar=TRUE))
>
> which represents the 95% prediction intervals for each of the birds.
> If these all overlap zero comfortably I would conclude that the random
> effect is not needed and fit a glm without a random effect for bird.
> > Random effects:
> >  Groups Name        Variance Std.Dev.
> >  Bird   (Intercept) 0.14080  0.37524
> > Number of obs: 54, groups: Bird, 46
>
> That estimated standard deviation is fairly large. We would expect a
> range of contributions on the log-odds scale of about +/- 2 sd which,
> at this point of the logistic curve corresponds to considerable
> variability in predicted probabilities for birds with the same
> characteristics.
>
> > 2. Fixed Effects: Again the Intercept? Not sure if I understand the
> > meaning of it (sorry, explanation in Chapter I also doesn't help much)
>
> Actually in this model it is a bit different from the models described
> in chapter 1.  I hope the explanation above makes sense.  Think of it
> as the log-odds of site fidelity for a bird in the "reference group"
> where reference group means that all the other variables are at the
> zero level.
>
> > Fixed effects:
> >            Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  -0.3294     0.4890  -0.674   0.5006
> > BreedSuc11    1.1988     0.5957   2.012   0.0442 *
> >
> > SexM          0.2215     0.5877   0.377   0.7062
> >
> > 3. Meaning of the z-value? Why shall I mention it in the result section?
>
> I would regard the z-value as an approximation.  The quantity of
> interest is the likelihood ratio test statistic which has a
> chi-squared distribution under the null hypothesis (i.e. the term can
> be deleted from the model without getting a significantly worse fit).
> It happens that this would be a chi-squared distribution with 1 degree
> of freedom, which corresponds to the square of a standard normal
> distribution.  You should see a LRT test statistic close to, but not
> exactly the same as, the square of the z value when you compare the
> models with and without that term.  This is the sense in which the
> z-value is an approximation.  To me the LRT statistic is more reliable
> because it is based upon actually refitting the model.
>
> > 4. Estimate and Std. Error of the fixed effects? How can I tell from these
> > values WHAT kind of effect (positive, negative?) these parameters have on
> > nest-site fidelity? Do birds that were successful during the previous
> > breeding season show a higher nest-site fidelity? Remember, I have
> > binomial data...
>
> That is described above.  If you want the estimate of the site
> fidelity for bird with certain characteristics you evaluate the
> corresponding combination of coefficients and apply plogis to the
> result.
>
> > I would highly appreciate your feedback and/or suggestions of
> > papers/chapters I could read for a better understanding of the output.
> >
> > Best regards,
> >
> >
> > Julia
>
> I hope this helps.
>

-- 
Julia Sommerfeld - PhD Candidate
Institute for Marine and Antarctic Studies
University of Tasmania
Private Bag 129, Hobart
TAS 7001

Phone: +61 458 247 348
Email: julia.somma@gmx.de
Julia.Sommerfeld@utas.edu.au

_______________________________________________
R-sig-mixed-models@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
