List:       oprofile-list
Subject:    Re: Bug in the armv7 code
From:       Jean Pihet <jpihet () mvista ! com>
Date:       2009-12-02 19:24:27
Message-ID: 200912022024.27140.jpihet () mvista ! com

On Wednesday 02 December 2009 19:12:11 Vitali Lovich wrote:
> On Dec 2, 2009, at 9:51 AM, Jean Pihet wrote:
> > Hi,
> >
> > On Wednesday 02 December 2009 18:29:42 Vitali Lovich wrote:
> >> Hey, I noticed a bug in arch/arm/oprofile/op_model_v7.c using
> >> performance counters.
> >>
> >> When the counter value is used, it is set in user space (e.g. 50000 on a
> >> 500 MHz CPU for 10ms sampling).  However, armv7 has this feature (which
> >> is enabled by oprofile) where the CPU counter can be updated at a lower
> >> frequency (in this case once every 64 clock cycles - PMNC_D).  Thus
> >> user space sets the number of clock ticks it wants, but samples
> >> actually arrive 64 times less often on armv7.
> >
> > The PMNC_D bit is set because of a HW bug in Cortex-A8 chip releases
> > r1..r3 (basically all chips currently on the market). The OMAP erratum
> > is #628216. In short, it causes PMNC instability that cannot be worked
> > around in SW. Cf. the thread on the linux-omap mailing list at
> > http://www.mail-archive.com/linux-omap@vger.kernel.org/msg14084.html.
>
> I understand this.  Thanks for the pointer to the errata - I can't seem to
> locate the a8 errata list on my machine at the moment, so I'll have to read
> up on it later.
In short, it says that if cp15 is being accessed at the moment a PMNC
counter overflows, the state of the PMNC unit becomes undefined. In practice
it locks up or resets.

> >> Probably the correct fix would be to have an abstract fixup function
> >> that gets called by the generic oprofile code when it reads the
> >> userspace value (so that any architecture that needs to convert a
> >> userspace value to a hardware-specific value can do so).
> >>
> >> Currently, I have a hack in the reset function (which gets called every
> >> time the performance counter hits the requested sample rate).
> >>
> >> 	if (PMNC_D) {
> >> 		/* Counter ticks once per 64 cycles: scale the reload value. */
> >> 		val = -((-val) / 64);
> >> 	}
> >
> > Although this solves the timing problem from the user space point of
> > view, it increases the risk of instability. The real fix is to unset the
> > PMNC_D bit once the HW is corrected.
> > Unfortunately the only way to have stable PC profiling is to use the
> > timer mode, not the PMNC.
>
> Can you please clarify how this particular change increases the risk of
> instability? If this is a general problem where the PMNC is unusable, then
> shouldn't it be unavailable on buggy hardware, thereby forcing userspace
> to fall back to timers?
It increases the risk because the counters overflow more often.
You are correct that the default option should be to fall back on the timer
mode. To do that, you can disable armv7 oprofile support in the kernel
config.
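
To put numbers on the overflow rate, here is a rough standalone sketch (not
kernel code). It assumes the counter ticks once per 64 CPU cycles while
PMNC_D is set, as described in this thread; the helper names are made up for
illustration.

```c
#include <stdint.h>

/* Assumption from the thread: with PMNC_D set, one tick per 64 CPU cycles. */
#define CCNT_DIVIDER 64u

/* Value loaded into the up-counter so that it wraps after `ticks` ticks. */
static uint32_t reload_value(uint32_t ticks)
{
	return (uint32_t)(0u - ticks);
}

/* User-space count converted to counter ticks, never less than one tick. */
static uint32_t scaled_count(uint32_t user_count)
{
	uint32_t ticks = user_count / CCNT_DIVIDER;

	return ticks ? ticks : 1u;
}

/* CPU cycles that elapse between two overflows for a given tick count. */
static uint64_t cycles_per_overflow(uint32_t ticks)
{
	return (uint64_t)ticks * CCNT_DIVIDER;
}
```

With the count of 50000 used as the example in this thread, the unscaled
counter overflows every cycles_per_overflow(50000) = 3200000 cycles. After
scaling, scaled_count(50000) = 781 ticks, so it overflows every 49984 cycles,
i.e. about 64 times more often.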

> The current behaviour seems wrong regardless; the actual sample rate does
> not match even on hardware that is correct while making buggy hardware seem
> like it works when it doesn't.
You are right! I will submit a patch for that.
Unfortunately there is no non-buggy HW available (AFAIK) to test the 
change ;-(
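
For reference, the abstract fixup function suggested earlier in the thread
could look roughly like the sketch below. This is only an illustration, not a
patch: struct arch_count_ops, fixup_count and hw_count are hypothetical names
that do not exist in oprofile, and the divide by 64 assumes PMNC_D is set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-architecture hook table; not an existing oprofile API. */
struct arch_count_ops {
	/* Convert a user-space count to hardware ticks; NULL if not needed. */
	uint32_t (*fixup_count)(uint32_t user_count);
};

/* Assumed armv7 conversion: one counter tick per 64 cycles with PMNC_D set. */
static uint32_t armv7_fixup_count(uint32_t user_count)
{
	uint32_t ticks = user_count / 64u;

	return ticks ? ticks : 1u;
}

static const struct arch_count_ops armv7_ops = {
	.fixup_count = armv7_fixup_count,
};

/* An architecture that needs no conversion leaves the hook unset. */
static const struct arch_count_ops generic_ops = {
	.fixup_count = NULL,
};

/* What generic code would call once after reading the user-space value. */
static uint32_t hw_count(const struct arch_count_ops *ops, uint32_t user_count)
{
	return ops->fixup_count ? ops->fixup_count(user_count) : user_count;
}
```

Doing the conversion once, when the value is read, would keep the division
out of the reset path that runs on every counter overflow.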

>
> Thanks,
> Vitali

Jean

> ------------------------------------------------------------------------------
> Join us December 9, 2009 for the Red Hat Virtual Experience,
> a free event focused on virtualization and cloud computing.
> Attend in-depth sessions from your desk. Your couch. Anywhere.
> http://p.sf.net/sfu/redhat-sfdev2dev
> _______________________________________________
> oprofile-list mailing list
> oprofile-list@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oprofile-list


