List:       oprofile-list
Subject:    Re: Can oprofile collect call stack information?
From:       David Smith <dsmith () algonet ! se>
Date:       2002-07-10 11:00:03

John Levon wrote:
>
> On Wed, Jul 10, 2002 at 12:48:44PM +0200, David wrote:
>
> > Do you have any plans to add profiling of functions in the call
> > stack for things like "total time in function + children" or call
> > graph information?
> 
> If you have an efficient way of storing call chains, and a reasonably
> robust way of getting this working in a system where some code may be
> missing frame pointers, please tell ...
> 

I'm afraid I don't. But then, I haven't written an advanced profiler
myself... I thought you might :-)

On the other hand, if it *can* be made to work by compiling relevant
parts of the system with frame pointers, then it might still be usable
if there is a reliable and quick way to assess the quality of the
backtraces. Then the program could give a message like "Sorry, 59% of
backtraces failed, no call graphs for you!" if it doesn't work.
But I'm speculating... I don't really know how to check that.
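
A minimal sketch of the failure-rate check speculated about above, in
plain user-space C. It assumes each backtrace attempt has already been
classified as good or bad by some other means; how to do that
classification reliably is exactly the open question. The names
bt_stats, bt_record, bt_failed_percent and bt_usable are hypothetical
and are not part of oprofile.

```c
#include <stdio.h>

/* Hypothetical bookkeeping; none of these names exist in oprofile. */
struct bt_stats {
    unsigned long total;   /* backtraces attempted */
    unsigned long failed;  /* backtraces judged to be broken */
};

/* Record the outcome of one backtrace attempt (ok != 0 means success). */
void bt_record(struct bt_stats *s, int ok)
{
    s->total++;
    if (!ok)
        s->failed++;
}

/* Percentage of failed backtraces, rounded down; 0 if none were attempted.
   Assumes failed * 100 does not overflow unsigned long. */
unsigned int bt_failed_percent(const struct bt_stats *s)
{
    if (s->total == 0)
        return 0;
    return (unsigned int)((s->failed * 100UL) / s->total);
}

/* Return 1 if the failure rate is within the limit; otherwise print
   the warning and return 0. */
int bt_usable(const struct bt_stats *s, unsigned int max_failed_percent)
{
    unsigned int pct = bt_failed_percent(s);

    if (pct > max_failed_percent) {
        printf("Sorry, %u%% of backtraces failed, no call graphs for you!\n",
               pct);
        return 0;
    }
    return 1;
}
```

The threshold is left as a parameter because an acceptable failure
rate is a judgment call, not something the counters can decide.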

> > I tried doing this for myself, and managed a small hack for the
> > "total time..." feature. It _seemed_ to work well (using rtc
> > and a whole system compiled with frame pointers).
> 
> What changes did you make ?
> 

Well, hope you don't expect anything special... this is about it. And 
it absolutely requires frame pointers! In module.c:

  void regparm3 op_do_profile_0(uint cpu, struct pt_regs *regs, int ctr) 
  {
      /* this is the original op_do_profile */
  }

  struct frame {
      unsigned long ebp; /* next frame */
      unsigned long ret; /* return address */
  };

  /*  new op_do_profile with backtrace  */
  void regparm3 op_do_profile(uint cpu, struct pt_regs *regs, int ctr)
  {
      unsigned long old_eip = regs->eip;
      unsigned long old_ebp = regs->ebp;
      struct frame *frame = (struct frame *)regs->ebp;

      /* time in the function itself = ctr 0 */
      /* time in the function + children = ctr 1 */
      op_do_profile_0(cpu, regs, 0);
      op_do_profile_0(cpu, regs, 1);

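      /* walk the frame-pointer chain, charging each caller to ctr 1 */
      while ( frame ) {
          if ( get_user(regs->eip, &(frame->ret)) )
              break;
          if ( get_user(regs->ebp, &(frame->ebp)) )
              break;
          op_do_profile_0(cpu, regs, 1);
          /* use the copy fetched by get_user; don't dereference
             the user-space pointer directly */
          frame = (struct frame *)regs->ebp;
      }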
      regs->eip = old_eip;
      regs->ebp = old_ebp;
  }
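
For illustration only, here is a user-space sketch of the same
frame-pointer walk with a few sanity checks added. It is not the
module code above and not oprofile's method. It assumes a
downward-growing stack (as on x86), so each caller's frame must sit
at a higher address than its callee's, and it assumes the outermost
frame is marked by a zero frame pointer. walk_frames, MAX_DEPTH and
the lo/hi stack bounds are hypothetical names introduced for the
example.

```c
#include <stddef.h>
#include <stdint.h>

struct frame {
    uintptr_t next; /* saved frame pointer of the caller */
    uintptr_t ret;  /* return address */
};

enum walk_result { WALK_OK, WALK_BAD };

#define MAX_DEPTH 64 /* arbitrary cap so a corrupt chain cannot loop forever */

/*
 * Follow the chain starting at fp, accepting only frames that lie in
 * [lo, hi), are pointer-aligned, and move strictly towards higher
 * addresses. Return addresses are stored in out[] (room for MAX_DEPTH
 * entries) and the number stored is written to *depth. Assumes
 * hi - lo >= sizeof(struct frame).
 */
enum walk_result walk_frames(uintptr_t fp, uintptr_t lo, uintptr_t hi,
                             uintptr_t *out, size_t *depth)
{
    enum walk_result res = WALK_OK;
    size_t n = 0;

    while (fp != 0) {
        const struct frame *f;

        if (n >= MAX_DEPTH ||
            fp < lo || fp > hi - sizeof(struct frame) ||
            fp % sizeof(uintptr_t) != 0) {
            res = WALK_BAD;
            break;
        }
        f = (const struct frame *)fp;
        out[n++] = f->ret;

        /* the caller's frame must be above this one, or the chain ends */
        if (f->next != 0 && f->next <= fp) {
            res = WALK_BAD;
            break;
        }
        fp = f->next;
    }
    *depth = n;
    return res;
}
```

A walk that returns WALK_BAD could be counted as a failed backtrace;
a walk that returns WALK_OK only shows that the chain was
well-formed, not that every return address in it is meaningful.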

Nothing surprising there. What *did* surprise me was that when I ran a
small test program the results were correct. :) Then I tried profiling
a "real" program (KDE/konqueror) and, while I can't really check those
numbers, they still looked reasonable.
So my thought was, could it really be this easy? I don't know, I'm very
much an amateur in this area. Is it possible that this could be turned 
into something usable?

/David