[prev in list] [next in list] [prev in thread] [next in thread] 

List:       oprofile-list
Subject:    [ oprofile-Bugs-1685267 ] (callgraph)opreport error: parse_filename()
From:       "SourceForge.net" <noreply () sourceforge ! net>
Date:       2007-04-26 19:56:12
Message-ID: E1HhA4a-000644-0b () sc8-sf-web21 ! sourceforge ! net
[Download RAW message or body]

Bugs item #1685267, was opened at 2007-03-21 11:07
Message generated for change (Comment added) made by rutt
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=116191&aid=1685267&group_id=16191

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: Accepted
Priority: 9
Private: No
Submitted By: Sami Farin (safari)
Assigned to: Nobody/Anonymous (nobody)
Summary: (callgraph)opreport error: parse_filename() invalid filename

Initial Comment:
when using callgraph feature, opreport fails with the following words:

# opreport -c -p /lib/modules/2.6.19.7-exec-shield-1/kernel/ -l -g /usr/bin/Xorg
opreport error: parse_filename() invalid filename [5]: \
/var/lib/oprofile/samples/current/{root}/var/lib/oprofile/samples/current/{root}/usr/b \
in/Xorg/{dep}/{anon}/12410.0xeb2000.0xeb3000/GLOBAL_POWER_EVENTS.45000.1.all.all.all/{ \
dep}/{root}/var/lib/oprofile/samples/current/{root}/usr/bin/Xorg/{dep}/{anon}/12410.0x \
eb2000.0xeb3000/GLOBAL_POWER_EVENTS.45000.1.all.all.all/{cg}/{root}/usr/local/bin/oprofiled/GLOBAL_POWER_EVENTS.45000.1.all.all.all


# opreport -v
opreport: oprofile 0.9.2 compiled on Mar 18 2007 13:06:58

I added [5] into parse_filename() to find out which of them failed.  [5] is the \
latest one (skip "{cg}").


----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-26 15:56

Message:
Logged In: YES 
user_id=113232
Originator: NO

I spoke too soon.  Can't figure out how to append a file.  So here is the
patch:

diff -dur /u/jruttenberg/tmp/junk/oprofile-0.9.2/libpp/profile_spec.cpp
./libpp/profile_spec.cpp
---
/u/jruttenberg/tmp/junk/oprofile-0.9.2/libpp/profile_spec.cpp	2005-08-07
07:15:51.000000000 -0400
+++ ./libpp/profile_spec.cpp	2007-04-26 11:55:08.234052000 -0400
@@ -357,14 +357,21 @@
 	if (!is_prefix(sub, "/{root}/") && !is_prefix(sub, "/{kern}/"))
 		return false;
 
-	filename_spec file_spec(filename);
-	if (spec.match(file_spec)) {
-		if (exclude_dependent && file_spec.is_dependent())
-			return false;
-		return true;
+	try 
+	{
+		filename_spec file_spec(filename);
+		if (spec.match(file_spec)) {
+			if (exclude_dependent && file_spec.is_dependent())
+				return false;
+			return true;
+		}
+		return false;
+	}
+	catch(invalid_argument arg)
+	{
+		cerr << "Caught invalid argument for" << filename << std::endl;
+		return false;
 	}
-
-	return false;
 }
 
 }  // anonymous namespace


----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-26 15:51

Message:
Logged In: YES 
user_id=113232
Originator: NO

I'm appending a patch which is a not very perfect workaround which at
least allows me to print a call graph profile for the problems case.  This
is actually the first
C++ program I've ever worked with were I was really glad for the use of
exceptions as an error mechanism.  I just caught the one that
parse_filename throws at the appropriate place.
Since this file was likely garbage, it just proceeded and made what looks
like a reasonable profile.

It would still be good to track down the kernel problem which is
responsible for the bad cookies, but this seems like not a terrible filter.
 Another approach would be to filter
these in oprofiled at the point where the bad cookies are returned from
the kernel.  If they point into the oprofile sample directory and have the
{...} components, we know they
can't be right.

----------------------------------------------------------------------

Comment By: Jason Kraftcheck (kraftcheck)
Date: 2007-04-26 15:04

Message:
Logged In: YES 
user_id=665142
Originator: NO

I'm encountering this bug also.
  
I am using an x86_86 build of oprofiled to profile x86_64 
executables and am not using NFS for anything.  

Also, I noticed that the time required before these invalid 
paths start showing up in the samples directory increases by 
a lot if oprofiled is compiled w/out optimization.



----------------------------------------------------------------------

Comment By: Sami Farin (safari)
Date: 2007-04-26 11:38

Message:
Logged In: YES 
user_id=13022
Originator: YES

No NFS involved.

When I debug some program/library, they get compiled with -ggdb3
-fno-omit-frame-pointer -O0.
Normally I compile with -O2 -g0 -fomit-frame-pointer ... .


----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-26 11:25

Message:
Logged In: YES 
user_id=113232
Originator: NO

Phil, actually run was of a program unrelated to oprofile or the daemon
itself.  The fact that "oprofiled" ended up in the cg field is the result
of some sort of bug in either the kernel or oprofiled.  (See below.)

There are a few facts which I realized might be relevant:

  1. The OS is x86-64, but the programs involved are 32 bit.
  2. Some of the programs and/or their libraries might reside on NFS
mounted filesystems.

I did some more debugging work. I added logging for create_cookie like
this:

fedora-1:daemon$ diff -du opd_cookie.c ~/tmp/oprofile-0.9.2/daemon
--- opd_cookie.c	2005-10-18 19:28:26.000000000 -0400
+++ /u/jruttenberg/tmp/oprofile-0.9.2/daemon/opd_cookie.c	2007-04-26
11:15:37.211232000 -0400
@@ -114,6 +114,8 @@
 		entry->ignored = 0;
 	} else {
 		entry->ignored = is_image_ignored(entry->name);
+                if (! entry->ignored)
+                  printf("cc: dcookie: entry->name %s\n", entry->name);
 	}
 
 	return entry;

and this ended up in the oprofiled.log:

cc: dcookie: entry->name
/var/lib/oprofile/samples/current/{root}/var/lib/oprofile/samples/current/{root}/var/t \
mp/install/bin/tile-sim/{dep}/{root}/var/tmp/install/bin/tile-sim/GLOBAL_POWER_EVENTS. \
100000.1.all.all.all/{dep}/{root}/var/lib/oprofile/samples/current/{root}/var/tmp/inst \
all/bin/tile-sim/{dep}/{root}/var/tmp/install/bin/tile-sim/GLOBAL_POWER_EVENTS.100000. \
1.all.all.all/{cg}/{root}/usr/local/bin/oprofiled/GLOBAL_POWER_EVENTS.100000.1.all.all.all



which seems quite wrong.

I'm liking the idea about the frame pointer.  All our code is compiled
with -g, but perhaps some of the libraries are not (they are mostly
stripped in any case.)

----------------------------------------------------------------------

Comment By: Philippe Elie (phil_e)
Date: 2007-04-25 22:47

Message:
Logged In: YES 
user_id=318973
Originator: NO

Rutt, I'm unsure about what you're talking, perhaps a different problem,
can you reproduce Safari's problem with a sample file unrelated to the
damon itself ?

For the problem reportd by Safari I suspect than some binary used by the
daemon (shared libs) are compiled w/o frame pointer. During the stack trace
the driver return some random invalid EIP. Some of these EIP point into one
sample file (sample file are mmaped by the daemon), so the name of a sample
files is used as a compoment of the new sample file which completly break
the name demangling. Look how the sample filename
/var/lib/oprofile/samples/current/{root}/usr/bin/Xorg/... is repeated
inside the complete sample filename.

I increased the priority of this bug but I've no idea how to fix it in a
clean way.

----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-25 20:49

Message:
Logged In: YES 
user_id=113232
Originator: NO

Still a little more work and I'm starting to form a theory.  Safari, was
there any NFS involved in your esample?  Where any of the programs you ran
or libraries they used on NFS file systems?


----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-25 12:54

Message:
Logged In: YES 
user_id=113232
Originator: NO

Perhaps someone isn't null terminating?

----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-25 12:53

Message:
Logged In: YES 
user_id=113232
Originator: NO

Added some logging to op_mangle_filename to see what the daemon was
thinking when it made these files.

In particular, I had it print out the three image, dep, and cg fields of
the mangle_values input.  Here is an example I got:

mf: image_name
/var/lib/oprofile/samples/current/{root}/home/scratch/jruttenberg/main2/install/bin/ti \
le-sim/{dep}/{anon}/25234.0xf7d04000.0xf7da8000/{cg}/{root}/home/scratch/jruttenberg/main2/install/bin/tile-sim/GLOBAL_POWER_EVENTS.100000.1.all.all.all
                
mf: dep_name
/var/lib/oprofile/samples/current/{root}/home/scratch/jruttenberg/main2/install/bin/ti \
le-sim/{dep}/{anon}/25234.0xf7d04000.0xf7da8000/{cg}/{root}/home/scratch/jruttenberg/main2/install/bin/tile-sim/GLOBAL_POWER_EVENTS.100000.1.all.all.all
                
mf: cg_image_name /usr/local/bin/oprofiled


How can this be right?

----------------------------------------------------------------------

Comment By: John Ruttenberg (rutt)
Date: 2007-04-25 12:21

Message:
Logged In: YES 
user_id=113232
Originator: NO

I've done a little work on this and desire a little guidance.

It seems that the given filename really is invalid (at least according to
parse_filename.
If I'm reading it correctly the filename must be in three parts:

1. The base spec which should be the path to the samples directory.  It is
terminated by {root} or {kern}.
2. The image filename, which I think should name the actual program which
was invoked.  It is terminated by {dep} UNLESS it starts with {anon} in
which case it's just one component.
3. The dependent filename, whatever that is.  It is terminated by {cg}
4. A call graph filename

Here we have:
1. Base = /var/lib/oprofile/samples/current
2. Image = /var/lib/oprofile/samples/current/{root}/usr/bin/Xorg
3. Dep =
{anon}/12410.0xeb2000.0xeb3000/GLOBAL_POWER_EVENTS.45000.1.all.all.all

But now instead of finding {cg} as we might expect, we have a new {dep}
filename.  This seems wrong.

So I looked at the daemon to try to figure out what's going on there.  The
function op_mangle_filename is getting this kind of mess passed to it in
its mangle_values parameter.

I'll investigate further, but come guidance is appreciated.  There are
references in the sources like PP:1 which seem like formal specs for these
files.  Where is this document?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=116191&aid=1685267&group_id=16191

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
oprofile-list mailing list
oprofile-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oprofile-list


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic