List:       kde-bugs-dist
Subject:    [valgrind] [Bug 79362] Debug info is lost for .so files when they are dlclose'd
From:       Philippe Waroquiers <bugzilla_noreply () kde ! org>
Date:       2017-08-07 19:28:38
Message-ID: bug-79362-17878-cUV5RkKyga () http ! bugs ! kde ! org/

https://bugs.kde.org/show_bug.cgi?id=79362

--- Comment #73 from Philippe Waroquiers <philippe.waroquiers@skynet.be> ---
(In reply to Julian Seward from comment #72)
> (In reply to Philippe Waroquiers from comment #71)
> > Created attachment 107073 [details]
> > (hack) : patch that adds measurement code to scan the EC for a .so unload
> 
> +            for (j = 0; j < ec->n_ips; j++) {
> +               if (UNLIKELY(ec->ips[j] >= from && ec->ips[j] <= to)) {
> +                  break;
> +               }
> +            }
> 
> This is a side-effect-free loop whose only computed value (j) is unused,
> and provably terminates.  I think it's likely that gcc noticed all 3 facts
> and deleted the loop.
Yes, you are correct.
I will attach a new patch which avoids the loop elimination.
With this, the scan is slower: for big applications (260_000 EC,
6_300_000 IP), a scan takes between 0.020 and 0.030 seconds.
So an application that does 1000 load/unload will use roughly
20 to 30 seconds more CPU for the scanning.
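
One way to keep the compiler from deleting such a measurement loop is to make
its result observable, for example by counting the matching EC and printing
the count afterwards. The sketch below illustrates that idea only; the
ExeCtx type and count_matching_ecs are hypothetical stand-ins, not the
actual Valgrind structures or the code of the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an execution context (EC);
   this is not Valgrind's real ExeContext type. */
typedef struct {
    size_t     n_ips;   /* number of instruction pointers in the stack trace */
    uintptr_t *ips;     /* the instruction pointers themselves */
} ExeCtx;

/* Count the EC that have at least one IP inside [from, to].
   Because the count is returned (and later used by the caller), the
   compiler can no longer treat the inner loop as dead code. */
static size_t count_matching_ecs(const ExeCtx *ecs, size_t n_ecs,
                                 uintptr_t from, uintptr_t to)
{
    size_t matches = 0;
    for (size_t i = 0; i < n_ecs; i++) {
        for (size_t j = 0; j < ecs[i].n_ips; j++) {
            if (ecs[i].ips[j] >= from && ecs[i].ips[j] <= to) {
                matches++;   /* observable side effect of the scan */
                break;       /* one hit is enough for this EC */
            }
        }
    }
    return matches;
}
```

The caller would then print or otherwise consume the returned count, so that
the whole scan contributes to the program's visible behaviour.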

Assuming these measurements are now ok, I think this still looks acceptable,
as:
  * a large majority of the users will not use this feature, so we
    had better reduce the impact of this feature when it is not used
    (typically in memory)
  * for users of the functionality, we had better make
    --track-origins functionally correct
  * probably not many applications are doing load/unload at a
    high frequency.

If we then find an application that suffers heavily from this scanning,
we can always implement a 'lazy scan': instead of scanning
all EC when a lib is unloaded, we just scan an EC when
that EC is symbolised (or when it is 're-acquired' following the capture
of a new stack trace).
The logic of this 'lazy scanning' will be:
   if EC epoch is not the current epoch then
      scan all DI that were unloaded between EC epoch
       and current epoch.
      if an address of the EC matches one of these DI,
         mark the EC as archived (i.e. it should not
             be used for something else than output)
      else
         change EC epoch to be current epoch
With this, very few EC should have to be verified :
   only the 'active EC' and the EC related to an error.
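
The lazy-scan logic above could be sketched in C roughly as follows. All
names here (UnloadedDI, LazyEC, lazy_check_ec) are hypothetical and only
illustrate the idea; the sketch also assumes that each unload bumps a global
epoch counter and records the new value with the unloaded DI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of an unloaded debug info (DI). */
typedef struct {
    uintptr_t from, to;   /* address range the DI covered */
    unsigned  epoch;      /* assumed: epoch value set by this unload */
} UnloadedDI;

/* Hypothetical execution context (EC) carrying an epoch. */
typedef struct {
    size_t     n_ips;
    uintptr_t *ips;
    unsigned   epoch;     /* epoch at which the EC was last verified */
    bool       archived;  /* true: only to be used for output */
} LazyEC;

/* Verify one EC lazily, e.g. when it is symbolised or re-acquired. */
static void lazy_check_ec(LazyEC *ec, const UnloadedDI *dis, size_t n_dis,
                          unsigned current_epoch)
{
    if (ec->archived || ec->epoch == current_epoch)
        return;   /* already archived, or nothing unloaded since last check */

    for (size_t d = 0; d < n_dis; d++) {
        /* Only DI unloaded between the EC epoch and the current epoch. */
        if (dis[d].epoch <= ec->epoch || dis[d].epoch > current_epoch)
            continue;
        for (size_t j = 0; j < ec->n_ips; j++) {
            if (ec->ips[j] >= dis[d].from && ec->ips[j] <= dis[d].to) {
                ec->archived = true;   /* keep its old epoch for output */
                return;
            }
        }
    }
    ec->epoch = current_epoch;   /* no match: EC is valid in current epoch */
}
```

With this shape, the cost of an unload is shifted to the few EC that are
actually looked at afterwards, at the price of keeping a list of unloaded DI.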

But I think we had better first do the simple approach of
scanning, and if a user of --keep-debuginfo complains
about performance, optimise by doing the lazy scanning.


--00:00:00:01.288 9148--    exectx: scanning 1000 times 515 contexts/5,133 ips
--00:00:00:01.294 9148--    exectx: finished scanning. Match 4 scanned 5,133,000  515 contexts/5,133 ips
--00:00:00:01.295 9151--    exectx: scanning 1000 times 1,027 contexts/11,277 ips
--00:00:00:01.309 9151--    exectx: finished scanning. Match 4 scanned 11,277,000  1,027 contexts/11,277 ips
--00:00:00:01.308 9154--    exectx: scanning 1000 times 2,051 contexts/24,589 ips
--00:00:00:01.345 9154--    exectx: finished scanning. Match 4 scanned 24,589,000  2,051 contexts/24,589 ips
--00:00:00:01.315 9157--    exectx: scanning 1000 times 4,099 contexts/53,261 ips
--00:00:00:01.417 9157--    exectx: finished scanning. Match 4 scanned 53,261,000  4,099 contexts/53,261 ips
--00:00:00:01.331 9160--    exectx: scanning 1000 times 8,195 contexts/114,701 ips
--00:00:00:01.576 9160--    exectx: finished scanning. Match 4 scanned 114,701,000  8,195 contexts/114,701 ips
--00:00:00:01.346 9163--    exectx: scanning 1000 times 16,387 contexts/245,773 ips
--00:00:00:01.860 9163--    exectx: finished scanning. Match 4 scanned 245,773,000  16,387 contexts/245,773 ips
--00:00:00:01.381 9166--    exectx: scanning 1000 times 32,771 contexts/524,301 ips
--00:00:00:02.576 9166--    exectx: finished scanning. Match 4 scanned 524,301,000  32,771 contexts/524,301 ips
--00:00:00:01.441 9169--    exectx: scanning 1000 times 65,539 contexts/1,114,125 ips
--00:00:00:05.809 9169--    exectx: finished scanning. Match 4 scanned 1,114,125,000  65,539 contexts/1,114,125 ips
--00:00:00:01.559 9172--    exectx: scanning 1000 times 131,076 contexts/2,359,327 ips
--00:00:00:15.639 9172--    exectx: finished scanning. Match 4 scanned 2,359,327,000  131,076 contexts/2,359,327 ips
--00:00:00:01.777 9175--    exectx: scanning 1000 times 262,149 contexts/4,980,787 ips
--00:00:00:28.796 9175--    exectx: finished scanning. Match 4 scanned 4,980,787,000  262,149 contexts/4,980,787 ips
--00:00:00:01.792 9178--    exectx: scanning 1000 times 262,149 contexts/5,242,933 ips
--00:00:00:32.021 9178--    exectx: finished scanning. Match 4 scanned 5,242,933,000  262,149 contexts/5,242,933 ips
--00:00:00:01.821 9182--    exectx: scanning 1000 times 262,149 contexts/6,291,514 ips
--00:00:00:21.031 9182--    exectx: finished scanning. Match 524296 scanned 6,291,514,000  262,149 contexts/6,291,514 ips


Note that I am slightly amazed that the last run is faster than the two
previous ones, and I cannot explain it. I checked, and the matches are all
caused by the 'last ip' below main, which has a small value. So the last
run scans the same number of EC, but about 1_000_000 more IPs (1000 times),
and is still faster ???

-- 
You are receiving this mail because:
You are watching all bug changes.