List: oprofile-list
Subject: Re: Fwd: no sample when profiling ARM Cortex-A9 with Linux kernel 3.3
From: Maynard Johnson <maynardj () us ! ibm ! com>
Date: 2013-04-29 14:08:48
Message-ID: 517E7EF0.6040405 () us ! ibm ! com
On 04/27/2013 03:44 AM, RocChen wrote:
> Thank you for your enthusiastic guidance, sir:
>
> I get good profiling results with the command list below, but there is another
> issue, described further down, that I cannot understand.
> Command List:
> > rm -rf /var/lib/oprofile
> > rm -rf /root/.oprofile
> > opcontrol --init
> > opcontrol --no-vmlinux
> > opcontrol --setup --event=CPU_CYCLES:10000 --separate=lib,kernel
> > opcontrol --start --image=all
> > ./array
> > ./../mpeg2dec/oprofile_results/mpeg2dec -b ../mpeg2dec/input_base/input_base_4CIF_96bps.mpg -o3 output_base_4CIF_96bps_%03d
> > opcontrol --dump
> > opreport -l array
> > opreport -l ./../mpeg2dec/oprofile_results/mpeg2dec
>
> Nice Results:
> [root]$ opreport -l array
> Using /var/lib/oprofile/samples/ for samples directory.
> warning: /no-vmlinux could not be found.
> CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
> Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
> samples  %        image name    symbol name
> 70547    95.8858  array         slow_multiply
> 1181     1.6052   array         fast_multiply
> 983      1.3361   array         main
> 828      1.1254   no-vmlinux    /no-vmlinux
> 29       0.0394   ld-2.13.so    /lib/arm-linux-gnueabi/ld-2.13.so
> 6        0.0082   libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so
> [root]$ opreport -l mpeg2decode
> Using /var/lib/oprofile/samples/ for samples directory.
> warning: /no-vmlinux could not be found.
> CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
> Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
> samples  %        image name   symbol name
> 23899    16.9694  mpeg2decode  conv420to422
> 23648    16.7912  mpeg2decode  store_ppm_tga
> 16695    11.8542  mpeg2decode  conv422to444
> 16072    11.4119  mpeg2decode  Decode_Picture
> 15934    11.3139  mpeg2decode  Fast_IDCT
> 15133    10.7451  no-vmlinux   /no-vmlinux
> 14614    10.3766  mpeg2decode  putbyte
> 9260     6.5750   mpeg2decode  form_component_prediction
> 1631     1.1581   mpeg2decode  Flush_Buffer
> 825      0.5858   mpeg2decode  Decode_MPEG2_Intra_Block
> 481      0.3415   mpeg2decode  form_prediction.constprop.0
> 415      0.2947   mpeg2decode  Decode_MPEG2_Non_Intra_Block
> 304      0.2159   mpeg2decode  Get_Bits
> 200      0.1420   mpeg2decode  Show_Bits
> 195      0.1385   mpeg2decode  macroblock_modes
> ........
>
> (*) However, there is an issue that I can hardly understand: after I repeat the
> command list above several times (same event, same sample rate, exactly the same
> command sequence), it may give different results. The total sample count is then
> only a few tens, with no samples for user application functions.
From the sequence of commands listed above, I don't see that you ever issue any
commands to stop, shutdown or deinit oprofile. If you simply leave oprofile
running and issue another 'opcontrol --start' command, that should basically be a
NOP, so I'm not aware of any oprofile userspace issues with what you're doing.
Have you tried any of the '--stop', '--shutdown', or '--deinit' options between
profiling runs?
-Maynard
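
The suggestion above (end each run explicitly before starting the next) could be tried with something like the following sketch. It reuses the commands from the command list in this thread and adds 'opcontrol --shutdown' at the end; the helper names run and profile_once and the DRY_RUN switch are made up for illustration. Whether this avoids the "bad" results is not known; it only makes each run start from a stopped daemon.

```shell
#!/bin/sh
# Sketch only: one profiling run that always ends with a shutdown.
# DRY_RUN defaults to 1, so the script just prints the commands; set
# DRY_RUN=0 on the target board to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# profile_once WORKLOAD [ARGS...]  (hypothetical helper name)
profile_once() {
    run opcontrol --init
    run opcontrol --no-vmlinux
    run opcontrol --setup --event=CPU_CYCLES:10000 --separate=lib,kernel
    run opcontrol --start --image=all
    run "$@"                  # the program being profiled
    run opcontrol --dump
    run opcontrol --shutdown  # stop collection and the daemon before the next run
}

profile_once ./array
```

Replacing '--shutdown' with '--stop' or '--deinit' in the last step gives the other two variants mentioned above.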
>
> I would expect the profiling results to be at least similar to the good results above.
>
> Bad results:
> [root]$ opreport -l array
> Using /var/lib/oprofile/samples/ for samples directory.
> warning: /no-vmlinux could not be found.
> CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
> Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
> samples  %        image name    symbol name
> 63       92.6471  no-vmlinux    /no-vmlinux
> 5        7.3529   libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so
> (0)-(linaro-chenp)-[Sat Apr 27][15:06:36]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/workspace]
> [root]$ opreport -l mpeg2decode
> Using /var/lib/oprofile/samples/ for samples directory.
> warning: /no-vmlinux could not be found.
> CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
> Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
> samples  %        image name    symbol name
> 69       95.8333  no-vmlinux    /no-vmlinux
> 3        4.1667   libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so
> Besides, once this happens, every following profiling run with oprofile gives
> the same kind of result. I can get the good profiling results again only after I
> reboot the system. I ran about a dozen experiments. The number of repetitions of
> the above command sequence after which the results turn "bad" (only tens of
> kernel samples) is not regular. It just suddenly ends up in this state after
> several profiling runs.
> I hope I have described the problem clearly.
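
One way to tell a "good" run from a "bad" one without reading the whole report is to total the samples attributed to the application image. The sketch below assumes the four-column 'opreport -l' layout shown in this thread (samples, %, image name, symbol name); the function name image_samples is made up for illustration, and the rows are copied from the two "array" reports above.

```shell
#!/bin/sh
# image_samples IMAGE: read `opreport -l` output on stdin and print the
# total number of samples attributed to the given image name.
image_samples() {
    awk -v image="$1" '
        # Data rows start with an integer sample count; field 3 is the image name.
        $1 ~ /^[0-9]+$/ && $3 == image { total += $1 }
        END { print total + 0 }
    '
}

# Rows copied from the good and the bad "array" reports quoted in this thread.
good_report='70547 95.8858 array slow_multiply
1181 1.6052 array fast_multiply
983 1.3361 array main
828 1.1254 no-vmlinux /no-vmlinux'

bad_report='63 92.6471 no-vmlinux /no-vmlinux
5 7.3529 libc-2.13.so /lib/arm-linux-gnueabi/libc-2.13.so'

printf '%s\n' "$good_report" | image_samples array   # prints 72711
printf '%s\n' "$bad_report"  | image_samples array   # prints 0
```

On the target the same function could be fed directly, for example with 'opreport -l array | image_samples array'; a result of 0 would indicate the failing state described above.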
>
> Regards
>
>
> On Sat, Apr 27, 2013 at 12:07 AM, Maynard Johnson <maynardj@us.ibm.com> wrote:
> On 04/26/2013 09:25 AM, RocChen wrote:
> >
> > Very sorry for forgetting to cc the mailing list.
> >
> >
> > ---------- Forwarded message ----------
> > From: *RocChen* <singleroc@gmail.com>
> > Date: Fri, Apr 26, 2013 at 10:14 PM
> > Subject: Re: no sample when profiling ARM Cortex-A9 with Linux kernel 3.3
> > To: Koteswararao Nelakurthi <knelakurthi@mvista.com>
> >
> > Hi, Koteswararao
> >
> > Thanks for the quick reply.
> >
> > Here is my profiling procedure (opcontrol: oprofile 0.9.7 compiled on Apr 26 2013 08:47:51):
> > [root]$ rm -rf /var/lib/oprofile/
> > (0)-(linaro)-[Fri Apr 26][22:03:05]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ rm -rf ~/.oprofile/
> > (0)-(linaro)-[Fri Apr 26][22:03:13]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --init
> > (0)-(linaro)-[Fri Apr 26][22:03:29]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --setup --event=CPU_CYCLES:1000 --separate=all --no-vmlinux
>
> The '--separate=all' option categorizes your samples by kernel, library, thread, \
> and CPU. The 'thread' and 'cpu' categorization is rarely needed and just leads to \
> confusion when trying to generate reports. Use '--separate=lib,kernel' instead.
> CPU_CYCLES:1000 is a very high sampling rate, which is undoubtedly why you get the \
> "WARNING! The OProfile kernel driver reports sample buffer overflows" message. I \
> recommend a count of 100000 (or maybe even higher) versus 1000.
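
As a rough illustration of why CPU_CYCLES:1000 is such a high rate: a sample is taken every COUNT cycles, so a fully busy core produces about (clock rate / COUNT) samples per second. The sketch below takes the "speed 1998 MHz (estimated)" figure that opreport prints elsewhere in this thread at face value; the function name samples_per_second is made up for illustration.

```shell
#!/bin/sh
# samples_per_second CLOCK_MHZ COUNT
# Upper bound on samples per second for one fully busy core: one sample is
# taken every COUNT cycles, and the core runs CLOCK_MHZ * 1,000,000 cycles/s.
samples_per_second() {
    echo $(( $1 * 1000000 / $2 ))
}

# 1998 MHz is the estimate reported by opreport in this thread.
for count in 1000 10000 100000; do
    printf 'CPU_CYCLES:%-7s -> about %s samples/s\n' \
        "$count" "$(samples_per_second 1998 "$count")"
done
```

Under these assumptions a count of 1000 works out to roughly two million samples per second, against roughly twenty thousand for a count of 100000.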
> > (0)-(linaro)-[Fri Apr 26][22:04:20]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --start --image=../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode
>
> For starters, don't use the '--image' option. Please revert that with
> 'opcontrol --image=all' and try again.
> -Maynard
> > Using 2.6+ OProfile kernel interface.
> > Using log file /var/lib/oprofile/samples/oprofiled.log
> > Daemon started.
> > Profiler running.
> > (0)-(linaro)-[Fri Apr 26][22:05:25]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ time ./../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode -b input_base_4CIF_96bps.mpg -o3 output_base_4CIF_96bps_%03d
> > saving output_base_4CIF_96bps_000.ppm
> > saving output_base_4CIF_96bps_001.ppm
> > saving output_base_4CIF_96bps_002.ppm
> > saving output_base_4CIF_96bps_003.ppm
> > saving output_base_4CIF_96bps_004.ppm
> > saving output_base_4CIF_96bps_005.ppm
> > saving output_base_4CIF_96bps_006.ppm
> > saving output_base_4CIF_96bps_007.ppm
> > saving output_base_4CIF_96bps_008.ppm
> >
> > real 0m1.593s
> > user 0m1.490s
> > sys 0m0.100s
> > (0)-(linaro)-[Fri Apr 26][22:05:54]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --dump
> > (0)-(linaro)-[Fri Apr 26][22:06:06]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opreport -l ../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode
> > WARNING! The OProfile kernel driver reports sample buffer overflows.
> > Such overflows can result in incorrect sample attribution, invalid sample
> > files and other symptoms. See the oprofiled.log for details. You should
> > adjust your sampling frequency to eliminate (or at least minimize) these
> > overflows.
> > error: no sample files found: profile specification too strict ?
> >
> > ******************************************
> > I checked the dmesg log for anything related to the PMU:
> >
> > > dmesg | grep PMU
> > hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
> > > dmesg | grep pmu
> > registering platform device 'arm-pmu' id 0
> >
> >
> >
> >
> > On Fri, Apr 26, 2013 at 9:52 PM, Koteswararao Nelakurthi <knelakurthi@mvista.com> wrote:
> > > > opcontrol --start --image=<application name>
> > Provide the binary application name.
> > ex. opcontrol --start --image=array
> >
> > Regards
> > koteswararao
> >
> >
> > On Fri, Apr 26, 2013 at 7:17 PM, Koteswararao Nelakurthi <knelakurthi@mvista.com> wrote:
> > Dear RocChen,
> >
> > I hope I understood your situation. From the log you showed
> > in your previous mail, you have successfully updated the oprofile
> > userland tool.
> > To profile an application, you need to proceed as below:
> >
> > rm -rf /var/lib/oprofile/
> > rm -rf /root/.oprofile
> >
> > opcontrol --start --image=<application name>
> >
> > gcc -g <applicationname> -o <binary application name>
> > ex: gcc -g array.c -o array
> >
> > ./<binary application name>
> > ex. ./array
> >
> > opcontrol --dump
> >
> > opreport
> >
> > array is a sample application that simply does some multiplication etc.
> > You can use any application that puts load on the CPU, so that the
> > H/W counter can be used to count the samples.
> >
> > A lightly loaded application generates few events, because the CPU spends
> > little of its time in it, and hence samples might not be generated.
> >
> >
> > Regards
> > koteswararao
> >
> >
> >
>
>