
List:       ltp-list
Subject:    Re: [LTP] [PATCH] Re:cli/sti vs local_cmpxchg and local_add_return
From:       Mathieu Desnoyers <mathieu.desnoyers () polymtl ! ca>
Date:       2009-03-31 16:11:26
Message-ID: 20090331161126.GC25849 () Krystal

* Subrata Modak (subrata@linux.vnet.ibm.com) wrote:
> Hi Mathieu,
> 
> On Wed, 2009-03-18 at 11:29 +0530, Subrata Modak wrote:
> > Hi Mathieu,
> > 
> > On Tue, 2009-03-17 at 11:41 -0400, Mathieu Desnoyers wrote:
> > > * Subrata Modak (tosubrata@gmail.com) wrote:
> > > > Hi Mathieu,
> > > > 
> > > > On Tue, Mar 17, 2009 at 7:02 AM, Mathieu Desnoyers <
> > > > mathieu.desnoyers@polymtl.ca> wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > > I am trying to get access to some non-x86 hardware to run some atomic
> > > > > > primitive benchmarks for a paper on LTTng I am preparing. The results
> > > > > > should help make the case for the performance benefit of per-cpu atomic
> > > > > > operations over interrupt disabling. I would like to run the following
> > > > > > benchmark module with CONFIG_SMP on:
> > > > > 
> > > > > - PowerPC
> > > > > - MIPS
> > > > > - ia64
> > > > > - alpha
> > > > > 
> > > > > usage :
> > > > > make
> > > > > insmod test-cmpxchg-nolock.ko
> > > > > > insmod: error inserting 'test-cmpxchg-nolock.ko': -1 Resource temporarily unavailable
> > > > > dmesg (see dmesg output)
> > > > > 
> > > > 
> > > > With your permission, can we include this test in LTP (
> > > > http://ltp.sourceforge.net/), in some appropriate place as a small benchmark
> > > > test ?
> > > > 
> > > 
> > > Hi Subrata,
> > > 
> > > Sure, maybe you'll want to use a better interface than a module init
> > > that fails though. :)
> > 
> > Please Cc me when you come up with a better interface. Meanwhile, I will
> > find a better way to integrate this with LTP and will notify you
> > when I do. Thanks.
> 
> How about the following simple patch? It integrates the test into LTP.
> 
> Nemeth,
> 
> Comments ?
> 

It looks fine, as long as it is OK with LTP standards. Please feel free
to add it to LTP.

Thanks,

Mathieu

> > > 
> > > Mathieu
> > > 
> > > > Regards--
> > > > Subrata
> > > > 
> > > > 
> > > > > > If some of you would be kind enough to run my test module provided below
> > > > > > and send the results of these tests on a recent kernel (2.6.26~2.6.29
> > > > > > should be good) along with the machine's cpuinfo, I would greatly appreciate it.
> > > > > 
> > > > > Here are the CAS results for various Intel-based architectures :
> > > > > 
> > > > > > Architecture        | Speedup                      | CAS          | Interrupts                   |
> > > > > >                     | (cli + sti) / local cmpxchg  | local | sync | Enable (sti) | Disable (cli) |
> > > > > > --------------------------------------------------------------------------------------------------
> > > > > > Intel Pentium 4     | 5.24                         | 25    | 81   | 70           | 61            |
> > > > > > AMD Athlon(tm)64 X2 | 4.57                         | 7     | 17   | 17           | 15            |
> > > > > > Intel Core2         | 6.33                         | 6     | 30   | 20           | 18            |
> > > > > > Intel Xeon E5405    | 5.25                         | 8     | 24   | 20           | 22            |
> > > > > 
> > > > > The benefit expected on PowerPC, ia64 and alpha should principally come
> > > > > from removed memory barriers in the local primitives.
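As a loose user-space analogy for the barrier cost mentioned above (an assumption for illustration, not code from the module): C11 atomics let the caller choose the memory ordering, so a relaxed add-and-return requests no ordering barriers while a sequentially consistent one does. Unlike local_t, both variants stay atomic across CPUs, and the helper names below are hypothetical.

```c
#include <stdatomic.h>

/* Shared counter; static storage, so it starts at zero. */
static atomic_int counter;

/* Fully ordered add-and-return: the compiler emits whatever barriers
 * the target architecture needs for sequential consistency. */
int add_return_ordered(int n)
{
    return atomic_fetch_add_explicit(&counter, n, memory_order_seq_cst) + n;
}

/* Relaxed add-and-return: the update is still atomic, but no ordering
 * with respect to other memory accesses is requested. */
int add_return_relaxed(int n)
{
    return atomic_fetch_add_explicit(&counter, n, memory_order_relaxed) + n;
}
```

Both functions return the same values; any difference is in the instructions generated around the update, which is the kind of cost the benchmark is meant to expose on weakly ordered machines.
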
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Mathieu
> > > > > 
> > > > > P.S. please forgive the coding style and hackish interface. :)
> > > > > 
> ---
> 
> --- ltp-full-20090331.orig/testcases/kernel/device-drivers/misc_modules/per_cpu_atomic_operations_vs_interrupt_disabling_module/Makefile	1970-01-01 05:30:00.000000000 +0530
> +++ ltp-full-20090331/testcases/kernel/device-drivers/misc_modules/per_cpu_atomic_operations_vs_interrupt_disabling_module/Makefile	2009-03-31 20:33:16.000000000 +0530
> @@ -0,0 +1,20 @@
> +ifneq ($(KERNELRELEASE),)
> +         obj-m += test-cmpxchg-nolock.o
> +else
> +KERNELDIR ?= /lib/modules/$(shell uname -r)/build
> +PWD := $(shell pwd)
> +KERNELRELEASE = $(shell cat $(KERNELDIR)/$(KBUILD_OUTPUT)/include/linux/version.h | sed -n 's/.*UTS_RELEASE.*\"\(.*\)\".*/\1/p')
> +ifneq ($(INSTALL_MOD_PATH),)
> +         DEPMOD_OPT := -b $(INSTALL_MOD_PATH)
> +endif
> +        
> +default:
> +	$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
> +        
> +modules_install:
> +	$(MAKE) -C $(KERNELDIR) M=$(PWD) modules_install
> +	if [ -f $(KERNELDIR)/$(KBUILD_OUTPUT)/System.map ] ; then /sbin/depmod -ae -F $(KERNELDIR)/$(KBUILD_OUTPUT)/System.map $(DEPMOD_OPT) $(KERNELRELEASE) ; fi
> +        
> +clean:
> +	$(MAKE) -C $(KERNELDIR) M=$(PWD) clean
> +endif
> --- ltp-full-20090331.orig/testcases/kernel/device-drivers/misc_modules/per_cpu_atomic_operations_vs_interrupt_disabling_module/test-cmpxchg-nolock.c	1970-01-01 05:30:00.000000000 +0530
> +++ ltp-full-20090331/testcases/kernel/device-drivers/misc_modules/per_cpu_atomic_operations_vs_interrupt_disabling_module/test-cmpxchg-nolock.c	2009-03-31 20:34:04.000000000 +0530
> @@ -0,0 +1,301 @@
> +/******************************************************************************/
> +/*                                                                            */
> +/* Copyright (c) Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>, 2009       */
> +/*                                                                            */
> +/* This program is free software;  you can redistribute it and/or modify      */
> +/* it under the terms of the GNU General Public License as published by       */
> +/* the Free Software Foundation; either version 2 of the License, or          */
> +/* (at your option) any later version.                                        */
> +/*                                                                            */
> +/* This program is distributed in the hope that it will be useful,            */
> +/* but WITHOUT ANY WARRANTY;  without even the implied warranty of            */
> +/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See                  */
> +/* the GNU General Public License for more details.                           */
> +/*                                                                            */
> +/* You should have received a copy of the GNU General Public License          */
> +/* along with this program;  if not, write to the Free Software               */
> +/* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA    */
> +/*                                                                            */
> +/* usage :
> +        make
> +        insmod test-cmpxchg-nolock.ko
> +        insmod: error inserting 'test-cmpxchg-nolock.ko': 
> +                -1 Resource temporarily unavailable
> +        dmesg (see dmesg output)                                              */
> +/******************************************************************************/
> +
> +
> +
> +/* test-cmpxchg-nolock.c
> +*
> +* Compare local cmpxchg with irq disable / enable.
> +*/
> +
> +
> +#include <linux/jiffies.h>
> +#include <linux/compiler.h>
> +#include <linux/init.h>
> +#include <linux/module.h>
> +#include <linux/math64.h>
> +#include <asm/timex.h>
> +#include <asm/system.h>
> +
> +#define NR_LOOPS 20000
> +
> +int test_val;
> +
> +static void do_testbaseline(void)
> +{
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       asm volatile ("");
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for baseline\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> baseline takes %llu cycles\n", time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +static void do_test_sync_cmpxchg(void)
> +{
> +       int ret;
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +#ifdef CONFIG_X86_32
> +       ret = sync_cmpxchg(&test_val, 0, 0);
> +#else
> +       ret = cmpxchg(&test_val, 0, 0);
> +#endif
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for locked cmpxchg\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> locked cmpxchg takes %llu cycles\n", time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +static void do_test_cmpxchg(void)
> +{
> +       int ret;
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       ret = cmpxchg_local(&test_val, 0, 0);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for non locked cmpxchg\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> non locked cmpxchg takes %llu cycles\n", time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +static void do_test_sync_inc(void)
> +{
> +       int ret;
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +       atomic_t val = ATOMIC_INIT(0);
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       ret = atomic_add_return(10, &val);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for locked add return\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> locked add return takes %llu cycles\n", time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +
> +static void do_test_inc(void)
> +{
> +       int ret;
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +       local_t loc_val = LOCAL_INIT(0);
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       ret = local_add_return(10, &loc_val);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for non locked add return\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> non locked add return takes %llu cycles\n", time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +
> +
> +/*
> + * This test will have a higher standard deviation due to incoming interrupts.
> + */
> +static void do_test_enable_int(void)
> +{
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       local_irq_restore(flags);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for enabling interrupts (STI)\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> enabling interrupts (STI) takes %llu cycles\n",
> +       time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +static void do_test_disable_int(void)
> +{
> +       unsigned long flags, flags2;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       local_irq_save(flags2);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for disabling interrupts (CLI)\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> disabling interrupts (CLI) takes %llu cycles\n",
> +       time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +static void do_test_int(void)
> +{
> +       unsigned long flags;
> +       unsigned int i;
> +       cycles_t time1, time2, time;
> +       u32 rem;
> +
> +       local_irq_save(flags);
> +       preempt_disable();
> +       time1 = get_cycles();
> +       for (i = 0; i < NR_LOOPS; i++) {
> +       local_irq_restore(flags);
> +       local_irq_save(flags);
> +       }
> +       time2 = get_cycles();
> +       local_irq_restore(flags);
> +       preempt_enable();
> +       time = time2 - time1;
> +
> +       printk(KERN_ALERT "test results: time for disabling/enabling interrupts (STI/CLI)\n");
> +       printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
> +       printk(KERN_ALERT "total time: %llu\n", time);
> +       time = div_u64_rem(time, NR_LOOPS, &rem);
> +       printk(KERN_ALERT "-> enabling/disabling interrupts (STI/CLI) takes %llu cycles\n",
> +       time);
> +       printk(KERN_ALERT "test end\n");
> +}
> +
> +
> +
> +static int ltt_test_init(void)
> +{
> +       printk(KERN_ALERT "test init\n");
> +
> +       do_testbaseline();
> +       do_test_sync_cmpxchg();
> +       do_test_cmpxchg();
> +       do_test_sync_inc();
> +       do_test_inc();
> +       do_test_enable_int();
> +       do_test_disable_int();
> +       do_test_int();
> +       return -EAGAIN; /* Fail will directly unload the module */
> +}
> +
> +static void ltt_test_exit(void)
> +{
> +       printk(KERN_ALERT "test exit\n");
> +}
> +
> +module_init(ltt_test_init)
> +module_exit(ltt_test_exit)
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Mathieu Desnoyers");
> +MODULE_DESCRIPTION("Cmpxchg vs int Test");
> 
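The measurement pattern shared by every do_test_*() function in the patch (timestamp, run NR_LOOPS iterations, timestamp, divide) can be tried out in user space. The sketch below is only an analogue under stated assumptions: clock() stands in for get_cycles(), a C11 atomic_compare_exchange_strong() stands in for cmpxchg(), and there is no user-space counterpart for cmpxchg_local() or for disabling interrupts. The helper names per_loop() and time_sync_cas() are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define NR_LOOPS 20000

/* Target of the compare-and-swap; static storage, so it starts at zero. */
static atomic_int test_val;

/* Average cost per iteration, mirroring div_u64_rem(time, NR_LOOPS, &rem)
 * in the module (the remainder is discarded there as well). */
uint64_t per_loop(uint64_t total, uint64_t loops)
{
    assert(loops > 0);
    return total / loops;
}

/* Time NR_LOOPS fully ordered compare-and-swap operations.
 * Returns the elapsed time in clock() ticks, not CPU cycles. */
uint64_t time_sync_cas(void)
{
    clock_t t1, t2;
    unsigned int i;

    t1 = clock();
    for (i = 0; i < NR_LOOPS; i++) {
        int expected = 0;
        /* Same shape as cmpxchg(&test_val, 0, 0): compare with 0, store 0. */
        atomic_compare_exchange_strong(&test_val, &expected, 0);
    }
    t2 = clock();
    return (uint64_t)(t2 - t1);
}
```

Calling per_loop(time_sync_cas(), NR_LOOPS) gives the average ticks per operation. With only 20000 iterations this may well round down to zero at clock() resolution, which is one reason the module reads the cycle counter instead.
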
> ---
> Regards--
> Subrata
> 
> > 
> > Regards--
> > Subrata
> > 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

------------------------------------------------------------------------------
_______________________________________________
Ltp-list mailing list
Ltp-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ltp-list

