'Re: [groovy-user] Gpars making my code slower?'

List:       groovy-user
Subject:    Re: [groovy-user] Gpars making my code slower?
From:       Sven Haiges <sven.haiges () googlemail ! com>
Date:       2009-10-28 22:16:21
Message-ID: 45ac42bc0910281516mff8a5ecv7b58848dfd2b9723 () mail ! gmail ! com

Thanks. I must admit I have not had time so far, but I'll think about using
eachAsync in the future.

Yes, the poll might be interesting.

Cheers
Sven

On Mon, Oct 26, 2009 at 4:42 AM, Vaclav Pech <vaclav.pech@seznam.cz> wrote:
> Hi Sven,
>
> sorry for having misguided you. The eachParallel() method is a renamed
> eachAsync() method and is only available in the recent gpars builds. So
> stick with eachAsync() for your code. The functionality is identical.
> BTW, thank you for the gpars poll at
> http://www.grailspodcast.com/blog/id/523
> I am sure it will be pretty helpful.
>
> Cheers,
>
> Vaclav
>
>
> Sven Haiges wrote:
>>
>> Hi Vaclav,
>>
>> thanks for the info. I'll try that out and let you know. I was not using
>> eachParallel... maybe that's the reason then.
>>
>> Cheers
>> Sven
>>
>> On Thu, Oct 22, 2009 at 9:43 PM, Vaclav Pech <vaclav.pech@seznam.cz> wrote:
>>
>>>
>>> Hi Sven,
>>>
>>> the eachAsync() methods are supposed to work on all objects just like
>>> each() does, including strings or maps. The code below works just fine:
>>> Parallelizer.withParallelizer {
>>>  // notice the method name change in recent gpars :)
>>>  [a:1].eachParallel {println it.value}
>>> }
>>> Are you sure you invoke eachAsync() within the 'withParallelizer'
>>> block?
>>> The withParallelizer block uses the Groovy category mechanism, so it only
>>> enhances the calling thread.
>>> Maybe you're trying to nest eachAsync()?
>>> withParallelizer {
>>>  images.eachAsync {
>>>      // BANG! No eachAsync() here, it is a different thread, you have
>>>      // to enhance it too, perhaps using withExistingParallelizer(pool)
>>>      it.eachAsync()
>>>  }
>>> }
>>>
>>> I wonder whether this is the case?
>>>
>>> Cheers,
>>>
>>> Vaclav
>>>
>>>
>>>
>>> Sven Haiges wrote:
>>>
>>>>
>>>> Hi Vaclav, all,
>>>>
>>>> why is there no eachAsync on a Map? I am just running into this gotcha
>>>> again, I parsed the site->file mapping into a map and then tried to
>>>> .eachAsync over it, but it seems only Collections are supported. Is
>>>> there a specific reason this is not implemented on the Maps? Or does
>>>> it exist and I just did not see it?
>>>>
>>>> Cheers
>>>> Sven
>>>>
>>>> On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@googlemail.com> wrote:
>>>>
>>>>
>>>>>
>>>>> Hi Vaclav,
>>>>>
>>>>> thanks for the info. It seems I cleared my inbox too radically, which
>>>>> is why I am replying so late. I'll be moving future questions to the
>>>>> gpars list but keeping this thread here for now.
>>>>>
>>>>> The bean access that I wanted to do in parallel and then the
>>>>> processing is for a classpath resource:
>>>>>
>>>>> def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>>>>
>>>>> 'site' will make sure I do not access the same bean twice, so there
>>>>> could only be some general Spring init code that is touched by
>>>>> multiple threads and slows it down... interesting.
>>>>>
>>>>> In case this is interesting, the beans themselves look like this:
>>>>>
>>>>> dtid_pv_map_ca(org.springframework.core.io.ClassPathResource, 'dtid_pv_ca') { }
>>>>>
>>>>> These are files of a couple kilobytes in size that I need to
>>>>> preprocess and load in some way into memory during startup time.
>>>>>
>>>>> I'll experiment a bit, like trying to access the beans first and then
>>>>> eachAsync'ing over them instead of the site values.
>>>>>
>>>>> Thanx
>>>>> Sven
>>>>>
>>>>> On Thu, Oct 22, 2009 at 8:12 AM, Vaclav Pech <vaclav.pech@seznam.cz> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Hi Sven,
>>>>>>
>>>>>> I'm so glad you jumped into GPars, welcome on board.
>>>>>> I measured the overhead on my somewhat oldish dual-core to give you
>>>>>> some rough estimates. Bootstrapping the thread pool takes about 30 ms,
>>>>>> installing a category (to enable the xxxAsync() methods) takes roughly
>>>>>> 190 ms, and calling eachAsync() itself takes about 60 ms longer than
>>>>>> calling ordinary each(). All summed up, the direct overhead imposed by
>>>>>> using the pool in your particular case could be around 280 ms. This
>>>>>> makes me believe you might be able to get to about 700 ms for the
>>>>>> parallel variant of your code on a dual core. Not a big win compared
>>>>>> to the 1000 ms sequential version, in my opinion, but still much
>>>>>> better than the numbers you're currently getting.
>>>>>>
>>>>>> Since eachAsync behaves as expected and provides adequate speed-up for
>>>>>> pure CPU-intensive, mutually independent calculations without any
>>>>>> shared state, the trouble probably lies in the code you're calling in
>>>>>> parallel. And since you actually see performance degradation with
>>>>>> increasing thread count, I'd suspect one of the methods (maybe
>>>>>> ctx.getBean()) misbehaves when called simultaneously from multiple
>>>>>> threads. I could, for example, think of a poorly done lazy
>>>>>> initialization, which will be repeated for each calling thread if the
>>>>>> threads ask for a resource at roughly the same time.
>>>>>> It is quite difficult to go any further without actually touching the
>>>>>> code and measuring each line individually. If you feel brave enough,
>>>>>> you may try to experiment with synchronizing access to ctx or
>>>>>> servletContext, or with preinitializing ctx before you call eachAsync.
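
The last suggestion (touch the context on one thread first, then run only the
independent parsing in parallel) might look roughly like the sketch below. It
uses plain Java with java.util.concurrent rather than GPars so that it stands
on its own. loadLines() is a hypothetical stand-in for
ctx.getBean(...).file.readLines(), and the class and method names are made up
for illustration; this is not code from the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PreloadThenParse {

    // Hypothetical stand-in for ctx.getBean("dtid_pv_map_" + site).file.readLines().
    static List<String> loadLines(String site) {
        return Arrays.asList(site + "1,10", site + "2,20");
    }

    // Parses "dtid,pv" lines into a map, mirroring the closure in the original post.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> result = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            result.put(parts[0], Long.parseLong(parts[1].trim()));
        }
        return result;
    }

    static Map<String, Map<String, Long>> loadAll(List<String> sites, int threads) {
        // Step 1: resolve every resource sequentially on the calling thread, so any
        // lazy initialization behind the lookup happens exactly once.
        Map<String, List<String>> raw = new LinkedHashMap<>();
        for (String site : sites) {
            raw.put(site, loadLines(site));
        }

        // Step 2: only the mutually independent parsing is handed to the pool.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Map<String, Long>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> entry : raw.entrySet()) {
                final List<String> lines = entry.getValue();
                Callable<Map<String, Long>> task = () -> parse(lines);
                pending.put(entry.getKey(), pool.submit(task));
            }

            // Collect the results on the calling thread; this is also where they
            // could be published (e.g. as servlet context attributes).
            Map<String, Map<String, Long>> out = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Map<String, Long>>> entry : pending.entrySet()) {
                out.put(entry.getKey(), entry.getValue().get());
            }
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel parsing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this helps depends on where the time really goes: if the lookup
dominates, the parallel step has little left to speed up, which would itself
point at the lookup as the bottleneck.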
>>>>>>
>>>>>> I hope this helps you move forward.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Vaclav
>>>>>>
>>>>>>
>>>>>> Sven Haiges wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> catchy title, I know, I also know the problem most likely is on my
>>>>>>> side but I need help to understand why.  Here is a piece of code,
>>>>>>> executing in the Grails Bootstrap, so whenever the app is started up.
>>>>>>> I need to load a couple of classpath resources (referenced as spring
>>>>>>> beans) and load the data into maps which are then stored in the
>>>>>>> servletcontext.
>>>>>>>
>>>>>>> The code took 983ms to load all the data previously. I then began to
>>>>>>> use parallelizer.eachAsync, hoping to speed things up, but instead it
>>>>>>> takes anywhere from double the time to even more:
>>>>>>>
>>>>>>> def sites = ['a', 'b', 'c'] // about 33 values here
>>>>>>>
>>>>>>> def start = System.currentTimeMillis()
>>>>>>> Parallelizer.withParallelizer(4) {
>>>>>>>     sites.eachAsync { site ->
>>>>>>>         def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>>>>>>         def lines = dtid_imp_map.file.readLines()
>>>>>>>         def themap = [:]
>>>>>>>         lines.each { line ->
>>>>>>>             def (dtid, pv) = line.split(',')
>>>>>>>             themap[dtid] = pv.toLong()
>>>>>>>         }
>>>>>>>         servletContext.setAttribute("dtid_pv_map_${site}", themap)
>>>>>>>     }
>>>>>>> }
>>>>>>> println("DTID_PV_MAP loading took ${System.currentTimeMillis() - start}")
>>>>>>>
>>>>>>> I know that I am potentially accessing the servletContext from
>>>>>>> multiple threads at the same time, but as the attribute name is never
>>>>>>> the same, I hope there is no concurrency issue here... otherwise I
>>>>>>> guess I'd have to synchronize {} it. The parallelizer above was run
>>>>>>> with no special settings and took about twice as long (no setting =
>>>>>>> number of threads = 2 on my MacBook Pro, 2 cores). With 4 it takes
>>>>>>> about 2800 ms, which is about 2 secs longer than running this without
>>>>>>> any parallelizer code.
>>>>>>>
>>>>>>> So still, leaving my concurrency troubles behind, why is the
>>>>>>> Parallelizer code slower? Because the ramp-up time to get the threads
>>>>>>> going takes too long? If that is the case, what is the typical time
>>>>>>> needed to get the threads going? In my case that time is about 1000 ms,
>>>>>>> which sounds far off by my understanding... I think creating threads
>>>>>>> is cheaper :-)
>>>>>>>
>>>>>>> Cheers
>>>>>>> Sven
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe from this list, please visit:
>>>>>>
>>>>>>  http://xircles.codehaus.org/manage_email
>>>>>
>>>>> --
>>>>> Sven Haiges
>>>>> sven.haiges@googlemail.com
>>>>>
>>>>> Yahoo Messenger / Skype: hansamann
>>>>> Personal Homepage, Wiki & Blog: http://www.svenhaiges.de
>>>>>
>>>>> Subscribe to the Grails Podcast:
>>>>> http://www.grailspodcast.com
>>>>>

-- 
Sven Haiges
sven.haiges@googlemail.com

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com
