List:       grid-engine-dev
Subject:    Re: [GE dev] Design error in GE 6 thread handling
From:       Nick Maclaren <nmm1 () cus ! cam ! ac ! uk>
Date:       2004-10-06 21:23:22
Message-ID: E1CFJFq-0001Kh-Uo () virgo ! cus ! cam ! ac ! uk

> It was not possible for me to find out any problem by setting
> 
> "ulimit -v 2000000"
> 
> but I could produce a segmentation fault error with
> 
> "ulimit -v 12000"

Odd.  I may try taking a look at libraries, malloc, shmem etc., as
that could be an oddity of certain configuration options.  If so,
it could affect other programs.  Failing with 12000 is a little more
reasonable than failing with 2000000!

> No commlib function should be called without successful running of
> cl_com_setup_commlib() !
>
> By checking the correct behaviour of the qmaster initialization I found
> out, that the result from cl_com_setup_commlib() is ignored and the
> qmaster continues just reporting an error, but the correct behaviour is
> to stop program execution.

Ah.  The code that I looked at didn't exactly ignore it, but called
the various cleanup routines.  And it was those that were bombing
out, for the reason I described.

Yes, fixing it to stop immediately is the easiest (and probably most
reliable) solution.
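
As a rough illustration of the "stop immediately" approach, here is a
minimal sketch in C.  It does not use the real commlib interface:
setup_library(), its simulate_failure argument, the SETUP_* codes and
startup_status() are all hypothetical names invented for this example,
standing in for the setup call and the daemon's startup path.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical return codes; NOT the commlib's real error values. */
enum { SETUP_OK = 0, SETUP_FAILED = 1 };

/* Hypothetical stand-in for a library setup call such as
 * cl_com_setup_commlib().  The flag only lets the example
 * simulate a failed setup. */
int setup_library(int simulate_failure)
{
    return simulate_failure ? SETUP_FAILED : SETUP_OK;
}

/* Returns the exit status the daemon should terminate with.
 * On a failed setup it reports the error and returns at once,
 * without calling any cleanup routine that would assume the
 * library had been initialised. */
int startup_status(int simulate_failure)
{
    int rc = setup_library(simulate_failure);

    if (rc != SETUP_OK) {
        fprintf(stderr, "library setup failed (rc=%d), stopping\n", rc);
        return EXIT_FAILURE;
    }

    /* Only reached when setup succeeded; normal startup would
     * continue from here. */
    return EXIT_SUCCESS;
}
```

A caller would do something like "return startup_status(0);" from
main().  The point is only that the failure path leaves before any
routine that depends on a successful setup can run.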


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
