[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR: 2178143: VM crashes if the number of bound CPUs changed during runtime
From:       David Holmes <david.holmes () oracle ! com>
Date:       2013-03-28 5:31:29
Message-ID: 5153D5B1.9080802 () oracle ! com
[Download RAW message or body]

So if I'm reading this right:

+   if (AssumeMP && !UseSerialGC) {
+     if (FLAG_IS_DEFAULT(ParallelGCThreads) && ParallelGCThreads == 1) {

The FLAG_IS_DEFAULT will be true if -XX:ParallelGCThreads=N was not 
specified, even though ParallelGCThreads was subsequently set 
programmatically to match the number of processors. So we will only warn if:
- AssumeMP is true
- we're not using serial GC
- the user didn't set -XX:ParallelGCThreads
- but parallelGCThreads ==1 (indicating one core available)

That sounds like what I wanted to, but was unable to, express.

Thanks,
David

On 28/03/2013 5:40 AM, Yumin Qi wrote:
> new webrev at same link:
>
> http://cr.openjdk.java.net/~minqi/2178143/
>
> Thanks
> Yumin
>
> On 3/27/2013 12:22 PM, Yumin Qi wrote:
>> Jon,
>>
>>   The warning is for a immediate attention, with PrintGCDetail, that
>> requires it set on  in command line, right?
>>
>>   I also found that with default setting (no GC flavor set), no
>> setting for ParallelGCThreads, it will be 0.
>>
>>   so with java -XX:+AssumeMP -version, it will give warning, this is
>> not right.
>>
>>     // Set per-collector flags
>>   if (UseParallelGC || UseParallelOldGC) {
>>     set_parallel_gc_flags();
>>   } else if (UseConcMarkSweepGC) { // should be done before ParNew
>> check below
>>     set_cms_and_parnew_gc_flags();
>>   } else if (UseParNewGC) {  // skipped if CMS is set above
>>     set_parnew_gc_flags();
>>   } else if (UseG1GC) {
>>     set_g1_gc_flags();
>>   }
>>   check_deprecated_gcs();
>>   check_deprecated_gc_flags();
>>
>>    With default GC (Btw, what is the default GC collector?),
>> ParallelGCThreads = 0.
>>
>>
>>
>>    So I will change code to:
>>
>>     if (AssumeMP && !UseSerialGC) {
>>     if (FLAG_IS_DEFAULT(ParallelGCThreads) && ParallelGCThreads == 1) {
>>       warning("If the number of processors is expected to increase
>> from one, then"
>>               " you should configure the number of parallel GC threads
>> appropriately"
>>               " using -XX:ParallelGCThreads=N");
>>     }
>>   }
>>
>> Will send another webrev after test.
>>
>> Thanks
>> Yumin
>>
>> On 3/27/2013 11:17 AM, Jon Masamitsu wrote:
>>> Yumin,
>>>
>>> How about adding PrintGCDetails to the guard?
>>>
>>>   if (AssumeMP && !UseSerialGC && PrintGCDetails) {.
>>>
>>> Jon
>>>
>>> On 3/27/2013 9:41 AM, Yumin Qi wrote:
>>>> so the warning should be
>>>>
>>>>   if (AssumeMP && !UseSerialGC) {
>>>>     if (FLAG_IS_DEFAULT(ParallelGCThreads) && ParallelGCThreads < 2) {
>>>>       warning("If the number of processors is expected to increase
>>>> from one, then"
>>>>               " you should configure the number of parallel GC
>>>> threads appropriately"
>>>>               " using -XX:ParallelGCThreads=N");
>>>>     }
>>>>   }
>>>>
>>>> So the warning will only be issued when AssumeMP turned on and the
>>>> ncpus =1. If user add cmdline -XX:ParallelGCThreads=1 which is not
>>>> the default value 0, so the warning will not be triggered. I will do
>>>> a test.
>>>>
>>>> Is this better?
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/26/2013 9:54 PM, David Holmes wrote:
>>>>> On 27/03/2013 2:06 PM, Yumin Qi wrote:
>>>>>> David,
>>>>>>
>>>>>> On 3/26/2013 7:49 PM, David Holmes wrote:
>>>>>>> Well now you have confused me. I thought checking
>>>>>>> FLAG_IS_DEFAULT(ParallelGCThreads) was the right thing to do (the
>>>>>>> user
>>>>>>> has not attempted manage the number of GC threads even though they
>>>>>>> recognize the number of processors may start at 1 and increase
>>>>>>> later).
>>>>>>> I do not understand the new <2 test - maybe I really only want 1,
>>>>>>> if I
>>>>>>> asked for that why should I get a warning ???
>>>>>>>
>>>>>> The previous code is issuing the warning before gc set flags
>>>>>> (includes
>>>>>> ParallelGCThreads), we don't want with AssumeMP turned on, running
>>>>>> on a
>>>>>> MP issues the warning since ParallelGCThreads will be not 1 but the
>>>>>> number of calculation based on ncpus.  So I moved the check code
>>>>>> after
>>>>>> GC set flags.  After GC set flags, ParallelGCThreads is not the
>>>>>> default
>>>>>> value (0) so I check if it is < 2.
>>>>>
>>>>> I was overlooking the case where you use +AssumeMP but have
>>>>> multiple processors anyway. But if I explicitly set
>>>>> ParallelGCThreads=1 then I don't expect to get a warning. Not sure
>>>>> we can distinguish these cases though.
>>>>>
>>>>> Again I leave this to you and Jon to finalize.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> If users really want to run with single CPU with -XX:+AssumeMP,
>>>>>> giving a
>>>>>> warning is reasonable (this is our intention here) --- there is
>>>>>> possibility they will increase number of cpus even they don't,
>>>>>> that is OK.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>> But, this is a GC warning (and I think it is misplaced anyway) so if
>>>>>>> Jon is happy with this behaviour so be it.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>> On 27/03/2013 5:44 AM, Yumin Qi wrote:
>>>>>>>> Sorry, copied wrong link. It is at same link.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>> On 3/26/2013 12:42 PM, Daniel D. Daugherty wrote:
>>>>>>>>>
>>>>>>>>> On 3/26/13 1:17 PM, Yumin Qi wrote:
>>>>>>>>>> Hi, Jon
>>>>>>>>>>
>>>>>>>>>>   I made a change to the warning condition and move it after
>>>>>>>>>> GC flags
>>>>>>>>>> are set. I think at this time, ParallelGCThreads has a value
>>>>>>>>>> != 0.
>>>>>>>>>> Please take a look if it is right.  The This is the new webrev:
>>>>>>>>>>
>>>>>>>>>> http://javaweb.us.oracle.com/~yqi/webrev/2178143/
>>>>>>>>>
>>>>>>>>> This webrev isn't accessible outside Oracle...
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   Tested with platform of ncpus > 1 and no ParallelGCThreads
>>>>>>>>>> set with
>>>>>>>>>> turning on AssumeMP.  The warning is gone.
>>>>>>>>>>    Also tested with ncpus = 1 and the warning is on too.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>>
>>>>>>>>>> On 3/26/2013 8:03 AM, Jon Masamitsu wrote:
>>>>>>>>>>> If I set AssumeMP on the command line and nothing else
>>>>>>>>>>> and am running on a platform where I get ParallelGCThreads > 1,
>>>>>>>>>>> am I going to see the warning?    If yes, that seems too
>>>>>>>>>>> intrusive.
>>>>>>>>>>> I can see users hearing about the flag and deciding to include
>>>>>>>>>>> it always just to be safe.
>>>>>>>>>>>
>>>>>>>>>>> Jon
>>>>>>>>>>>
>>>>>>>>>>> On 3/25/13 11:21 PM, Yumin Qi wrote:
>>>>>>>>>>>> You are right, thanks! I misread the INCLUDE.
>>>>>>>>>>>> new web in the same link.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Yumin
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/25/2013 11:00 PM, David Holmes wrote:
>>>>>>>>>>>>> On 26/03/2013 3:56 PM, Yumin Qi wrote:
>>>>>>>>>>>>>> David,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/25/2013 10:41 PM, David Holmes wrote:
>>>>>>>>>>>>>>> On 26/03/2013 3:30 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>>> David and All,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Changed as suggested by David.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David, I did not move the warning into
>>>>>>>>>>>>>>>> set_parallel_gc_flags()
>>>>>>>>>>>>>>>> since
>>>>>>>>>>>>>>>> except for serial gc, all other GCs will check this
>>>>>>>>>>>>>>>> flag, so I
>>>>>>>>>>>>>>>> moved it
>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>> #if INCLUDE_ALL_GCS
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> before call set GC flags.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I didn't realize all the other GCs would utilize
>>>>>>>>>>>>>>> ParallelGCThreads.
>>>>>>>>>>>>>>> Can we at least exclude it for serial GC case?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Serial GC is set here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> INCLUDE_ALL_GCS allows all GCs including Serial GC.
>>>>>>>>>>>>> Otherwise we
>>>>>>>>>>>>> only have SerialGC. So you have excluded it for the case where
>>>>>>>>>>>>> SerialGC is the only GC built into the VM, but not for the
>>>>>>>>>>>>> case
>>>>>>>>>>>>> where all GCs are available and the user requests UseSerialGC.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3253 #if !INCLUDE_ALL_GCS
>>>>>>>>>>>>>> 3254   force_serial_gc();
>>>>>>>>>>>>>> 3255 #endif // INCLUDE_ALL_GCS
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So after it moved into
>>>>>>>>>>>>>> 3283 #if INCLUDE_ALL_GCS
>>>>>>>>>>>>>> 3284   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>>>>>>> 3285     warning("If the number of processors is expected to
>>>>>>>>>>>>>> increase
>>>>>>>>>>>>>> from one, then"
>>>>>>>>>>>>>> 3286             " you should configure the number of
>>>>>>>>>>>>>> parallel GC
>>>>>>>>>>>>>> threads appropriately"
>>>>>>>>>>>>>> 3287             " using -XX:ParallelGCThreads=N");
>>>>>>>>>>>>>> 3288   }
>>>>>>>>>>>>>> 3289   // Set per-collector flags
>>>>>>>>>>>>>> 3290   if (UseParallelGC || UseParallelOldGC) {
>>>>>>>>>>>>>> 3291     set_parallel_gc_flags();
>>>>>>>>>>>>>> 3292   } else if (UseConcMarkSweepGC) { // should be done
>>>>>>>>>>>>>> before
>>>>>>>>>>>>>> ParNew
>>>>>>>>>>>>>> check below
>>>>>>>>>>>>>> 3293     set_cms_and_parnew_gc_flags();
>>>>>>>>>>>>>> 3294   } else if (UseParNewGC) {  // skipped if CMS is set
>>>>>>>>>>>>>> above
>>>>>>>>>>>>>> 3295     set_parnew_gc_flags();
>>>>>>>>>>>>>> 3296   } else if (UseG1GC) {
>>>>>>>>>>>>>> 3297     set_g1_gc_flags();
>>>>>>>>>>>>>> 3298   }
>>>>>>>>>>>>>> 3299   check_deprecated_gcs();
>>>>>>>>>>>>>> 3300 #else // INCLUDE_ALL_GCS
>>>>>>>>>>>>>> 3301   assert(verify_serial_gc_flags(), "SerialGC unset");
>>>>>>>>>>>>>> 3302 #endif // INCLUDE_ALL_GCS
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is excluded for serial GC.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Otherwise looks good.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> new webrev: http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 3/25/2013 5:44 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have a few suggested changes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> First in os.hpp I suggest
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  return _processor_count > 1 || AssumeMP ;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That way we don't bypass the assertion and don't encounter
>>>>>>>>>>>>>>>>> unnecessary
>>>>>>>>>>>>>>>>> overhead on the common path.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In arguments.cpp:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +   if (AssumeMP && FLAG_IS_DEFAULT(ParallelGCThreads)) {
>>>>>>>>>>>>>>>>> +     warning("With AssumeMP, recommend run with
>>>>>>>>>>>>>>>>> -XX:ParallelGCThreads=<num_of_gc_threads>, where"
>>>>>>>>>>>>>>>>> +             "  num_of_gc_threads can be set to number of
>>>>>>>>>>>>>>>>> cores");
>>>>>>>>>>>>>>>>> +   }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm not convinced issuing a warning is necessary or
>>>>>>>>>>>>>>>>> desirable
>>>>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>>>> With a uniprocessor startup will we even default to using
>>>>>>>>>>>>>>>>> parallel GC?
>>>>>>>>>>>>>>>>> The above should only be issued if using parallel GC - so
>>>>>>>>>>>>>>>>> better to
>>>>>>>>>>>>>>>>> move it into set_parallel_gc_flags().
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In any case it isn't clear what the user should supply
>>>>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>>>> If they
>>>>>>>>>>>>>>>>> use the maximum expected number of cores initially then
>>>>>>>>>>>>>>>>> while
>>>>>>>>>>>>>>>>> they
>>>>>>>>>>>>>>>>> only have 1 core their VM may not even function
>>>>>>>>>>>>>>>>> adequately due
>>>>>>>>>>>>>>>>> to GC
>>>>>>>>>>>>>>>>> overhead. I think all you can say here is something like:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> warning("If the number of processors is expected to
>>>>>>>>>>>>>>>>> increase
>>>>>>>>>>>>>>>>> from one,
>>>>>>>>>>>>>>>>> then you should configure the number of parallel GC
>>>>>>>>>>>>>>>>> threads
>>>>>>>>>>>>>>>>> appropriately using -XX:ParallelGCThreads=N");
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In globals.hpp:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +   product(bool, AssumeMP, false, \
>>>>>>>>>>>>>>>>> +           "Assume run on multiple processors always") \
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Was there any discussion on making this default to true
>>>>>>>>>>>>>>>>> instead?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Suggested re-wording:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  "Instruct the VM to assume multiple processors are
>>>>>>>>>>>>>>>>> available"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 26/03/2013 9:15 AM, Yumin Qi wrote:
>>>>>>>>>>>>>>>>>> It should be "AssumeMP".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /Yumin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 3/25/2013 3:32 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>   New webrev to use "-XX:+AssumMP" flag. Also add
>>>>>>>>>>>>>>>>>>> warning to
>>>>>>>>>>>>>>>>>>> recommend
>>>>>>>>>>>>>>>>>>> use this flag with "-XX:ParallelGCThreads" to remind
>>>>>>>>>>>>>>>>>>> user to
>>>>>>>>>>>>>>>>>>> avoid
>>>>>>>>>>>>>>>>>>> running with only one GC thread. This is verified by
>>>>>>>>>>>>>>>>>>> Oleksandr with
>>>>>>>>>>>>>>>>>>> the test case running on Linux which is not Zone
>>>>>>>>>>>>>>>>>>> configured.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>   Same link.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 3/20/2013 2:27 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2178143:  VM crashes if the number of bound CPUs
>>>>>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>>>>>> during
>>>>>>>>>>>>>>>>>>>> runtime.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Situation: Customer first configure only one CPU
>>>>>>>>>>>>>>>>>>>> online and
>>>>>>>>>>>>>>>>>>>> turn
>>>>>>>>>>>>>>>>>>>> others offline to run java application, after java
>>>>>>>>>>>>>>>>>>>> program
>>>>>>>>>>>>>>>>>>>> started,
>>>>>>>>>>>>>>>>>>>> bring more CPUs back online. Since VM started on a
>>>>>>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>>>>>>> CPU,
>>>>>>>>>>>>>>>>>>>> os::is_MP() will return false, but after more CPUs
>>>>>>>>>>>>>>>>>>>> available, OS
>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>> schedule the app run on multiple CPUs, this caused
>>>>>>>>>>>>>>>>>>>> SEGV in
>>>>>>>>>>>>>>>>>>>> various
>>>>>>>>>>>>>>>>>>>> places where data consistency was broken. The
>>>>>>>>>>>>>>>>>>>> solution is
>>>>>>>>>>>>>>>>>>>> supply a
>>>>>>>>>>>>>>>>>>>> flag to assume it is running on MP, so lock is
>>>>>>>>>>>>>>>>>>>> forced to be
>>>>>>>>>>>>>>>>>>>> called.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/2178143/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>>
>
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic