[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR(xxxs) (jdk10): 8168542: os::realloc should return a valid pointer for input size=0
From:       Thomas_Stüfe <thomas.stuefe () gmail ! com>
Date:       2017-02-28 16:03:56
Message-ID: CAA-vtUxiAFC_N-ay1+i723sMbMT2mFy0kvsnx2YdV0Q-oyzXqA () mail ! gmail ! com
[Download RAW message or body]

Hi Dmitry,

On Tue, Feb 28, 2017 at 4:47 PM, Dmitry Samersoff <
dmitry.samersoff@oracle.com> wrote:

> Thomas,
>
> I agree that malloc behavior should be exactly the same as realloc one.
> So if we can't assert size==0 for both, lets go with the first patch.
>
>
Thank you.


> But IMHO, zero-size allocation is not a good pattern and we should fix
> it if possible. Of course, not within scope of this fix.
>
> I'm not sure wether NMT can help us to track zero allocations. After the
> fix NMT will record 1 byte allocation instead.
>
>
Sure. Some ideas:

- add another MEMFLAG enum value like "size_0_allocation" and replace the
caller's memflag enum value with this one (a bit ugly but easy to do).

- a bit more elaborate: with size==0, do not allocate at all but return a
known sentinel pointer value, e.g. "0xABCDABCD". On os::free(), recognize
this sentinel value as a former size-0 allocation and just ignore it. NMT
gets the sentinel handed down as address and should recognize it too, just
counting size-0 allocations or printing the callstacks or keeping an extra
list of these callstacks... Added bonus: if the sentinel value points to
memory which is guaranteed to be invalid, we crash immediately if the
caller does access the memory (so, if his notion of the size is different
from 0, a real error).

free(NULL) is a different story and I don't think it's a problem.
>
>
Pretty sure I already saw free(NULL) in the wild which were the result of
an error - pointer freed was from a sparsely populated pointer array which
was accessed via a wrong index. But I think free(NULL) is used very often
in innocent situations, so I agree. I would not assert here.

Kind regards, Thomas


> -Dmitry
>
> On 2017-02-27 14:20, Thomas Stüfe wrote:
> > Hi all,
> >
> > thanks for all your input, and sorry for the delay!
> >
> > So, realloc. To summarize our discussion:
> >
> > There seems to be a slight preference toward treating allocations with
> > size==0 as an error which should yield an assert. Both Chris and me see
> the
> > size=1 approach as a pragmatic way to deal with this situation. But I can
> > see both sides, so as a test I implemented another approach:
> >
> > http://cr.openjdk.java.net/~stuefe/webrevs/8168542-os_
> > realloc_size_0/jdk10-webrev.02-alternate-version/webrev/
> >
> > Now this is a very trivial change. I assert on size=0 for both malloc and
> > realloc. I decided to be more tolerant on the release build and return a
> > 1byte sized block (yes, it is actually more) to the caller. Returning
> NULL
> > is a bad idea, because no-one checks the errno, everyone just assumes
> OOM.
> >
> > I built slowdebug and ran it through hotspot jtreg tests on my Linux box.
> > It yielded ~80 asserts (for malloc(size=0), no assert for
> realloc(size=0))
> >
> > They boil down to ~10 issues, all of which a variation of the same theme.
> > Coding wants to do something with a collection of n items - allocates an
> > array of size n for temporary data, iterates over the items, then frees
> the
> > array again. If n=0, code degenerates to one malloc, followed by free.
> Note
> > that the fact that n is 0 is often a result of the test itself testing
> > corner cases.
> >
> > I honestly do not consider these occurrences errors. Yes, the
> > malloc()/free() could have been skipped, but on the other hand the
> > programmers may have relied knowingly on the established behavior of
> > os::malloc() returning a freeable pointer for size=0. I attempted to
> "fix"
> > some of the occurrences by skipping the malloc() if n was zero, but in
> all
> > cases the coding ended up less readable than before. (Sorry, I cannot
> post
> > this fix attempt, I accidentally deleted my patch).
> >
> > Here are some examples of asserts:
> >
> > --
> >
> > 1) 33 V  [libjvm.so+0xb17625]  G1VerifyYoungCSetIndicesClosure::
> > G1VerifyYoungCSetIndicesClosure(unsigned long)+0xbb
> >  34 V  [libjvm.so+0xb16588]  G1CollectionSet::verify_young_
> cset_indices()
> > const+0x1a8
> >
> > Happens in a lot of tests. G1CollectionSet::verify_young_cset_indices()
> is
> > called when G1CollectionSet::_collection_set_cur_length is zero. I
> assume
> > this is fine, verification could just be skipped at this point, so the
> fix
> > would be trivial.
> >
> > --
> >
> > gc/arguments/TestG1ConcRefinementThreads: VM is called with
> > -XX:G1ConcRefinementThreads=0. Then, we assert here:
> >
> >  32 V  [libjvm.so+0x5adeee]  AllocateHeap(unsigned long, MemoryType,
> > NativeCallStack const&, AllocFa-XX:G1ConcRefinementThreads=
> > 0ilStrategy::AllocFailEnum)+0x3b
> >  33 V  [libjvm.so+0x954689]  ConcurrentG1Refine::create(
> CardTableEntryClosure*,
> > int*)+0x197
> >  34 V  [libjvm.so+0xaf596a]  G1CollectedHeap::initialize()+0x1a4
> >  35 V  [libjvm.so+0x127e940]  Universe::initialize_heap()+0x62
> >
> > Not sure at which layer one would handle this. If
> > -XX:G1ConcRefinementThreads=0 makes no sense as an option, should this
> > setting not just be forbidden?
> >
> > --
> > At a number of tests linux numa initialization failed:
> >
> >  32 V  [libjvm.so+0x5adeee]  AllocateHeap(unsigned long, MemoryType,
> > NativeCallStack const&, AllocFailStrategy::AllocFailEnum)+0x3b
> >  33 V  [libjvm.so+0xbf7aff]  GenericGrowableArray::raw_
> allocate(int)+0x12f
> >
> >
> >  34 V  [libjvm.so+0x6b6b89]  GrowableArray<int>::GrowableArray(int,
> bool,
> > MemoryType)+0x65
> >  35 V  [libjvm.so+0x105b294]  os::Linux::libnuma_init()+0x17e
> >
> > Here, a GrowableArray is initialized with size = 0. This seems to be an
> > allowed pattern, so GrowableArray would have to be fixed to delay the
> > malloc until the first real allocation.
> >
> > --
> >
> > So, options:
> >
> > - we go this way - all the occurrences would have to be fixed, and
> > programmers would have to be made aware of the changed behavior of
> > os::malloc(size=0).
> > - we could only assert for realloc(size=0), but I find this even more
> > inconsistent than it is now.
> > - we could rethink the original decision.
> >
> > More things to ponder:
> >
> > - what about free(NULL), do we want to assert here too? On one hand, this
> > may indicate an error, possibly a double free(). On the other hand,
> passing
> > NULL to free() is even more a known established pattern than passing
> size=0
> > to malloc, and programmers may just do this expecting os::free() to have
> > the same semantics as ::free().
> >
> > - this patch relies on the fact that we have control about who calls
> > os::malloc/os::free (similar to the decision to use malloc headers in
> both
> > NMT and os.cpp). Should we ever open up our malloc/free calls to the
> > outside world, we may loose this control and would have to make sure that
> > os::malloc/os::free behave like ::malloc/::free. Why would we open
> > os::malloc to the outside? For instance, to get NMT malloc coverage over
> > the JDK native libraries. Granted, a lot more would have to change to
> > enable this. (Note that we have a malloc statistic in place in our fork
> > which spans hotspot + native jdk, and we do this by calling os::malloc
> from
> > the jdk native code).
> >
> >
> > Kind Regards, Thomas
> >
> >
> >
> >
> > On Wed, Feb 8, 2017 at 10:01 AM, Robbin Ehn <robbin.ehn@oracle.com>
> wrote:
> >
> >> Hi,
> >>
> >> On 02/07/2017 02:24 AM, Chris Plummer wrote:
> >>
> >>> I don't think we can push a change to assert when size == 0 without
> first
> >>> thoroughly testing it and fixing any places that currently do that.
> Testing
> >>> may turn up nothing,
> >>> or it may turn up a rat's nest. Just something to think about before
> >>> starting down this path.
> >>>
> >>
> >> Since there is a while since jdk10 will be GA, now is the best time.
> >>
> >>
> >>> I'm not as offended by the size 1 approach as others, and it is what we
> >>> already do for malloc(). It probably  makes sense to have malloc and
> >>> realloc be consistent in this
> >>> regard, regardless of what the native version do by default (which can
> >>> vary). Just given the fact that we are even having this discussion and
> >>> debating which is best tends
> >>> to make me think that caller's of malloc() and realloc() either didn't
> >>> even give this topic any thought, or possibly came to differing
> conclusions
> >>> on whether or not it
> >>> mattered if they passed in a size == 0.
> >>>
> >>
> >> I agree that realloc and malloc should do the same.
> >>
> >> IEEE Std 1003.1, 2004 Edition says:
> >> "If size is 0 and ptr is not a null pointer, the object pointed to is
> >> freed."
> >>
> >> So as I said in previously mail, there is nothing wrong with doing
> >> realloc(ptr, 0), but I also don't think we should do it hotspot.
> >> Doing malloc(0) is undefined, so this really should be at least an
> assert.
> >>
> >> I'm all favor assert for both realloc and malloc.
> >>
> >> I can do pre-testing, KS, jtreg, tonga or whatever we think is
> appropriate.
> >> (if this is a rat's nest we can change our minds...)
> >>
> >> /Robbin
> >>
> >>
> >>
> >>
> >>> Chris
> >>>
> >>> On 2/6/17 3:49 AM, Dmitry Samersoff wrote:
> >>>
> >>>> Thomas,
> >>>>
> >>>> I share all David's concerns.
> >>>>
> >>>> Both Linux and BSD return NULL if size for realloc is zero.
> >>>>
> >>>> IMHO, if we call realloc with size == 0 it means a program error on
> >>>> higher level and we should fix it rather than hide.
> >>>>
> >>>> So it might be better to assert size > 0 inside realloc and return
> NULL.
> >>>>
> >>>> -Dmitry
> >>>>
> >>>>
> >>>> On 2017-02-05 13:56, Thomas Stüfe wrote:
> >>>>
> >>>>> Hi David,
> >>>>>
> >>>>> some context: the whole issue came into being because for a long
> time we
> >>>>> had exchanged os::malloc/os::realloc with our own versions for malloc
> >>>>> tracking purposes. This was way before NMT.
> >>>>>
> >>>>> When writing the os::malloc/realloc replacements and associated
> tests,
> >>>>> we
> >>>>> took care to mimick the behavior of native malloc/realloc in a
> >>>>> consistent
> >>>>> matter. That way we could exchange malloc calls with my os::malloc
> on a
> >>>>> case by case base without behavior change (or put it differently,
> >>>>> nothing
> >>>>> would break if I accidentally miss an instrumentation in code and
> leave
> >>>>> it
> >>>>> a native malloc/realloc)
> >>>>>
> >>>>> Now we went back to the OpenJDK version of
> os::malloc()/os::realloc() to
> >>>>> reduce maintenance, and my cornercase tests fail. Hence this bug
> report.
> >>>>>
> >>>>> Please find further answers inline.
> >>>>>
> >>>>> On Sun, Feb 5, 2017 at 3:50 AM, David Holmes <
> david.holmes@oracle.com>
> >>>>> wrote:
> >>>>>
> >>>>> Hi Thomas,
> >>>>>>
> >>>>>> On 4/02/2017 10:09 PM, Thomas Stüfe wrote:
> >>>>>>
> >>>>>> Hi guys,
> >>>>>>>
> >>>>>>> picking up this trivial issue which got pushed to jdk10, could I
> >>>>>>> please
> >>>>>>> have a re-review? Oh, also a sponsor.
> >>>>>>>
> >>>>>>> Issue:
> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8168542
> >>>>>>>
> >>>>>>> webrev:
> >>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8168542-os_reallo
> >>>>>>> c_size_0/jdk10-webrev.01/webrev/
> >>>>>>>
> >>>>>>> For reference, the old discussion back in october 16:
> >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2
> >>>>>>> 016-October/021675.html
> >>>>>>>
> >>>>>>> I am still rather confused by both existing code and proposed
> change
> >>>>>> - the
> >>>>>> code just seems wrong if passed in a zero size. Passing in zero is a
> >>>>>> bug
> >>>>>> AFAICS and if buggy code passes in zero then how do we know what it
> >>>>>> will do
> >>>>>> with the returned value from os::realloc ?
> >>>>>>
> >>>>>
> >>>>>
> >>>>> Currently, in the OpenJDK version, we have the following behavior:
> >>>>>
> >>>>> 1 ASSERT
> >>>>>   a os::malloc
> >>>>> size==0: return valid pointer to 1 byte array
> >>>>>   b os::realloc
> >>>>> ptr != NULL && size==0: return NULL
> >>>>>
> >>>>> 2 !ASSERT
> >>>>>   a      os::malloc
> >>>>> size==0: return valid pointer to 1 byte array
> >>>>>   b      os::realloc
> >>>>> ptr != NULL && size==0: whatever the native realloc does, which may
> be
> >>>>> returning NULL or returning a valid allocation.
> >>>>>
> >>>>> This is inconsistent.
> >>>>>
> >>>>> A) 1a and 1b, although arguably similar cases, behave differently.
> >>>>> B) 1b and 2b may behave differently, depending on the behaviour of
> the
> >>>>> native CRT calls. Which is IMHO worse than case (A). You do not want
> >>>>> different behaviour between ASSERT and !ASSERT.
> >>>>>
> >>>>> Additionally, for (B) it is not good that we delegate cornercase
> >>>>> behavior
> >>>>> to the native CRT, because that may lead to os::realloc() to have
> >>>>> different
> >>>>> behavior between platforms. On one platform, ::realloc(ptr,0) may
> return
> >>>>> NULL, on another it may return a pointer. So you get platform
> dependent
> >>>>> errors.
> >>>>>
> >>>>> My patch proposes to change the behavior like this:
> >>>>>
> >>>>> 1 ASSERT
> >>>>>   a os::malloc
> >>>>> size==0: return valid pointer to 1 byte array
> >>>>>   b os::realloc
> >>>>> ptr != NULL && size==0: return valid pointer to 1 byte array
> >>>>>
> >>>>> 2 !ASSERT
> >>>>>   a      os::malloc
> >>>>> size==0: return valid pointer to 1 byte array
> >>>>>   b      os::realloc
> >>>>> ptr != NULL && size==0: return valid pointer to 1 byte array
> >>>>>
> >>>>> This is more consistent.
> >>>>>
> >>>>> -----------
> >>>>>
> >>>>> Aside from the consistency issue:
> >>>>>
> >>>>> I think if someone passes 0 as resize size, this may mean he
> calculated
> >>>>> the
> >>>>> resize size wrong. Note that this does not necessarily mean a fatal
> >>>>> error,
> >>>>> if he has the correct notion of buffer size - 0 - and does not
> overwrite
> >>>>> anything, nothing bad will happen.
> >>>>>
> >>>>> In this situation you have three choices
> >>>>>
> >>>>> 1) return NULL. Caller will either incorrectly access the pointer and
> >>>>> crash, or test the pointer and all cases I saw then assume OOM. Also
> >>>>> bad.
> >>>>>
> >>>>> 2) assert. This just seems plain wrong. Inconsistent with CRT
> behavior,
> >>>>> and
> >>>>> what do you do in release code?
> >>>>>
> >>>>> 3) return a 1 byte allocation. That is what I propose. This may in
> the
> >>>>> worst case lead to memory leaks; but only if caller expected the
> >>>>> behaviour
> >>>>> of os::realloc(ptr, 0) to be "free ptr and return NULL", which I have
> >>>>> seen
> >>>>> nowhere. If caller is oblivious to giving us size 0, he will be
> freeing
> >>>>> the
> >>>>> memory later.
> >>>>>
> >>>>> (Note that you could easily prevent memory leaks by just returning a
> >>>>> special global static sentinel pointer for size==0, but you have to
> be
> >>>>> sure
> >>>>> that all os::realloc calls are matched by os::free calls.)
> >>>>>
> >>>>> I see your point about treating size==0 as an error. But I am not a
> big
> >>>>> fan, for the mentioned reasons, and would prefer os::realloc() to
> behave
> >>>>> like native realloc.
> >>>>>
> >>>>> But if we follow your suggestion and make size==0 an assertable
> >>>>> offense, we
> >>>>> should do so too for malloc(0).
> >>>>>
> >>>>> Kind Regards, Thomas
> >>>>>
> >>>>>
> >>>>>
> >>>>> David
> >>>>>
> >>>>>>
> >>>>>> The webrev is unchanged from the proposal for jdk9, just rebased to
> >>>>>>
> >>>>>>> jdk10/hs.
> >>>>>>>
> >>>>>>> @Chris, I decided to not follow your suggestion to move the
> comparison
> >>>>>>> into the #ifndef assert. You are right, this is redundant with the
> >>>>>>> check
> >>>>>>> inside os::malloc(), but as you said, checking at the entry of
> >>>>>>> os::realloc() makes it clearer.
> >>>>>>>
> >>>>>>> Thank you!
> >>>>>>>
> >>>>>>> Kind Regards, Thomas
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>
> >>>
>
>
> --
> Dmitry Samersoff
> Oracle Java development team, Saint Petersburg, Russia
> * I would love to change the world, but they won't give me the sources.
>
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic