'Re: [ast-users] NaN and their payloads'


List:       ast-users
Subject:    Re: [ast-users] NaN and their payloads
From:       Roland Mainz <roland.mainz () nrubsig ! org>
Date:       2013-11-18 0:13:57
Message-ID: CAKAoaQ=9gSQebaK5Q8cn6HaRc3O7ML6ajXTxi6ryTvHpFpwrAw () mail ! gmail ! com

On Mon, Sep 23, 2013 at 8:00 AM, Roland Mainz <roland.mainz@nrubsig.org> wrote:
> On Thu, Sep 19, 2013 at 11:52 PM, Roland Mainz <roland.mainz@nrubsig.org> wrote:
>> On Thu, Sep 19, 2013 at 6:41 AM, Roland Mainz <roland.mainz@nrubsig.org> wrote:
>>> On Thu, Sep 5, 2013 at 5:45 AM, Roland Mainz <roland.mainz@nrubsig.org> wrote:
>>>> On Tue, Sep 3, 2013 at 2:46 PM, Glenn Fowler <gsf@research.att.com> wrote:
>>>>> On Tue, 3 Sep 2013 02:50:31 +0200 Tina Harriott wrote:
>>>>>> I'm currently looking into a complex mathematical simulation (fluid
>>>>>> dynamics) and found references to NaN (Not A Number) with payloads.
>>>>>> The NaN values are used to represent 'no data' in the datasets and the
>>>>>> payloads are used for context information. The code has almost 2
>>>>>> million lines of code and removing the feature would be a major
>>>>>> effort.
> [snip]
>>>> Attached (as "nan_payload_test1_20130904_001.c.txt") is some prototype
>>>> code which shows how to handle NaN payloads...
>>>>
>>>> * Notes:
>>>> - The code explicitly requires ISO C99 semantics
>>>> - The code relies on |__int128_t| to handle |long double|. Originally
>>>> I thought about handling the code using byte-by-byte masking... but it
>>>> turned out that this creates a lot of mess and is endian-specific. The
>>>> final code may have to resort to that to be portable but for now I
>>>> use |__int128_t| to keep the code readable
>>>
>>> Uhm... any feedback/suggestions ? Adding NaN payload support seems to
>>> be easy and doesn't come with a measurable performance penalty (since
>>> we only do extra work *if* there is a payload, otherwise most code has
>>> only "wrapper" overhead) ... if there are no objections I can try my
>>> luck and add this to ksh93's arithmetic engine...
>>
>> Attached (as "astksh20130913_nanpayload_prototype001.diff.txt") is a
>> _prototype_ for SuSE Linux 12.3/AMD64/64bit. 32bit will not work
>> because it lacks the |__int128_t| type... see below...
>>
>> Example usage:
>> -- snip --
>> $ ksh -c 'typeset -X x ; typeset -sX y ; float z ; ((
>> x=ast_nan(0xbeef) )) ; ((y=x, z=y)); print -- $((z))'
>> ast_nan(0xbeef)
>> -- snip --
>>
>> * Notes:
>> - This is a prototype
>> - The code was only tested on SuSE 12.3/AMD64/64bit with gcc4.7.
>> - Surprisingly the performance overhead is barely measurable...
>> basically we have a few extra |isnan()| calls or macros and that's it
>>
>> * ToDo:
>> - Put the nan payload functions into their own source file
>> - ABI issue: |nan("")| on some platforms only sets the highest bit in
>> the payload instead of 0xFFFFFFFF (i.e. all bits set). How should we
>> deal with it ?
>> - Keep the nan payload function names (*binary@(32|64|80|128)* but add
>> platform-specific macros to map them properly to SFDOUBLE, |long
>> double|, |double| and |float|). The separation is important to keep
>> the #ifdef/#else/#endif hackery to a minimum and in one place
>> - How should we deal with the usage of |__int128_t| ? It's only
>> available on 64bit platforms with "clang" and newer "gcc" versions...
>> but writing a "portable" version which does the operations on a
>> byte/word/longword level makes the code depend on the endian-ness
>> of the platform. Main issue here is testing... my SPARC machine is
>> broken and I need some time to find money to fix it.
>
> Attached (as "astksh20130913_nanpayload_prototype005.diff.txt") is an
> updated patch for ast-ksh.2013-09-13 and
> "nan_payload_test1_20130922_001.c.txt" is standalone testcode for
> debugging purposes...
>
> * Notes:
> - This is a PROTOTYPE, the code is NOT ready yet. There is a lot of
> "spaghetti" which is caused by the issue that the master sources for
> the nan payload code live outside the AST tree for now... and there
> are unresolved issues (see below)
>
> - The API is *layered* intentionally: First layer uses
> |flt_binary@(32|64|80|128)|, 2nd layer uses ISO C names (|float|,
> |double|, |long double|). This is done to separate logical from
> physical floating-point layouts... for example on some platforms |long
> double| is the same type as |double| and some platforms only know
> about |double| etc. etc. The other area of pain is x86/AMD64 with
> 80bit |long double| vs. other platforms with 128bit |long double|
>
> - we need to write an iffe probe for the nan payload stuff, which
> provides constants like number of bits, masks etc. and verifies that
> the |float|, |double| and |long double| behave *EXACTLY* as defined in
> IEEE754-2008. Some old implementations and software emulations take
> shortcuts... or have "magic bits" which can cause trouble (i387's
> IEEE754-1985-draft implementation with 80bit is such a special case).
> We can only support payloads if we have a
> fully-IEEE754-2008-conforming floating-point implementation (99% of
> the modern platforms should fall into this category)
>
> - x86/AMD64 i387 80bit floats are fully functional and have extra code
> to support them even if |__int128_t| is not supported
>
> - 128bit floats should be working now but require at least some testing
>
> - SPARC needs to be tested
>
> - We need per-floating-point datatype payload MIN/MAX constants
>
> - On some OSes like Solaris |nanf("")|/|nan("")|/|nanl("")| return a
> payload of 0xFFFFFFFFFF... (=all payload bits set) while the |NAN|
> constant returns a nan value with all payload bits not set
>
> - AST |strtold()| on Solaris returns a value like
> |nanf("")|/|nan("")|/|nanl("")|, which causes trouble. Replacing it
> with |ast_nan()| causes sign reversal for (yet) unknown reasons (but
> the same happens with |sin(inf)| ... which may be a hint that
> something else is wrong in libast vs. nan sign handling) ... needs
> more debugging...
>
> - Glenn and I need to agree on a naming scheme for NaNs with payloads
> (gsf: We need to chat... issue is that not all naming schemes assume
> that the payload is an unsigned int). Currently we use
> "ast_nan(hexval)" in ksh93 and |strtof()|/|strtod()|/|strtold()|
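
For readers without the attachments, here is a minimal sketch of what
payload get/set could look like at the binary64 layer. It is NOT the
code from the patch. It assumes that |double| is IEEE754 binary64, that
integers and floats share the same byte order, and that the top
fraction bit is the "quiet" bit (as IEEE754-2008 recommends). The names
|b64_make_nan()| and |b64_get_payload()| are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* binary64 layout: 1 sign bit, 11 exponent bits, 52 fraction bits.
 * Assumption: the top fraction bit is the quiet bit, which leaves
 * the low 51 fraction bits for the payload. */
#define B64_EXP_MASK     UINT64_C(0x7ff0000000000000)
#define B64_QUIET_BIT    UINT64_C(0x0008000000000000)
#define B64_PAYLOAD_MASK UINT64_C(0x0007ffffffffffff)

/* Hypothetical helper: build a quiet NaN carrying |payload|.
 * Payload bits that do not fit into 51 bits are dropped. */
static double b64_make_nan(uint64_t payload)
{
    uint64_t bits = B64_EXP_MASK | B64_QUIET_BIT |
                    (payload & B64_PAYLOAD_MASK);
    double d;

    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&d, &bits, sizeof d);   /* type-pun without aliasing issues */
    return d;
}

/* Hypothetical helper: return the payload of a NaN,
 * or 0 if |d| is not a NaN at all. */
static uint64_t b64_get_payload(double d)
{
    uint64_t bits;

    if (!isnan(d))
        return 0;
    memcpy(&bits, &d, sizeof bits);
    return bits & B64_PAYLOAD_MASK;
}
```

Running |b64_get_payload()| on |nan("")| and on |NAN| is a quick way to
see which of the two conventions mentioned above a platform follows.
The 80bit and 128bit formats need wider integers or word-wise masking,
which is where the |__int128_t| and endian-ness questions come in.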

Glenn: AFAIK I solved all technical issues (SPARC included) ... which
leaves the naming scheme as the only open question.
For now I would prefer to stick with |ast_nan(hexval)| and later
provide an env variable to switch between naming schemes (but always
accept |ast_nan(hexval)| as input).
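
To make the proposed notation concrete, here is a rough sketch of
converting between a payload and the |ast_nan(hexval)| text form. It
only illustrates the naming scheme and is not the libast
|strtod()|/|sfprintf()| code. The function names |nanname_format()| and
|nanname_parse()| are hypothetical, and treating the payload as an
unsigned 64bit integer is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write "ast_nan(0x<hex>)" into |buf|.
 * Returns what snprintf() returns. */
static int nanname_format(char *buf, size_t size, uint64_t payload)
{
    assert(buf != NULL);
    return snprintf(buf, size, "ast_nan(0x%" PRIx64 ")", payload);
}

/* Hypothetical helper: parse "ast_nan(<hexval>)" with an optional
 * "0x" prefix. Returns 0 on success and -1 on malformed input. */
static int nanname_parse(const char *s, uint64_t *payload)
{
    static const char prefix[] = "ast_nan(";
    const size_t plen = sizeof prefix - 1;
    unsigned long long v;
    char *end;

    assert(s != NULL && payload != NULL);
    if (strncmp(s, prefix, plen) != 0)
        return -1;
    s += plen;
    /* strtoull() would accept blanks and a sign, so reject them here */
    if (!isxdigit((unsigned char)*s))
        return -1;
    errno = 0;
    v = strtoull(s, &end, 16);
    if (errno == ERANGE || end[0] != ')' || end[1] != '\0')
        return -1;
    *payload = (uint64_t)v;
    return 0;
}
```

A switchable naming scheme would only need to replace the formatting
side, the parser could keep accepting |ast_nan(hexval)| in all modes.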

What do you think ?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
ast-users mailing list
ast-users@lists.research.att.com
http://lists.research.att.com/mailman/listinfo/ast-users