List: john-dev
Subject: [john-dev] Latin-1 to UTF-16 conversion (was Lei's GSoC progress)
From: Lei Zhang <zhanglei.april () gmail ! com>
Date: 2015-07-29 3:35:22
Message-ID: FDA6BA75-697E-4D59-9F58-17A6DB844713 () gmail ! com
> On Jul 27, 2015, at 5:01 PM, magnum <john.magnum@hushmail.com> wrote:
>
> On 2015-07-27 03:15, Lei Zhang wrote:
> >
> > 1. The input key is appropriately padded in set_key() for the SIMD
> > SHA function, and key length is also determined in the process. What
> > do I do if the key is UTF-16-encoded? The non-SIMD episerver format uses
> > enc_to_utf16() to convert the key and get its length. But each key is
> > not contiguously stored for the SIMD SHA function, thus
> > enc_to_utf16() won't be applicable.
>
> So episerver is sha256($s.utf16($p)) or sha1($s.utf16($p)). The MSSQL formats are
> similar but append the salt instead of prepending it (actually that's trickier to
> optimize since we can't keep the salt at a fixed position).
> For fast formats like this, flat enc_to_utf16() is far too slow. You should convert
> right into the SIMD buffer like in MSSQL05's set_key.
> Then you would just store the (bit-)length in the Merkle-Damgård buffer and be done
> with it. You'd read it back in get_key when needed.
> You don't need it for anything else: For best performance, you should write the
> salt right into the SIMD buffer in set_salt() (repeated for all of the vector width of
> course). The set_key and get_key functions will know there's a fixed salt length of
> 16 (octets) so can just start writing/reading after it, and write (read) the bit
> length with these extra 16 in mind. Then they'd write the Merkle-Damgård bit length
> field as 8 * (16 + keylen) with keylen counted in octets...
> After all this, crypt_all() is simply a matter of calling the SHA256 (or SHA1)
> function - the buffer is ready to use.
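
The layout described above can be sketched for a single, flat 64-byte block, leaving out the SIMD interleaving. This is only an illustration under stated assumptions: the helper names (sketch_set_salt, sketch_set_key, sketch_get_key) and constants are hypothetical and are not taken from the John the Ripper sources, and the key is limited so that salt, key, padding and length fit in one block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64  /* one SHA-1 / SHA-256 input block */
#define SALT_SIZE  16  /* fixed salt length in octets, as stated in the text */
#define MAX_CHARS  19  /* 16 + 2*19 + 1 (0x80) + 8 (length) = 63 <= 64 */

/* Hypothetical helper: the salt sits at a fixed position at the block start. */
static void sketch_set_salt(uint8_t block[BLOCK_SIZE], const uint8_t salt[SALT_SIZE])
{
    memcpy(block, salt, SALT_SIZE);
}

/* Hypothetical helper: widen a Latin-1 key to UTF-16LE right after the salt,
 * append the 0x80 padding byte, and store the big-endian bit length. */
static void sketch_set_key(uint8_t block[BLOCK_SIZE], const char *key)
{
    size_t chars = strlen(key);
    size_t pos = SALT_SIZE;
    uint64_t bits;
    size_t i;

    assert(chars <= MAX_CHARS);
    for (i = 0; i < chars; i++) {
        block[pos++] = (uint8_t)key[i]; /* low byte: the Latin-1 code point */
        block[pos++] = 0;               /* high byte is always zero */
    }
    block[pos++] = 0x80;
    memset(block + pos, 0, BLOCK_SIZE - pos); /* clear leftovers of a longer key */

    bits = 8 * (uint64_t)(SALT_SIZE + 2 * chars); /* 8 * (16 + keylen in octets) */
    for (i = 0; i < 8; i++)
        block[BLOCK_SIZE - 1 - i] = (uint8_t)(bits >> (8 * i));
}

/* Hypothetical helper: recover the key from the stored bit length alone. */
static size_t sketch_get_key(const uint8_t block[BLOCK_SIZE], char *out)
{
    uint64_t bits = 0;
    size_t chars, i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | block[BLOCK_SIZE - 8 + i];
    chars = (size_t)(bits / 8 - SALT_SIZE) / 2;
    for (i = 0; i < chars; i++)
        out[i] = (char)block[SALT_SIZE + 2 * i];
    out[chars] = '\0';
    return chars;
}
```

With this layout no separate length array is needed: the length field in the block is the only record of the key length, and the hash function can be run on the block as-is. A real SIMD format would write the same bytes through its interleaving macro instead of a flat index.
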
I looked at set_key() in mssql05 and nt2, both of which convert Latin-1 to UTF-16 directly into the SIMD key buffer. There are still some details I don't understand.
1. mssql05 uses SHA1 and nt2 uses MD4. Both use the same padding scheme, except for the endianness of the length field at the tail of the block. Yet their conversion code differs,
e.g. in mssql05's set_key():
    *keybuf_word = JOHNSWAP((temp << 16) | temp2);
and in nt2:
    temp2 |= (temp << 16);
    *keybuf_word = temp2;
Why is there no endianness swapping in nt2?
2. In mssql05's set_key():
    unsigned int *keybuf_word = (unsigned int*)&saved_key[GETPOS(3, index)];
What is the purpose of the number 3 here? The salt is appended to the message in mssql05, so this cannot be reserving space for the salt, and the salt size is not 3 anyway.
BTW, there are many hardcoded values in the SIMD buffer handling code, which makes it hard for a newcomer to follow.
3. I see that the values returned by get_salt() and get_binary() are sometimes endianness-swapped in a SIMD build and sometimes not. What is the reason for this?
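
To make question 1 concrete, here is a small sketch of the two word values involved. It relies on two facts from the hash specifications: MD4 interprets each 4-byte group of the message as a little-endian 32-bit word, while SHA-1 interprets it as big-endian. The function names are hypothetical, and treating JOHNSWAP as a plain 32-bit byte swap is an assumption on my part; whether this fully explains the difference between the two formats is not confirmed here.

```c
#include <stdint.h>

/* Two Latin-1 characters c0, c1 become the UTF-16LE byte stream
 * c0 00 c1 00. MD4 reads those four bytes as a little-endian word. */
static uint32_t sketch_word_md4(uint8_t c0, uint8_t c1)
{
    return (uint32_t)c0 | ((uint32_t)c1 << 16);
}

/* Plain 32-bit byte swap (assumed to match what JOHNSWAP does). */
static uint32_t sketch_swap32(uint32_t x)
{
    return (x << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

/* SHA-1 reads the same four bytes as a big-endian word, which is
 * the byte-swapped form of the MD4 word. */
static uint32_t sketch_word_sha1(uint8_t c0, uint8_t c1)
{
    return sketch_swap32(sketch_word_md4(c0, c1));
}
```

For the characters 'a' and 'b' the byte stream is 61 00 62 00, so the MD4 word is 0x00620061 and the SHA-1 word is 0x61006200. If the SIMD hash code consumes words as native integers, that would be consistent with nt2 storing the shifted-and-ORed value directly and mssql05 swapping it first.
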
Thanks,
Lei