
List:       jakarta-commons-dev
Subject:    Re: [commons-lang3] potential bug in CharSequenceUtils?
From:       Xeno Amess <xenoamess () gmail ! com>
Date:       2020-04-29 13:05:26
Message-ID: CAFF4x5LexkmNqxOdSPVcSTMmifznheamJ+OsSCrUYYvdX=1WiA () mail ! gmail ! com


Yes, it really is a bug.
I created a PR with a fix (and tests) at
https://github.com/apache/commons-lang/pull/529
Please review it when you have time.
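
To make the difference concrete, here is a minimal sketch that puts the two per-character checks from the quoted message below side by side. The class and method names are made up for illustration, and the character pair (U+0130 and U+0131, the Turkish dotted capital I and dotless small i) is just one pair I would expect to expose the difference. It is not necessarily the case covered by the tests in the PR.

```java
// Hypothetical helper class, for illustration only; not part of commons-lang.
final class RegionMatchCaseCheck {

    // Mirrors the check quoted from CharSequenceUtils:
    // both conversions are applied to the original characters c1 and c2.
    static boolean matchesLangStyle(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        return Character.toUpperCase(c1) == Character.toUpperCase(c2)
                || Character.toLowerCase(c1) == Character.toLowerCase(c2);
    }

    // Mirrors the quoted JDK 8 String.regionMatches logic:
    // toLowerCase is applied to the already upper-cased u1 and u2.
    static boolean matchesJdkStyle(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return true;
        }
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    public static void main(String[] args) {
        char dottedCapitalI = '\u0130'; // upper-cases to itself, lower-cases to 'i'
        char dotlessSmallI = '\u0131';  // upper-cases to 'I', lower-cases to itself

        System.out.println("CharSequenceUtils-style check: "
                + matchesLangStyle(dottedCapitalI, dotlessSmallI));
        System.out.println("JDK-style check:               "
                + matchesJdkStyle(dottedCapitalI, dotlessSmallI));
    }
}
```

For this pair the upper-case forms differ (U+0130 vs 'I') and the lower-case forms of the originals differ ('i' vs U+0131), so the first check reports a mismatch. The lower-case forms of the upper-cased characters are both 'i', so the second check reports a match.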


On Wed, Apr 29, 2020 at 5:04 AM, Xeno Amess <xenoamess@gmail.com> wrote:

> Well, while looking at StringUtil I found something like this:
>
> final char c1 = cs.charAt(index1++);
> final char c2 = substring.charAt(index2++);
>
> if (c1 == c2) {
>     continue;
> }
>
> if (!ignoreCase) {
>     return false;
> }
>
> // The same check as in String.regionMatches():
> if (Character.toUpperCase(c1) != Character.toUpperCase(c2)
>         && Character.toLowerCase(c1) != Character.toLowerCase(c2)) {
>     return false;
> }
>
> But it is actually not quite the same as what String.regionMatches does.
> The corresponding code in String.regionMatches in JDK 8 is actually:
>
> char c1 = ta[to++];
> char c2 = pa[po++];
> if (c1 == c2) {
>     continue;
> }
> if (ignoreCase) {
>     // If characters don't match but case may be ignored,
>     // try converting both characters to uppercase.
>     // If the results match, then the comparison scan should
>     // continue.
>     char u1 = Character.toUpperCase(c1);
>     char u2 = Character.toUpperCase(c2);
>     if (u1 == u2) {
>         continue;
>     }
>     // Unfortunately, conversion to uppercase does not work properly
>     // for the Georgian alphabet, which has strange rules about case
>     // conversion.  So we need to make one last check before
>     // exiting.
>     if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
>         continue;
>     }
> }
>
> Note that the JDK calls Character.toLowerCase on u1 and u2 (the
> upper-cased characters), whereas CharSequenceUtils calls it on the
> original c1 and c2.
> If the two were functionally equivalent, why would the Oracle developers
> have introduced the variables u1 and u2 at all? They would serve no purpose.
> So I think this might be a bug.
> I know nothing about Georgian myself, though.
> Is anybody familiar with the Georgian alphabet and willing to debug this
> further?
>
>
>

