
List:       wine-devel
Subject:    Re: Unicode normalization for Wine
From:       Aric Stewart <aric () codeweavers ! com>
Date:       2017-07-26 12:38:01
Message-ID: 3e15f109-2c99-d032-5452-d087fec46bd9 () codeweavers ! com

Hello,

On 7/25/17 4:33 PM, Artur Świgoń wrote:
> Dear All,
> 
> My name is Artur and I'm participating in Google Summer of Code 2017 for Wine.
> Under Nikolay's supervision, I'm working on implementation of Unicode
> normalization. I probably should have introduced myself some time ago to share
> results of my research and my ideas, but I also wanted to wait until I could
> illustrate my points with some code.
> 

Very cool! I ran into this problem with Japanese Unicode string
comparisons a while ago, so it is great that it will be addressed. After
that we will have to investigate the behavior of CompareStringW and
family.


> - Mappings for characters above 0xFFFF are encoded as UTF-16 (using surrogate
> pairs), but a single codepoint (UTF-32 if you like) is used for table
> indexing. Setting $utflim in make_unicode to 65536 is the simplest way to
> disable support for such characters, but supporting surrogate pairs should
> not affect any text-related Wine component in a negative way.
> 

There is some very basic support for non-BMP Unicode glyphs and surrogate
pairs in Uniscribe (usp10). I wrote a quick decode_surrogate_pair()
function to get a DWORD Unicode value from a surrogate pair, so you can
look at that if you are interested!
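
For reference, the arithmetic involved is small. The following is only a
minimal sketch of standard UTF-16 surrogate decoding, not the actual
usp10 code: the name decode_pair_sketch, its signature, and the choice of
returning 0 for invalid input are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not the real usp10 decode_surrogate_pair()).
 * Combines a UTF-16 high/low surrogate pair into one code point.
 * Returns 0 if either unit is outside its surrogate range; using 0 as
 * an error marker is an assumption of this sketch. */
static uint32_t decode_pair_sketch(uint16_t high, uint16_t low)
{
    if (high < 0xD800 || high > 0xDBFF) return 0; /* not a high surrogate */
    if (low  < 0xDC00 || low  > 0xDFFF) return 0; /* not a low surrogate */

    /* Each surrogate carries 10 bits; the result is offset by 0x10000. */
    return 0x10000 + (((uint32_t)(high - 0xD800) << 10)
                      | (uint32_t)(low - 0xDC00));
}
```

With something like this, the pair 0xD83D 0xDE00 yields 0x1F600, which
could then serve as the single code point used for table indexing.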

Thanks!
-aric

