
List:       coreutils-bug
Subject:    bug#9365: Example
From:       "Marton Kadar" <marton.kadar () mail ! com>
Date:       2012-02-24 14:18:24
Message-ID: 20120224141824.107150 () gmx ! com

The environment is set up for Hungarian, where á and í are proper
lowercase letters (other languages, Spanish for example, have them too):

$ set | grep ^L
LANG=hu_HU.UTF-8
LC_ALL=hu_HU.UTF-8
LINES=73
LOGNAME=kadar1marto518

Now let's look at the byte stream for the following string
(which means "flood" in Hungarian):

$ echo árvíz | od -c
0000000 303 241   r   v 303 255   z  \n
0000010
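
For readers who want to check the octal values above, here is a minimal
Python sketch that prints the UTF-8 bytes of the same string in a form
similar to od -c. The helper name od_like is made up for this
illustration and only handles the byte classes that occur in this
example; it is not part of coreutils.

```python
# Hypothetical helper: mimic the per-byte view of `od -c` for this example only.
def od_like(data: bytes) -> list[str]:
    cells = []
    for b in data:
        if b == 0x0A:
            cells.append("\\n")            # newline, shown as \n like od -c
        elif 0x20 <= b < 0x7F:
            cells.append(chr(b))           # printable ASCII shown as itself
        else:
            cells.append(format(b, "03o")) # everything else as 3-digit octal
    return cells

# "\u00e1" is á and "\u00ed" is í; escapes avoid any source-encoding issues.
encoded = "\u00e1rv\u00edz\n".encode("utf-8")

print(od_like(encoded))
print(len(encoded))  # 8 bytes, matching the final od offset 0000010 (octal)
```

Both accented letters are two bytes long in UTF-8 and share the same
lead byte 303.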

Now let us try to delete one character and check the result:

$ echo árvíz | tr -d á | od -c
0000000   r   v 255   z  \n
0000005

The 303 byte that starts í has been removed as well, which leaves
invalid UTF-8. The expected behavior would be:

$ echo árvíz | tr -d á | od -c
0000000   r   v 303 255   z  \n
0000006
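
The difference between the two outputs can be modelled with a short
Python sketch. It assumes that the observed result comes from deleting
individual bytes rather than whole characters, which is what the
transcript suggests; it is not taken from the tr source. The function
names delete_bytewise and delete_charwise are hypothetical.

```python
def delete_bytewise(data: bytes, chars: str) -> bytes:
    """Drop every byte that appears in the UTF-8 encoding of `chars`.

    Assumption: this models the observed output, where á is treated
    as the two separate bytes 0xC3 and 0xA1.
    """
    unwanted = set(chars.encode("utf-8"))
    return bytes(b for b in data if b not in unwanted)


def delete_charwise(data: bytes, chars: str) -> bytes:
    """Decode first, then drop whole characters (the expected behavior)."""
    text = data.decode("utf-8")
    kept = "".join(c for c in text if c not in chars)
    return kept.encode("utf-8")


line = "\u00e1rv\u00edz\n".encode("utf-8")   # árvíz plus newline

print(delete_bytewise(line, "\u00e1"))  # b'rv\xadz\n'      (5 bytes, broken í)
print(delete_charwise(line, "\u00e1"))  # b'rv\xc3\xadz\n'  (6 bytes, í intact)
```

The byte-wise version loses the shared lead byte 0xC3 (octal 303) of í,
which is exactly the stray 255 seen in the actual output.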

I'll check the source of tr myself, although I have never coded in C.
This should be a trivial fix. The problem is especially annoying
because we currently have no simple, good, general-purpose case
conversion tool (correct me if I'm wrong, but tr should be that
tool).

Marton Kadar


