
List:       busybox
Subject:    Q: applet sed - problems matching hex
From:       Bastian Bittorf <bittorf () bluebottle ! com>
Date:       2015-10-28 11:15:54
Message-ID: 20151028111554.GE12330 () medion ! lan

I have a file in French in which I
want to replace characters with HTML entities.
It is attached, and I have also uploaded it [1].

Replacing by hex escape does not seem to work, no matter
whether BusyBox is compiled against musl or uClibc, and no
matter whether it runs on ARM or x86.

# see the end of the 2nd line: "c3 a9"

root@box:~ hexdump -C myfile
00000000  30 2e 20 43 6f 6e 64 69  74 69 6f 6e 73 20 64 27  |0. Conditions d'|
00000010  75 74 69 6c 69 73 61 74  69 6f 6e 20 67 c3 a9 6e  |utilisation g..n|
00000020  c3 a9 72 61 6c 65 73 20  70 6f 75 72 20 6c 27 61  |..rales pour l'a|

root@box:~ sed 's/\xc3\xa9/__/g' myfile | hexdump -C
00000000  30 2e 20 43 6f 6e 64 69  74 69 6f 6e 73 20 64 27  |0. Conditions d'|
00000010  75 74 69 6c 69 73 61 74  69 6f 6e 20 67 c3 a9 6e  |utilisation g..n|
00000020  c3 a9 72 61 6c 65 73 20  70 6f 75 72 20 6c 27 61  |..rales pour l'a|

So nothing matches and nothing is changed. Looking over the source of sed.c,
it seems that all the regex work is done in the libc.
I have no idea what is wrong, but maybe one of you does?
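
A possible workaround, as a sketch only: it assumes that the "\xHH" escape
is simply not interpreted by the libc regex code (it is not part of POSIX
regular expressions), which is not confirmed here. Under that assumption,
the raw bytes can be produced with printf octal escapes and passed to sed
directly. The variable names and the sample input are hypothetical; the
input stands in for myfile.

```shell
# Build the two-byte UTF-8 sequence for "é" (0xc3 0xa9) with POSIX printf
# octal escapes, so sed receives the raw bytes instead of "\xc3\xa9".
e_acute=$(printf '\303\251')

# Hypothetical sample input standing in for myfile.
input=$(printf 'g\303\251n\303\251rales')

# Replace every occurrence of the byte pair with the HTML entity.
# "\&" is a literal ampersand in the sed replacement; LC_ALL=C asks for
# byte-wise matching regardless of the configured locale.
result=$(printf '%s\n' "$input" | LC_ALL=C sed "s/$e_acute/\&eacute;/g")

printf '%s\n' "$result"
```

If the assumption holds, the same substitution should work on the real file
with: LC_ALL=C sed "s/$e_acute/\&eacute;/g" myfile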

Thanks for your work!

[1] http://intercity-vpn.de/files/fr.txt
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox