List:       ast-developers
Subject:    [ast-developers] Re: [ast-users] read -d command not supporting non-ASCII/Unicode chars
From:       "Clark J. Wang" <dearvoid () gmail ! com>
Date:       2012-04-24 9:55:22
Message-ID: CADv8-ogPLEzsM6oTeNSKotf1-i+2nhiezCKVLyuYE=QSuF7Vag () mail ! gmail ! com

On Tue, Apr 10, 2012 at 07:00, Lionel Cons <lionelcons1972@googlemail.com> wrote:

> Hello, we're encountering a serious bug when migrating some of our
> data processing scripts from bash to ksh93. After a lengthy
> investigation we traced the problem down to ksh93's read command not
> supporting non-ASCII characters as delimiters (read -d) in the Unicode
> locale (en_GB.UTF-8).
>
> Testcase:
> Try to read the string "x123€hello" via read and use the EURO symbol
> (Unicode 20AC) as delimiter:
>
> zsh returns the expected behaviour:
> > zsh -c 'printf "x123€hello\n" | (read -d "€" r ; printf "|%s|\n" "$r")'
> |x123|
>
> bash returns the expected behaviour, too:
> > bash -c 'printf "x123€hello\n" | (read -d "€" r ; printf "|%s|\n" "$r")'
> |x123|
>
> ksh93 on Red Hat and Debian Linux FAILS:
> > ksh -c 'printf "x123€hello\n" | (read -d "€" r ; printf "|%s|\n" "$r")'
> |x123€hello
> |
>
>
> Any help would be appreciated because our migration deadline is soon
> (end of May 2012).
>

Not sure how heavily you depend on this feature. Do other utils like awk
help?
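
For instance, a rough sketch of the awk route (only an illustration, not
tested against your scripts; first_field is a made-up helper name, and it
assumes the script file is saved as UTF-8):

```shell
# Hypothetical workaround: let awk split on the euro sign instead of read -d.
# first_field is an illustrative name, not part of any shell or of ksh93.
first_field() {
    # -F sets the field separator; $1 is the text before the first delimiter.
    awk -F'€' '{ print $1 }'
}

r=$(printf 'x123€hello\n' | first_field)
printf '|%s|\n' "$r"    # prints |x123|
```

Note that awk splits each input line separately, whereas read -d consumes
input up to the delimiter even across newlines, so this only matches the
read -d behaviour when each record sits on a single line.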

BTW, why the bash to ksh migration for existing scripts? If you prefer ksh
you can write new scripts in ksh. I don't think it's a good idea to convert
lots of bash scripts which are working well to ksh. :)

>
> Lionel
>
> _______________________________________________
> ast-users mailing list
> ast-users@research.att.com
> https://mailman.research.att.com/mailman/listinfo/ast-users
>




_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers

