List: nix-dev
Subject: Re: [Nix-dev] Unicode locale for build environments
From: Freddy Rietdijk <freddyrietdijk () fridh ! nl>
Date: 2017-06-25 16:04:38
Message-ID: CAOQtOH3J5q4+BSt7WjvdhLTrbVG4fzQkBEM+YqDOD+mNpm3J5A () mail ! gmail ! com
There was an earlier discussion on the issue tracker about glibcLocales and C.UTF-8:
https://github.com/NixOS/nixpkgs/issues/20192
For Python 3.x, I think we could add a minimal glibcLocales that provides
en_US.UTF-8, and set LC_ALL in `buildPythonPackage`. This would apply only
at build time, not at run time.
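As a rough illustration of the "build-time only" idea, here is a minimal Python sketch that sets LC_ALL for a single child process and leaves the calling environment untouched. The helper names `build_env` and `run_build_step` are hypothetical and are not part of nixpkgs or `buildPythonPackage`. Whether the locale actually takes effect in the child depends on that locale being installed, which is what a minimal glibcLocales would provide.

```python
import os
import subprocess
import sys


def build_env(locale_name="en_US.UTF-8"):
    """Return a copy of the current environment with LC_ALL set.

    Hypothetical helper: the name and the default value are illustrative only.
    """
    env = dict(os.environ)  # copy, so the parent's environment is not modified
    env["LC_ALL"] = locale_name
    return env


def run_build_step(args, locale_name="en_US.UTF-8"):
    """Run one build command with LC_ALL set only for that child process."""
    return subprocess.run(
        args,
        env=build_env(locale_name),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )


# Probe: ask a child interpreter which LC_ALL value it was started with.
probe = [sys.executable, "-c", "import os; print(os.environ.get('LC_ALL'))"]
child_lc_all = run_build_step(probe).stdout.strip()
```

Because the variable is passed only through `env=`, anything started later without it falls back to whatever locale the user has configured, which matches the run-time behaviour described above.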
On Sun, Jun 25, 2017 at 5:57 PM, Benno Fünfstück <benno.fuenfstueck@gmail.com> wrote:
> Hello list,
>
> Right now, the stdenv does not appear to set any locale. I think this means
> that the locale defaults to C, which specifies ASCII as the character
> encoding. For example, Python then defaults to `ASCII`, so it fails if
> any script tries to open a file containing non-ASCII characters:
>
> $ nix-shell --pure -p python36 --command 'python -c "import locale; print(locale.getpreferredencoding())"'
> ANSI_X3.4-1968
>
> Just recently, I've hit a build that failed due to that:
>
> Traceback (most recent call last):
>   File "nix_run_setup.py", line 8, in <module>
>     exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))
>   File "setup.py", line 20, in <module>
>     long_description=open('README.rst').read(),
>   File "/nix/store/i5ixvcy4i6jqzlzy9aajdhf3wliixvh1-python3-3.6.1/lib/python3.6/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 542: ordinal not in range(128)
>
> As UTF-8 is used almost everywhere nowadays (I have yet to see a source
> archive that does not use UTF-8), I propose that we make the stdenv support
> UTF-8 by default. Would this be a feasible approach? (Whether to use
> C.UTF-8 or some other UTF-8 locale such as en_US.UTF-8 still needs to be
> decided.)
>
> Regards,
> Benno
>
> _______________________________________________
> nix-dev mailing list
> nix-dev@lists.science.uu.nl
> https://mailman.science.uu.nl/mailman/listinfo/nix-dev
>
>
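The failure in the quoted traceback can be reproduced without touching the locale at all. The sketch below is an approximation, not the original build: it writes a small UTF-8 file (the file content and the temporary path are made up for illustration) and then reads it back with `encoding="ascii"`, which stands in for the default that Python 3.6 picks under the C locale.

```python
import os
import tempfile

# "\u00e9" (e with acute accent) encodes to the two bytes 0xc3 0xa9 in UTF-8.
text = "caf\u00e9"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.rst")  # illustrative file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Forcing ASCII simulates open() without an explicit encoding
    # when the preferred encoding is ANSI_X3.4-1968.
    try:
        with open(path, encoding="ascii") as f:
            f.read()
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

    # With a UTF-8 decoder, the same bytes read back cleanly.
    with open(path, encoding="utf-8") as f:
        utf8_text = f.read()
```

A `setup.py` that passes `encoding="utf-8"` to `open()` explicitly avoids the problem regardless of locale, but that has to be fixed package by package, whereas a UTF-8 locale in the build environment covers all of them at once.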