List:       tor-dev
Subject:    Re: [tor-dev] Proposal 285: Directory documents should be standardized as UTF-8
From:       Alex Xu <alex_y_xu () yahoo ! ca>
Date:       2018-01-10 1:36:22
Message-ID: 151554818207.20265.1862773707493769235 () pink ! alxu ! ca

Quoting teor (2018-01-10 00:19:54)
> These are called "Unicode Scalar Values".
> https://www.unicode.org/glossary/#unicode_scalar_value
> 
> Let's reference that.

"Unicode Scalar Value" includes U+0000 (NUL), which I think we probably
want to exclude.

> >        * each encoded with the shortest possible encoding.
> >        * without any BOM
> > 
> > Are there other restrictions we should make?  If so, how should we phrase them?
> 
> These seem fine, and not tied to a particular unicode version.
> 
> But I don't know enough about Unicode to know if there is anything else we should
> specify.

Skimming through
https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt, I think it
might be good to additionally forbid the noncharacter code points listed
at the end: U+nFFFE and U+nFFFF for n = 0..10 (hexadecimal, i.e. planes
0 through 16), and U+FDD0 through U+FDEF.
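
To make the combined restrictions concrete, here is a rough sketch in
Python of what a checker might look like. The function name is made up
for illustration and is not from the proposal or from tor. It assumes
"without any BOM" means no U+FEFF at the very start of the document, and
it relies on Python's strict UTF-8 decoder to reject overlong encodings,
surrogates, and values above U+10FFFF.

```python
def is_valid_dirdoc_utf8(data: bytes) -> bool:
    """Hypothetical validator; not part of the proposal or of tor."""
    # Assumption: "without any BOM" means no encoded U+FEFF at the start.
    if data.startswith(b"\xef\xbb\xbf"):
        return False
    try:
        # Strict decoding rejects non-shortest (overlong) forms,
        # surrogate code points, and anything above U+10FFFF.
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    for ch in text:
        cp = ord(ch)
        # Exclude U+0000 even though it is a Unicode scalar value.
        if cp == 0:
            return False
        # Exclude the noncharacter block U+FDD0 through U+FDEF.
        if 0xFDD0 <= cp <= 0xFDEF:
            return False
        # Exclude U+nFFFE and U+nFFFF in every plane (n = 0x0..0x10).
        if (cp & 0xFFFE) == 0xFFFE:
            return False
    return True
```

For example, is_valid_dirdoc_utf8(b"\xc0\x80") is False because that is
an overlong encoding of U+0000, while ordinary text such as
"héllo".encode("utf-8") passes.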
