List: whatwg
Subject: Re: [whatwg] Encoding Sniffing
From: Alexey Proskuryakov <ap () webkit ! org>
Date: 2012-04-23 17:58:17
Message-ID: FAA095D7-6F14-4F76-9D30-8574C9B3D019 () webkit ! org
On 21.04.2012, at 3:21, Anne van Kesteren wrote:
> 1) Is this something we want to define and eventually implement the same way?
I think that the general direction should be getting rid of encoding
sniffing. It's very rarely helpful if ever, and implementations are
wildly different.
WebKit can optionally use ICU for charset detection. We also have custom
built-in heuristics to switch between Japanese encodings only (think
rendering unlabeled EUC-JP pages when the default browser encoding is set to
Shift-JIS). Safari doesn't enable ICU-based detection, and users have shown
no visible discontent about that. I don't know if the Japanese heuristics
are still important.
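To make the Japanese-only case concrete, here is a minimal sketch of what such a heuristic could look like. It is not WebKit's code; the function name, the candidate order, and the strict-decode test are all assumptions chosen for illustration. It tries EUC-JP first because EUC-JP rejects more byte sequences than Shift-JIS, and falls back to the configured default otherwise.

```python
# Hypothetical sketch, not WebKit's actual heuristic.
# Candidate order is an assumption: the stricter encoding is tried first.
CANDIDATES = ("euc_jp", "shift_jis")


def guess_japanese_encoding(data: bytes, default: str = "shift_jis") -> str:
    """Guess between EUC-JP and Shift-JIS for unlabeled bytes."""
    # Pure ASCII decodes identically under both, so keep the default.
    if data.isascii():
        return default
    for name in CANDIDATES:
        try:
            data.decode(name, errors="strict")
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding; try the next
        return name
    # Nothing decoded cleanly: keep the browser's default encoding.
    return default


if __name__ == "__main__":
    page = "日本語".encode("euc_jp")  # unlabeled EUC-JP content
    print(guess_japanese_encoding(page, default="shift_jis"))
```

A sketch like this will misdetect Shift-JIS input whose bytes happen to form valid EUC-JP (for example, runs of half-width katakana), which is one reason real detectors differ so much from each other.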
> 2) Does this need to apply outside HTML? For JavaScript it is forbidden
> per the HTML standard at the moment. CSS and XML do not allow it either.
> Is it used for decoding text/plain at the moment?
> 3) Is there a limit to how many bytes we should look at?
Related to the last question, WebKit doesn't implement re-navigation
(neither for charset sniffing, nor for <meta charset>), and I don't
think that we ever should.
- WBR, Alexey Proskuryakov