List:       kde-bugs-dist
Subject:    [Bug 55177] URI<->IRI conversion uses page encoding instead of UTF-8
From:       Thiago Macieira <thiago () kde ! org>
Date:       2007-05-16 11:22:41
Message-ID: 20070516112241.29352.qmail () ktown ! kde ! org

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
http://bugs.kde.org/show_bug.cgi?id=55177

------- Additional Comments From thiago kde org  2007-05-16 13:22 -------
I am aware of the issue. But, as far as I know, there's no proper solution for
this problem. So, I'm letting the problem remain unsolved.

IRIs (which include local files) indicate that URLs are to be interpreted as a
sequence of Unicode characters, encoded in UTF-8. That means that, if you write
<a href="müller.html"> or even <a href="m&uuml;ller.html">, the URI reference
contains one non-ASCII Unicode code point (U+00FC), independent of the page's
encoding. That is correctly translated to "m%C3%BCller.html". Let me repeat:
*correctly* translated.
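
To make the translation above concrete, here is a minimal sketch in plain C++
(standard library only, no Qt). The function names utf8FromCodePoint,
percentEncode and iriToUri are made up for this illustration; this is not the
code KDE uses, only an assumption-laden model of the IRI-to-URI step: encode
each code point in UTF-8, then %-encode every byte that is not an unreserved
character.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point (up to U+10FFFF) as UTF-8.
std::string utf8FromCodePoint(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: %-encode every byte except RFC 3986 unreserved
// characters and '/', which is kept so that paths stay readable.
std::string percentEncode(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                          c == '_' || c == '~' || c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// IRI -> URI under the stated assumption: always go through UTF-8,
// regardless of the encoding of the page the reference came from.
std::string iriToUri(const std::u32string& iri) {
    std::string bytes;
    for (char32_t cp : iri) {
        bytes += utf8FromCodePoint(cp);
    }
    return percentEncode(bytes);
}
```

With this sketch, iriToUri(U"m\u00FCller.html") gives "m%C3%BCller.html",
whereas %-encoding the ISO-8859-1 byte 0xFC directly would give "m%FCller.html",
which is the form a file named in a Latin-1 locale would need.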

One possible solution is to mandate that people write HTML pages referring to
local files using %-encoding if their files aren't named in UTF-8. That would be
suboptimal because the URLs in Konqueror's Location field wouldn't exactly
correspond to the directory names displayed down below. It would also break a
lot of internal assumptions because filenames are kept internally as Unicode
data and so are URLs. But creating URLs out of Unicode filenames requires
decoding to the 8-bit format and recoding in UTF-8.

In other words, a code section like:
  QString path = QFile::decodeName("/foo/Vidéo");
  url.setPath(path);
would produce different encodings depending on whether the scheme (protocol) of
"url" is "file". Worse, it also affects file-like protocols like "media",
"system", "nfs".
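
The scheme-dependent behaviour described above can be sketched as follows, again
in plain C++ rather than Qt. Everything here is hypothetical: LocalEncoding,
isFileLike and schemeDependentPath are illustrative names, Latin-1 stands in for
"the local 8-bit encoding", and the list of file-like schemes is taken only from
the ones mentioned in this comment. It shows the behaviour being argued against,
not anything KDE implements.

```cpp
#include <stdexcept>
#include <string>

enum class LocalEncoding { Utf8, Latin1 };

// Turn Unicode text into bytes in the requested encoding.
std::string encodeText(const std::u32string& text, LocalEncoding enc) {
    std::string out;
    for (char32_t cp : text) {
        if (enc == LocalEncoding::Latin1) {
            if (cp > 0xFF) {
                throw std::invalid_argument("not representable in Latin-1");
            }
            out += static_cast<char>(cp);
        } else if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// %-encode every byte except RFC 3986 unreserved characters and '/'.
std::string percentEncode(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                          c == '_' || c == '~' || c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Schemes named in this comment as behaving like local files (assumption).
bool isFileLike(const std::string& scheme) {
    return scheme == "file" || scheme == "media" ||
           scheme == "system" || scheme == "nfs";
}

// The problematic rule: file-like schemes use the local 8-bit encoding,
// everything else uses UTF-8, so one Unicode path has two encoded forms.
std::string schemeDependentPath(const std::string& scheme,
                                const std::u32string& path,
                                LocalEncoding local) {
    const LocalEncoding enc = isFileLike(scheme) ? local : LocalEncoding::Utf8;
    return percentEncode(encodeText(path, enc));
}
```

Under these assumptions, the same Unicode path "/foo/Vidéo" becomes
"/foo/Vid%E9o" for the "file" scheme on a Latin-1 system but "/foo/Vid%C3%A9o"
for "http", so every caller of setPath() would have to care about the scheme.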

So, in conclusion, I will not spend any effort fixing that problem. Switch to
UTF-8 already. If that doesn't work, I will fix it.

(This discussion doesn't affect IRIs)

