From koffice Fri Mar 17 16:06:46 2006
From: Nicolas Goutte
Date: Fri, 17 Mar 2006 16:06:46 +0000
To: koffice
Subject: [Bug 123672] RTF - kword doesn't recognize lang/charset settings
Message-Id: <20060317160646.21546.qmail () ktown ! kde ! org>
X-MARC-Message: https://marc.info/?l=koffice&m=114261686810380

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

http://bugs.kde.org/show_bug.cgi?id=123672

------- Additional Comments From nicolasg snafu de 2006-03-17 17:06 -------
On Friday 17 March 2006 16:49, Mikolaj Machowski wrote:
(...)
> > Then there is the \ansicpg keyword to set a codepage.
>
> After adding just \ansicpg1250 immediately after \ansi KWord displays
> previously attached document as it was intended (all Polish characters
> visible).

That is good. (The document could be even more "wrong".)

> > > Aaahhh. Checked RTF 1.6 doc and it looks like by design RTF doesn't
> > > support native encodings?!
> >
> > It does, as it defines the keywords that I have listed above.
>
> Sorry, misunderstood, pre-1.6 versions officially didn't support them.

It depends. \pc, \pca, \mac and \ansi have existed since WinWord 1.x (so
probably RTF 1.2). Only \ansicpg is relatively recent.

> Even now \ansicpg is rather for proper translation of UTF than native
> encodings.

Why? The \u keyword does not need to know the encoding of the file.

> 8-bit characters are only as a side effect:

On the contrary, I think that it is the primary goal.

> (Converters that
> communicate with Microsoft Word for Windows or Microsoft Word for the
> Macintosh should expect 8-bit characters.)

The problem is that RTF is basically a 7-bit file format, because at the time
RTF 1.0 was defined, major U.S. networks were not 8-bit clean. Up to RTF 1.2,
it was made a little less U.S.-centric, but you had to encode the characters
with \' if they were not 7-bit clean. Nowadays it should be 8-bit clean.
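
To make the difference between \' and \u concrete, here is a minimal sketch in
Python. It is not KWord's filter code; the function names are made up for
illustration. It assumes that every \u is followed by exactly one fallback
character (the RTF default when no \uc is given) and that a file without
\ansicpg falls back to code page 1252, which is precisely the guess discussed
in this thread and may be wrong for a given document.

```python
import re

# Hypothetical helper names; this is an illustration, not KWord's RTF filter.

# Matches \'hh (one 8-bit byte in hex) or \uN (signed 16-bit decimal),
# the latter optionally followed by a space and one fallback character.
_ESCAPE = re.compile(
    r"\\'([0-9a-fA-F]{2})|\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|[^\\])?"
)


def header_codepage(rtf: str, default: int = 1252) -> int:
    # Return N from the first \ansicpgN keyword.
    # The default of 1252 is an assumption, not something RTF guarantees.
    m = re.search(r"\\ansicpg(\d+)", rtf)
    return int(m.group(1)) if m else default


def decode_escapes(text: str, codepage: int) -> str:
    codec = f"cp{codepage}"

    def repl(m):
        if m.group(1) is not None:
            # \'hh: the meaning of the byte depends on the code page.
            return bytes([int(m.group(1), 16)]).decode(codec)
        # \uN: a Unicode code unit, independent of the code page.
        n = int(m.group(2))
        return chr(n + 65536 if n < 0 else n)

    return _ESCAPE.sub(repl, text)


if __name__ == "__main__":
    cp = header_codepage(r"{\rtf1\ansi\ansicpg1250\deff0")
    print(decode_escapes(r"\'bf\'f3\'b3w", cp))    # decoded as cp1250
    print(decode_escapes(r"\'bf\'f3\'b3w", 1252))  # same bytes, wrong code page
    print(decode_escapes(r"\u322?", 1252))         # \u ignores the code page
```

The same three \' bytes give Polish letters under code page 1250 and unrelated
symbols under 1252, while the \u escape decodes the same way under either.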
>
> > It worries me that \pc and \ansi would perhaps not mean a particular
> > codepage but just the local MS-DOS and Windows codepages, respectively.
> > If that is the case, then the RTF filter needs quite an improvement.
>
> I am afraid this is the case. It is also possible that MS programs are
> just guessing the encoding depending on locale, or perform additional
> tests to display it properly. You could check OO.o code - oowriter
> displays the document without problems.

It is rather difficult to read OOo's code.

Have a nice day!

____________________________________
koffice mailing list
koffice@kde.org
To unsubscribe please visit:
https://mail.kde.org/mailman/listinfo/koffice