From kde-devel Fri Aug 20 22:21:28 1999
From: Lars Knoll
Date: Fri, 20 Aug 1999 22:21:28 +0000
To: kde-devel
Subject: Re: Unicode - codepages - ???
X-MARC-Message: https://marc.info/?l=kde-devel&m=93518771122353

On Fri, 20 Aug 1999, Werner Trobin wrote:
> Lars Knoll wrote:
> >
> > On Fri, 20 Aug 1999, Harri Porten wrote:
> >
> > > Werner Trobin wrote:
> > > >
> > > > Hi!
> > > >
> > > > I'm reading text out of WinWord files. These characters are either
> > > > stored as Unicode glyphs or as compressed Unicode characters (my
> > > > documentation says MS Word uses codepage 1252 for that).
> > > > I want to read those characters (and if they are compressed convert
> > > > them to Unicode).
> > > >
> > > > My questions are: May I use Qt to convert the codepage 1252 chars to
> > > > "normal" two byte Unicode chars? What do I have to do? Where do I
> > > > get this codepage from? Is it available on every supported platform
> > > > in every country, or do I have to write my own "mapper" (from
> > > > codepage 1252 to Unicode)?
> > >
> > > I don't know anything about codepage 1252 but Qt _can_ be used to do the
> > > conversion. At least I got that impression when looking at the
> > > QTextCodec. I saw stuff like UTF-8 in the sources.
> >
> > QTextCodecs are the way to go. But AFAIK, Qt doesn't have a codec for
> > CP1252 up to now. You could perhaps make one (it's quite simple, have a
> > look at the defined codecs in $QTDIR/src/tools), add it to your filter or
> > even submit it to the trolls for inclusion in their next release.
>
> Now I found out that CP1252 is Latin 1 and AFAIK QString has a static member
> QString::fromLatin1(...) Should I use this one? I think so, but why is it
> impossible to get e.g. a "EURO" symbol (the cool e :)
>
> Werner, clueless about that charsets...
>

Because CP1252 unfortunately isn't latin1, but latin1 with MS extensions
(I would like to see this company stick to a standard for once).
As far as I remember, they differ in the region 0x80-0x9F...

Cheers,
Lars
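
For illustration, the "mapper" discussed in this thread could be sketched in plain C++ as below. This is not the Qt codec and not code from any of the posters: the names `kCp1252High`, `cp1252ToUnicode` and `cp1252ToUtf16` are made up for this sketch, and mapping the five unassigned bytes to U+FFFD is an assumption (other converters pass them through as C1 control codes). Outside the 0x80-0x9F block, CP1252 and Latin-1 agree, so the byte value is the code point.

```cpp
#include <cstdint>
#include <string>

// Unicode code points for CP1252 bytes 0x80..0x9F, the block where CP1252
// differs from Latin-1. 0xFFFD marks bytes that CP1252 leaves unassigned
// (0x81, 0x8D, 0x8F, 0x90, 0x9D); using U+FFFD there is this sketch's choice.
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,  // 0x80-0x87
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,  // 0x88-0x8F
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,  // 0x90-0x97
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178   // 0x98-0x9F
};

// Convert a single CP1252 byte to a 16-bit Unicode code point.
std::uint16_t cp1252ToUnicode(unsigned char byte) {
    if (byte >= 0x80 && byte <= 0x9F)
        return kCp1252High[byte - 0x80];
    return byte;  // identical to Latin-1 (and to Unicode) everywhere else
}

// Convert a whole CP1252 byte string to UTF-16 code units.
std::u16string cp1252ToUtf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (unsigned char c : in)
        out.push_back(static_cast<char16_t>(cp1252ToUnicode(c)));
    return out;
}
```

With this table, byte 0x80 comes out as U+20AC (the Euro sign), which is exactly the character that a plain Latin-1 conversion such as QString::fromLatin1() cannot produce from CP1252 input.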