List: kde-core-devel
Subject: [PATCH]: KNewsTicker querying non-ISO8859-1 sites
From: Frerich Raabe <frerichraabe () gmx ! de>
Date: 2002-03-07 16:53:43
Hi,
the attached patch, courtesy of Volker Augustin
<volker.augustin@perfektionismus.de>, apparently makes Cyrillic characters as
well as German umlauts work in KNewsTicker. I hope it makes Asian charsets
work as well, but I haven't found a suitable font yet.
You can use the URLs http://www.slashdot.jp/slashdot.rdf (Japanese),
http://www.hamovniki.net/~d00mer/lenta_rdf/lenta.rdf (Russian) and
http://www.heise.de/newsticker/heise.rdf (German) to test, just in case
somebody has one of those giant Unicode fonts handy.
OK to commit?
- Frerich
["xmlnewsaccess2.diff" (text/x-diff)]
Index: kdenetwork/knewsticker/common/xmlnewsaccess.cpp
===================================================================
RCS file: /home/kde/kdenetwork/knewsticker/common/xmlnewsaccess.cpp,v
retrieving revision 1.27
diff -u -r1.27 xmlnewsaccess.cpp
--- kdenetwork/knewsticker/common/xmlnewsaccess.cpp 2002/02/09 22:33:37 1.27
+++ kdenetwork/knewsticker/common/xmlnewsaccess.cpp 2002/03/06 15:02:18
@@ -16,6 +16,7 @@
#include <qdom.h>
#include <qregexp.h>
+#include <qtextcodec.h>
XMLNewsArticle::XMLNewsArticle(const QString &headline, const KURL &address)
: m_headline(headline),
@@ -72,9 +73,36 @@
if (okSoFar) {
QDomDocument domDoc;
// Some servers like to prepend a blank line, QDom doesn't like that...
- if (validContent = domDoc.setContent(QCString(data).stripWhiteSpace())) {
+ if (validContent = domDoc.setContent(QString(data).stripWhiteSpace())) {
+
+ /*
+ * Detect the encoding and create a suitable QTextCodec object.
+ * If an XML processing instruction is present, it should be of
+ * the following form:
+ * <?xml version = "1.0" encoding = "ISO-8859-1"?>
+ * where the encoding attribute need not necessarily be present
+ * (e.g. slashdot.org omits the encoding).
+ * This should then be in the first node of the document which
+ * in turn should be of type QDomProcessingInstruction.
+ */
+ QTextCodec *codec = 0;
+
+ QDomNode firstNode = domDoc.firstChild();
+ if ( firstNode.isProcessingInstruction() ) {
+ QString data = firstNode.toProcessingInstruction().data();
+ QString encKey = QString::fromLatin1( "encoding" );
+ if ( data.contains( encKey ) ) {
+ QString containingPart = data.mid( data.find(encKey) );
+ QString encoding = containingPart.section( ' ', 2, 2 );
+ encoding = encoding.mid( 1, encoding.length() - 2 );
+ kdDebug(5005) << QString::fromLatin1( "Encoding: " ) << encoding << endl;
+
+ codec = QTextCodec::codecForName(encoding.latin1());
+ }
+ }
+
QDomNode channelNode = domDoc.documentElement().namedItem(QString::fromLatin1("channel"));
-
+
m_name = channelNode.namedItem(QString::fromLatin1("title")).toElement().text().simplifyWhiteSpace();
m_link = channelNode.namedItem(QString::fromLatin1("link")).toElement().text().simplifyWhiteSpace();
m_description = channelNode.namedItem(QString::fromLatin1("description")).toElement().text().simplifyWhiteSpace();
@@ -85,7 +113,11 @@
QString headline, address;
for (unsigned int i = 0; i < items.count(); i++) {
itemNode = items.item(i);
- headline = decodeEntities(itemNode.namedItem(QString::fromLatin1("title")).toElement().text().simplifyWhiteSpace());
+ QString title = itemNode.namedItem(QString::fromLatin1("title")).toElement().text().simplifyWhiteSpace();
+ if ( codec != 0 ) {
+ title = codec->toUnicode( title.latin1() );
+ }
+ headline = decodeEntities( title );
address = decodeEntities(itemNode.namedItem(QString::fromLatin1("link")).toElement().text().simplifyWhiteSpace());
m_articles.append(XMLNewsArticle(headline, address));
}
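
A note on the encoding detection in the patch above: the section(' ', 2, 2)
call appears to rely on the declaration being written with spaces around the
'=' sign, as in the form quoted in the comment. A declaration written as
encoding="ISO-8859-1" or with single quotes would probably not be picked up.
The following is only a minimal sketch of a more tolerant extraction, written
against std::string rather than QString so that it stands alone. The function
name extractEncoding is hypothetical and not part of the patch or of Qt.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the patch: returns the value of the
// "encoding" pseudo-attribute found in the data of an XML declaration,
// e.g.  version="1.0" encoding='KOI8-R'
// Returns an empty string if no well-formed encoding value is present.
std::string extractEncoding(const std::string &piData)
{
    const std::string key = "encoding";
    std::size_t pos = piData.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // Advance past any whitespace starting at position p.
    auto skipSpace = [&piData](std::size_t p) {
        while (p < piData.size() &&
               std::isspace(static_cast<unsigned char>(piData[p])))
            ++p;
        return p;
    };

    // Optional whitespace, a mandatory '=', optional whitespace again.
    pos = skipSpace(pos);
    if (pos >= piData.size() || piData[pos] != '=')
        return std::string();
    pos = skipSpace(pos + 1);

    // The value may be quoted with either " or '.
    if (pos >= piData.size() || (piData[pos] != '"' && piData[pos] != '\''))
        return std::string();
    const char quote = piData[pos];
    const std::size_t end = piData.find(quote, pos + 1);
    if (end == std::string::npos)
        return std::string();

    return piData.substr(pos + 1, end - pos - 1);
}
```

In the patch, the result of such a helper would take the place of the
"encoding" string passed to QTextCodec::codecForName(). An empty result would
simply leave codec at 0, which matches the existing fallback.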