From kde-core-devel Fri Jul 12 08:09:52 2002
From: Stephan Hermann
Date: Fri, 12 Jul 2002 08:09:52 +0000
To: kde-core-devel
Subject: Re: PATCH: kdelibs/kdecore/kstringhandler.cpp tagURLs() method
X-MARC-Message: https://marc.info/?l=kde-core-devel&m=102646135016968
MIME-Version: 1
Content-Type: multipart/mixed; boundary="--------------Boundary-00=_GOM49BDD92FBOPCCSNMM"

--------------Boundary-00=_GOM49BDD92FBOPCCSNMM
Content-Type: text/plain; charset="iso-8859-1"

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

On Thursday 11 July 2002 21:34, Marc Mutz wrote:
> On Thursday 11 July 2002 13:11, Stephan Hermann wrote:
> >
> > Well, I changed the regexp in this way:
>
> (cleaned up C-string quoting)
> (?:www\.|ftp\.|\w+\://)[\d\w\.]+[:\d]{0,}[/]{0,1}[~/\.-?&=#:_\d\w]{0,}
> ^1 ^2 ^3 ^4
> 1: wouldn't this be "*"

No.
	"*" == matching 0 or more times
	+ == matching 1 or more times
We want to check URLs, not shortcuts like gg: or some malformed URLs out of the
kurl testsuite ;)

So + is the right choice... but I had a mistake in here and fixed it (good
point with testsuites, thx harri :)): a "/" was missing.

> 2: a class with a single char? you mean \d ?

You're right, this is one of the mistakes.

> 3: wouldn't this be "?"

Sure.

> 4: see (1)

Nope.

> > And at least, the replacement in hrefProtocol I changed, too.
> > Not in this way, you described in your last mail, but I used sprintf
> > and QString::latin1()
> >
>
> How exactly? Since QString::sprintf() assumes %s parameters to be in
> UTF-8, not latin1.

Forget about it ;) I now use the concatenation solution.

I put some tests into kdelibs/kdecore/tests/kstringhandlertest.cpp (see the
attached patch).

BTW: I fixed a kstringhandler testsuite bug, too.
The test for reverse() was wrong: the "has to be" string was missing a space
char at the beginning (see patch).

Regards,

\sh

- -- 
St. Hermann, Troisdorf
One solution for a simple problem:
A7 B4 C2 D5 E8 F1 G3 H6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9Lo7SV8AnusWiV6wRAihZAKDRJ9bYuMd2Mu7BaowkW1sLmhU+lQCg0swy
j31ZSLRNWyp3NDRcifeqgMM=
=5RjR
-----END PGP SIGNATURE-----

--------------Boundary-00=_GOM49BDD92FBOPCCSNMM
Content-Type: text/x-diff; charset="iso-8859-1"; name="kstringhandler.cpp.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="kstringhandler.cpp.patch"

Index: kstringhandler.cpp
===================================================================
RCS file: /cvs/kdelibs/kdecore/kstringhandler.cpp,v
retrieving revision 1.16
diff -r1.16 kstringhandler.cpp
517,531c517,538
<     /*static*/ QRegExp urlEx("(www\\.|(f|ht)tp(|s)://)[\\d\\w./,:_~\\?=&;#-]+[\\d\\w/]");
<
<     QString richText( text );
<     int urlPos = 0, urlLen;
<     while ((urlPos = urlEx.search(richText, urlPos)) >= 0)
<     {
<         urlLen = urlEx.matchedLength();
<         QString href = richText.mid( urlPos, urlLen );
<         QString anchor( "<a href=\"%1\">%2</a>" );
<         anchor = anchor.arg( href ).arg( href );
<         richText.replace( urlPos, urlLen, anchor );
<
<         urlPos += anchor.length();
<     }
<     return richText;
---
>     // /*static*/ QRegExp urlEx("(www\\.|(f|ht)tp(|s)://)[\\d\\w./,:_~\\?=&;#-]+[\\d\\w/]");
>     // Changed by St. Hermann
>     QRegExp urlEx("(?:www\\.|ftp\\.|\\w+\\://)[\\/\\d\\w\\.]+[:\\d+]?[/]{0,1}[%~/\\.-?&=#:_\\d\\w]{0,}");
>     QString richText( text );
>     int urlPos = 0, urlLen;
>     while ((urlPos = urlEx.search(richText, urlPos)) >= 0)
>     {
>         urlLen = urlEx.matchedLength();
>         QString href = richText.mid( urlPos, urlLen );
>         QString hrefProtocol;
>         QString anchor;
>         if (href.startsWith("www.")) {
>             hrefProtocol="http://"+href;
>         } else if (href.startsWith("ftp.")) {
>             hrefProtocol="ftp://"+href;
>         } else {
>             hrefProtocol=href;
>         }
>         anchor = "<a href=\""+hrefProtocol+"\">"+href+"</a>"; richText.replace( urlPos, urlLen, anchor );
>         urlPos += anchor.length();
>     }
>     return richText;

--------------Boundary-00=_GOM49BDD92FBOPCCSNMM
Content-Type: text/x-diff; charset="iso-8859-1"; name="kstringhandlertest.cpp.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="kstringhandlertest.cpp.patch"

Index: tests/kstringhandlertest.cpp
===================================================================
RCS file: /cvs/kdelibs/kdecore/tests/kstringhandlertest.cpp,v
retrieving revision 1.3
diff -r1.3 kstringhandlertest.cpp
52c52
<             "bridge. lazy the over jumped fox brown quick The");
---
>             " bridge. lazy the over jumped fox brown quick The");
80a81,105
>
>     // KStringHandler::tagURLs() Test
>     result=KStringHandler::tagURLs("http://www.foo.bar:80");
>     check("tagURLs(\"http://www.foo.bar:80\")",
>             result,
>             "<a href=\"http://www.foo.bar:80\">http://www.foo.bar:80</a>");
>     result=KStringHandler::tagURLs("http://www.foo.bar");
>     check("tagURLs(\"http://www.foo.bar\")",
>             result,
>             "<a href=\"http://www.foo.bar\">http://www.foo.bar</a>");
>     result=KStringHandler::tagURLs("http://www.foo.bar/top//test2/file2.html");
>     check("tagURLs(\"http://www.foo.bar/top//test2/file2.html\")",
>             result,
>             "<a href=\"http://www.foo.bar/top//test2/file2.html\">http://www.foo.bar/top//test2/file2.html</a>");
>
>     result=KStringHandler::tagURLs("file:///home/sh/my%20tar%20file.tgz");
>     check("tagURLs(\"file:///home/sh/my%20tar%20file.tgz\")",
>             result,
>             "<a href=\"file:///home/sh/my%20tar%20file.tgz\">file:///home/sh/my%20tar%20file.tgz</a>");
>     result=KStringHandler::tagURLs("myProt://LookWhereI:8002/doing/this/index.asp");
>     check("tagURLs(\"myProt://LookWhereI:8002/doing/this/index.asp\")",
>             result,
>             "<a href=\"myProt://LookWhereI:8002/doing/this/index.asp\">myProt://LookWhereI:8002/doing/this/index.asp</a>");
>
>
--------------Boundary-00=_GOM49BDD92FBOPCCSNMM--
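
For readers without a Qt 3 tree at hand, the tagging loop discussed in this mail can be tried out in plain C++. The following is only a rough sketch under stated assumptions: tagUrls() is a hypothetical stand-in for KStringHandler::tagURLs(), it uses std::string and std::regex instead of QString and QRegExp, and the pattern is a simplified variant written for this example rather than the regexp from the patch. The anchor markup it emits is likewise an assumption about the intended output.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical stand-in for KStringHandler::tagURLs(), written against
// std::string and std::regex rather than QString and QRegExp.
std::string tagUrls(const std::string& text)
{
    // Simplified pattern (an assumption, not the regexp from the patch):
    // a "www.", "ftp." or "scheme://" prefix, a host/path run, an optional
    // ":port", an optional "/", and an optional tail. The '-' is placed
    // first in the last class so that it is a literal, not a range.
    static const std::regex urlEx(
        R"((?:www\.|ftp\.|\w+://)[\w./]+(?::\d+)?/?[-%~/.?&=#:_\w]*)");

    std::string out;
    std::size_t last = 0;  // end of the previous match in `text`
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), urlEx); it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position());
        const std::string href = m.str();

        // Add a protocol for bare "www." and "ftp." hosts, as in the patch.
        std::string target = href;
        if (href.compare(0, 4, "www.") == 0) {
            target = "http://" + href;
        } else if (href.compare(0, 4, "ftp.") == 0) {
            target = "ftp://" + href;
        }

        // Copy the text before the match, then the anchor built by plain
        // concatenation (no printf-style formatting, so "%20" is safe).
        out.append(text, last, pos - last);
        out += "<a href=\"" + target + "\">" + href + "</a>";
        last = pos + static_cast<std::size_t>(m.length());
    }
    out.append(text, last, std::string::npos);
    return out;
}
```

Building the output in a separate string avoids having to advance the search position past the inserted anchor, which is what the urlPos += anchor.length() line does in the in-place version.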