List:       kfm-devel
Subject:    RE: A more flexible solution for internet keywords
From:       Andreas Hochsteger <e9625392 () student ! tuwien ! ac ! at>
Date:       2001-06-29 15:36:33

Hi Yves!

On Thu, 28 Jun 2001, Yves Arrouye wrote:

[...]

> >
> > Here's an example:
> > I enter "keyword1 keyword2" into the address bar, which gets substituted
> > by the following link:
> > http://navigation.realnames.com/resolver.dll?action=navigation
> > &realname=keyword1%20keyword2&charset=iso-8859-1&providerid=18
> > 0&fallbackuri=http%3A//www.google.com/search%3Fq%3Dkeyword1%20keyword2
> > Can you have a look at this URL to see if there's a mistake somewhere?
> > Try to put it into your konqueror. Does it bring up a download dialog
> > for you too?
> > If so, and if the URL is correct, perhaps there is another mistake
> > somewhere in konqueror.
>
> The URL doesn't look correct. First, I would expect to see
>
> 	charset=utf-8

In the original code, iso-8859-1 is always the default for both charset
and responsecharset. Does that mean this has to be changed? I always get
iso-8859-1 for my queries.

>
> since Keywords queries should always be sent in UTF-8 (nothing prevents you
> from entering a Chinese Keyword, even if your fallback of Google doesn't
> understand that). And then,
>
> 	responsecharset=iso-8859-1
>
> for Google. Finally, the \1 in the fallbackuri should *NOT* be touched by
> you but passed as is to the Keywords servers. I guess it does not matter
> much if you encode it in the proper charset, but I'd rather keep it the way
> it was, where the Keywords router re-encodes it (this way, such re-encoding
> is always consistent).

I see.
I really didn't know that the realnames servers use \1 to replace the query.
That's hard to support with the new syntax, since \1 is no longer used the
way it was before. It will be recognized as the old format and handled in
compatibility mode (\1 -> \{0}).
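
The compatibility step could be sketched roughly as follows. This is only an
illustration, not the code from the patch: the function name
upgradeLegacyQuery is hypothetical, it uses std::string instead of QString
so it stands alone, and it handles only the \1 -> \{0} mapping mentioned
above (other legacy references are left untouched).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: rewrite the legacy "\1" reference to the new "\{0}"
// form. Only the single mapping named in this mail is handled; everything
// else is copied through unchanged.
std::string upgradeLegacyQuery(const std::string& definition) {
    std::string out;
    for (std::size_t i = 0; i < definition.size(); ++i) {
        const bool isLegacyRef =
            definition[i] == '\\' &&
            i + 1 < definition.size() && definition[i + 1] == '1' &&
            // Do not treat "\10", "\11", ... as "\1" followed by a digit.
            !(i + 2 < definition.size() &&
              std::isdigit(static_cast<unsigned char>(definition[i + 2])));
        if (isLegacyRef) {
            out += "\\{0}";
            ++i;  // skip the '1' as well
        } else {
            out += definition[i];
        }
    }
    return out;
}
```

A definition that already uses \{0} passes through unchanged, so the step
can be applied to old and new definitions alike.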

The only solution I see for this problem is the following:
Don't use \1 in the realnames url anymore, but generate the whole fallback
uri on the client side.
The user query will then appear twice in the realnames uri:
once in the realname= term and once in fallbackuri=, as the argument to the
chosen fallback search engine, without a \1 in the resulting uri.
I think it doesn't matter if we form the fallback uri on the client side
instead of letting the realnames server do it. Another advantage is that if
someone chooses a fallback search engine with a more sophisticated query
definition, it can be used with realnames too and benefit from the new
query possibilities.
Is this right, or am I missing something?

The only remaining problem is that konqueror offers the resolver.dll file
for download instead of interpreting it correctly as a redirection.
But I've got a hint about how this could be solved:
It seems that the fallback uri has to be sent to realnames encoded twice.
That means that a space in the query for google should look like %20 in the
uri for google and like %2520 (% -> %25) in the uri for realnames. A quick
test confirmed this guess, but I want to look at it in more detail.
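
To illustrate the double encoding, here is a minimal sketch. It is not the
konqueror code: percentEncode and encodeFallbackForRealNames are hypothetical
names, std::string stands in for QString, and the encoder escapes every byte
outside the RFC 3986 unreserved set (so it also escapes '/', which the URL
quoted above leaves as is).

```cpp
#include <string>

// Percent-encode every byte except the RFC 3986 unreserved characters
// (ASCII letters, digits, '-', '.', '_', '~').
std::string percentEncode(const std::string& in) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build the value of fallbackuri= on the client side.
// The query is encoded once for google, then the whole google uri is
// encoded again for realnames, which turns "%20" into "%2520".
std::string encodeFallbackForRealNames(const std::string& query) {
    const std::string googleUri =
        "http://www.google.com/search?q=" + percentEncode(query);
    return percentEncode(googleUri);
}
```

For the query "keyword1 keyword2" the google uri contains keyword1%20keyword2,
and the value handed to realnames contains keyword1%2520keyword2.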

[...]

> > What do you think of the following names:
> > ikw ... Internet KeyWord
> > wsc ... Web ShortCut
>
> Sounds good to me. IKW and WSC.
>
> > \ikw_fallbackuri
> > \wsc_charset
> > \wsc_responsecharset
> >
> > But doesn't resolver.dll from realnames.com have to do with
> > the internet
> > shortcuts? That's where the charset and the responsecharset
> > are used...
>
> Well, that's where one sees they're tightly integrated :) What happens is

Perhaps we should do a minor redesign and offer a function which only
handles substitution (pass in the query definition and a substitution map,
get back the substituted url):
	QString substituteURI(QString query, SubstMap map);
That way we can separate the substitution algorithm from the other logic. A
clearer distinction between internet keywords and web shortcuts would be
nice. What do you think?
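
A rough sketch of what such a function could look like, under some
assumptions: it uses std::string and std::map instead of QString so that it
stands alone, SubstMap is assumed to map reference names to replacement
text, and only the plain \{name} form is handled (none of the enhancements
from the latest patch).

```cpp
#include <map>
#include <string>

// Assumed shape of the substitution map: reference name -> replacement.
using SubstMap = std::map<std::string, std::string>;

// Replace every "\{name}" in the query definition with map[name].
// References whose name is not in the map are copied through unchanged.
std::string substituteURI(const std::string& query, const SubstMap& map) {
    std::string out;
    std::size_t i = 0;
    while (i < query.size()) {
        if (query.compare(i, 2, "\\{") == 0) {
            const std::size_t close = query.find('}', i + 2);
            if (close != std::string::npos) {
                const auto it = map.find(query.substr(i + 2, close - i - 2));
                if (it != map.end()) {
                    out += it->second;
                    i = close + 1;  // continue after the closing brace
                    continue;
                }
            }
        }
        out += query[i++];
    }
    return out;
}
```

With a map of {"0" -> "kde", "wsc_charset" -> "iso-8859-1"}, the made-up
definition http://example.org/search?q=\{0}&cs=\{wsc_charset} becomes
http://example.org/search?q=kde&cs=iso-8859-1. The function knows nothing
about keywords or shortcuts; the caller decides what goes into the map.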

> the following: the charset and responsecharset are part of the RealNames
> Keywords public interface. If you get a listing (these ugly listings nobody
> likes), then it will be produced in the value of responsecharset. Now, when
> there is a fallback URI, the meaning of responsecharset is that this is the
> charset that the fallback URI expects for its \1 (or \0) parameter. This
> makes it possible to send the query encoded in UTF-8 to the Keywords machine,
> but to re-encode the query for what the fallback URI expects if there is no
> Keyword.
> So for example:
>
> 	You type something in Konqueror, and Google (charset Latin1) is your
> fall back.
> 	The query goes in UTF-8 to the RealNames servers, so if you type a
> Chinese Keyword it works.
> 	If there is no such Keyword, the query is reencoded in Latin1 before
> being handed off to Google.
>
> If the reencoding didn't happen, and you typed my keyword, Ladédé, with some
> extra chars, as in "Que fait Ladédé cet été?" then Google will get a UTF-8
> query, thinking it's Latin1, and will get all confused.
>
> So I would just use \wsc_charset for a name, and the fallback URI would have
> responsecharset=\wsc_charset in it if you want, instead of the \3 it has
> now.

I don't see the difference between charset and responsecharset here.
Don't you mean charset=\{wsc_charset} and
responsecharset=\{wsc_responsecharset}?
Note that the latest patch implements the other syntax you suggested, with
some enhancements (see the mail from Sunday, 24th of June for details).
That patch used different names (uri_charset, uri_responsecharset), since
the naming was not yet clear to me.

>
> > I'd volunteer to do so, but I'd need some assistance to do it right.
> > Could you provide me an overview about how all this works together?
>
> Is the above detailed enough?

Yes, it cleared up some issues that I hadn't understood correctly.
I'll do some more polishing until it's ready to be committed to cvs
(hopefully before 2.2 ;-).

>
> YA
>

Thanks,

	Andreas