[prev in list] [next in list] [prev in thread] [next in thread] 

List:       apache-httpd-dev
Subject:    Re: svn commit: r1908301 - in /httpd/httpd/trunk: changes-entries/rewrite-bctls docs/manual/rewrite/
From:       Ruediger Pluem <rpluem () apache ! org>
Date:       2023-03-13 14:44:25
Message-ID: e4ad0dbf-4ef5-d3f9-9d51-f226996cf613 () apache ! org
[Download RAW message or body]



On 3/13/23 2:57 PM, Yann Ylavic wrote:
> On Mon, Mar 13, 2023 at 2:32 PM Eric Covener <covener@gmail.com> wrote:
> > 
> > On Mon, Mar 13, 2023 at 8:31 AM Yann Ylavic <ylavic.dev@gmail.com> wrote:
> > > 
> > > Maybe we could make [B=] and [BTCLS] not mutually exclusive (encode
> > > both controls and whatever in B=).
> > > I was thinking of a [BNEG] flag too (encode everything but what's in
> > > B=), and never encode alnum or '_', so all in all some further patch
> > > like the attached one. WDYT?
> > 
> > Looks good to me, I have an ancient patch where I did something very
> > similar to a copy of int:escape where you could set exceptions in
> > subprocess_env. The config is ugly so I never upstreamed it.
> > 
> > if you make the change I can add some tests/doc. Should caution
> > against plain [B] and refer to the others in the doc?
> 
> r1902323, thanks for the tests/doc.
> IIUC, when the query-string is rewritten, I think we should caution
> against using [B] with a redirect (double encoding) and not using
> [BCTLS] (or some careful flavor of [B]) for a non-redirect..
> 

I think the biggest issue with [B] is that when you capture data from the path which \
is decoded and try to use this data in the new URL in path and query string at the \
same time. For putting it in the path B does the wrong thing, for putting it in the \
query string it does the correct thing. If you do not use B then the substitution in \
the path will be correct, but not in the query string. Apart from the additional B \
variants now available I guess a further internal rewritemap can help that unlike \
'int:escape' would consider using '+' instead of %20 to encode spaces. Of course if \
you capture data from the query string you have to fight with the other way round and \
need to ensure that you decode the data before putting it in the path as otherwise \
you probably have an issue with double encoding.

BTW: I guess if time is found we should reorganize our escaping / unescaping code. It \
seems to be widespread across the codebase to handle various special cases / edge \
cases and it seems to duplicate code in a lot of locations. I think it would be \
better if it could be concentrated in server/util.c and if needed add some helper \
functions that can be reused if we cannot move all special cases / edge cases there \
such that their handling uses a more common base. Another thought that I had from the \
performance point of view is if we could create an API that would allow to build \
tables similar to the test_char_table on the "fly" e.g. during init and let functions \
use these tables for effective character classification. I think this would be more \
effective for longer lists of characters then using apr_isalnum or strchr in loops \
again and again.

Regards

Rüdiger


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic