
List:       php-internals
Subject:    Re: [PHP-DEV] Disabling External Entities in libxml By Default
From:       Rowan Collins <rowan.collins () gmail ! com>
Date:       2015-07-30 21:53:02
Message-ID: 0C5C85D4-1B7F-4533-9DB0-6FBC42AA048D () gmail ! com

On 30 July 2015 21:35:01 BST, Rob Richards <rrichards@cdatazone.org> wrote:
> On 7/30/15 10:30 AM, Rowan Collins wrote:
> > Rob Richards wrote on 30/07/2015 14:12:
> > > If you are already working with a trusted document then you should
> > > safely be able to disable the entity loader. If you aren't then
> > > wouldn't you want to do some sort of checking (especially if you
> > > don't have an XML gateway fronting the system) for other malicious
> > > things before even opening the document, regardless of whether it
> > > has external entities or not?
> > 
> > Can you give any pointers to what kind of checking this would be, and
> > how it would be carried out without parsing the XML document in the
> > first place?
> > 
> > According to the bug report, one of the affected uses is the
> > SoapClient, which by definition is dealing with remote data. I can
> > see how that could be considered "untrusted", but I can't think of
> > any particular action that would make it more trusted (quite apart
> > from the lack of an obvious point to intercept the data before it is
> > parsed).
> > 
> > Would it not make more sense for the parser to operate in an
> > "untrusted" mode - disabling external entities, maybe different
> > limits on stack depth, etc?
> > 
> > Regards,
> 
> It all depends upon what you are trying to accomplish, as this covers
> tree, streaming, different types of schemas, xsl, etc...
> For example, you can easily check if there is a DTD, imports/includes,
> specific xslt functionality (the list goes on and on) without ever
> having to load the document. There really is no one-size-fits-all imo,
> so what one considers untrusted someone else would consider trusted.

So effectively we should all write partial XML parsers to determine the
contents of the file, in order to decide if it's the data we expected?
Would it not make more sense to leave that to the XML library, with a
whitelist of features we actually need, URLs we trust for includes, etc?
I never want an XML file to execute system commands on my behalf; do I
have to write a regex to make sure they don't?
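
As a rough illustration of the "whitelist of features" idea, here is a
minimal sketch. It is written in Python against the standard expat
bindings rather than PHP/libxml, purely because it is short and
self-contained; it is not a proposal for the actual libxml API. The names
parse_untrusted, ForbiddenXMLFeature and the feature strings "dtd" and
"entity_decl" are hypothetical, invented for this example. The parser
itself rejects anything the caller did not explicitly allow, so no
pre-scanning of the document is needed.

```python
import xml.parsers.expat


class ForbiddenXMLFeature(ValueError):
    """Raised when the document uses a feature the caller did not allow."""


def parse_untrusted(data, allow=frozenset()):
    """Parse `data` (bytes) and return the element names in document order.

    `allow` is a whitelist of hypothetical feature names:
      "dtd"         - a DOCTYPE declaration may be present
      "entity_decl" - entity declarations inside the DTD are permitted
    External entity references are always refused in this sketch.
    """
    parser = xml.parsers.expat.ParserCreate()
    elements = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        if "dtd" not in allow:
            raise ForbiddenXMLFeature("DOCTYPE declaration not allowed")

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        if "entity_decl" not in allow:
            raise ForbiddenXMLFeature("entity declaration not allowed: " + name)

    def on_external_entity_ref(context, base, system_id, public_id):
        # Never fetch anything on the document's behalf.
        raise ForbiddenXMLFeature("external entity reference not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity_decl
    parser.ExternalEntityRefHandler = on_external_entity_ref
    parser.StartElementHandler = lambda name, attrs: elements.append(name)

    # Exceptions raised inside handlers propagate out of Parse().
    parser.Parse(data, True)
    return elements
```

A caller handling SOAP-style remote data would call
parse_untrusted(payload) with the default empty whitelist, and opt in to
"dtd" only for documents known to need it.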

Regards,
-- 
Rowan Collins
[IMSoP]


-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

