From debian-devel Thu Jul 31 14:51:47 2003
From: Josef Spillner
Date: Thu, 31 Jul 2003 14:51:47 +0000
To: debian-devel
Subject: Re: Bug#203498: ITP: decss -- utility for stripping CSS tags from an HTML page.
X-MARC-Message: https://marc.info/?l=debian-devel&m=105966417824793

On Thursday 31 July 2003 11:27, Sam Hocevar wrote:
> And HTML makes it even harder since very few pages are valid, but
> that DeCSS utility uses only regexes anyway.

Technically, using regexps for CSS will not only become a maintenance
hell, but would also limit the usability of such a script, e.g. for
network transparency.

If at all, the way to go would be to use a decent HTML parser library
(khtml and gecko come to mind; even Python's htmlparser is not mature
enough yet), which gives not only the (internal or external) stylesheet
but all components of the DOM and whatnot, then use scripting facilities
to modify this object and dump the resulting modified object to e.g.
stdout.

'HTML' and 'lightweight' will hardly fit together.

Josef

-- 
Play for fun, win for freedom.
Hurd^H^H^H^HLinux-Info-Tag Dresden 2003: http://www.linux-dresden.de
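
The parser-based approach described in this message could look roughly like the following minimal sketch. It assumes a present-day Python, whose standard-library `html.parser` is an event-based tokenizer rather than the full DOM that khtml or gecko would provide (and which the message considered too immature in 2003), so the markup is filtered and re-emitted as it streams past instead of being modified as an object. The names `CSSStripper` and `strip_css` are hypothetical, and dropping `<style>` elements, stylesheet `<link>`s and `style` attributes is only one reading of what "stripping CSS" means here.

```python
from html import escape
from html.parser import HTMLParser


class CSSStripper(HTMLParser):
    """Hypothetical filter: re-emit HTML without <style> elements,
    stylesheet <link> tags, or inline style attributes."""

    def __init__(self):
        # convert_charrefs=False makes the parser report entities such as
        # &amp; separately, so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = False

    @staticmethod
    def _is_stylesheet_link(tag, attrs):
        rel = dict(attrs).get("rel") or ""
        return tag == "link" and "stylesheet" in rel.lower().split()

    def _emit_tag(self, tag, attrs, closing=">"):
        # Attribute values arrive unescaped, so escape them again on output.
        parts = [tag]
        for name, value in attrs:
            if name == "style":
                continue  # drop inline CSS
            parts.append(name if value is None
                         else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True  # swallow everything up to </style>
        elif not self._is_stylesheet_link(tag, attrs):
            self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        if tag != "style" and not self._is_stylesheet_link(tag, attrs):
            self._emit_tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.in_style:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.in_style:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_css(html_text):
    """Return html_text with CSS-carrying markup removed (sketch only)."""
    parser = CSSStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Reading the page from stdin and writing the result of `strip_css` to stdout would give the filter behaviour mentioned above. Malformed pages are passed through as the tokenizer sees them; no attempt is made to repair them.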