List:       debian-devel
Subject:    Re: Bug#203498: ITP: decss -- utility for stripping CSS tags from an HTML page.
From:       Josef Spillner <josef () ggzgamingzone ! org>
Date:       2003-07-31 14:51:47

On Thursday 31 July 2003 11:27, Sam Hocevar wrote:
>    And HTML makes it even harder since very few pages are valid, but
> that DeCSS utility uses only regexes anyway.

Technically, using regexps for CSS will not only become a maintenance hell, but 
would also limit the usability of such a script, e.g. for network 
transparency.
If anything, the way to go would be to use a decent HTML parser library (khtml 
and gecko come to mind; even Python's htmlparser is not mature enough yet), which 
gives you not only the (internal, external) stylesheets but all components of the 
DOM and whatnot. You would then use scripting facilities to modify this object 
and dump the resulting modified object to e.g. stdout.
'HTML' and 'lightweight' will hardly fit together.
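
To make the parse, modify, dump idea concrete, here is a minimal sketch. It
rests on assumptions that are not from this thread: it uses the html.parser
module from today's Python standard library (a tokenizer, not a DOM builder
like khtml or gecko), and the names CSSStripper and strip_css are hypothetical.
It drops <style> elements, <link rel="stylesheet"> tags and style attributes,
and re-emits everything else. It does not repair invalid markup.

```python
from html import escape
from html.parser import HTMLParser


class CSSStripper(HTMLParser):
    """Hypothetical helper: re-emit HTML while dropping CSS-related parts."""

    def __init__(self):
        # Keep character/entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.in_style = 0  # > 0 while inside a <style> element

    @staticmethod
    def _is_stylesheet_link(tag, attrs):
        if tag != "link":
            return False
        rel = dict(attrs).get("rel") or ""
        return "stylesheet" in rel.lower().split()

    def _emit_tag(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if name == "style":  # drop inline CSS
                continue
            if value is None:
                parts.append(name)
            else:
                # The parser unescapes attribute values, so escape them again.
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style += 1
            return
        if self.in_style or self._is_stylesheet_link(tag, attrs):
            return
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        if self.in_style or tag == "style" or self._is_stylesheet_link(tag, attrs):
            return
        self._emit_tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = max(0, self.in_style - 1)
            return
        if not self.in_style:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:  # the CSS text inside <style> is discarded
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.in_style:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.in_style:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.in_style:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_css(html_text):
    """Return html_text with stylesheets and style attributes removed."""
    parser = CSSStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Writing the result of strip_css(sys.stdin.read()) to sys.stdout would give the
"dump to stdout" step. A real DOM library would additionally expose the
stylesheets themselves, which this sketch simply throws away.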

Josef

-- 
Play for fun, win for freedom.
Hurd^H^H^H^HLinux-Info-Tag Dresden 2003: http://www.linux-dresden.de

