List:       kfm-devel
Subject:    Re: Regexp stuff is KJS.
From:       Michael Bedy <mjbedy () mediaone ! net>
Date:       2001-01-31 1:35:03

On Tue, 30 Jan 2001, Harri Porten wrote:

> Michael Bedy wrote:
> > 
> >   Well, I have looked over the regexp stuff, and there are a few ways to
> > proceed, as I see it..
> > 
> >    1) Write a complete regexp package.
> [...]
> >    2) Rip the regexp stuff out of glibc and hack away.
> 
> I always was under the impression that we just have to map a few
> functions (e.g. for retrieving captures) and that's all. What exactly is
> missing ? It would really be a dumb move by the authors of the spec if
> they require functionality not being provided by regular system libs.
> 

  The spec says it's "modelled after the regular expression facility in
the perl 5 programming language."

> >    3) Write a "preprocessor" that converts a Javascript regexp into an
> >       POSIX one. Then use the POSIX stuff as it does now.
> 
> That's the way *I* intended to go. Under the assumption that the
> differences would be rather minor, of course. Even if - let's say - 5%
> of the features can't be done that way I would simply skip them unless
> they are proven to be used in real world web pages.
> 

   Oh, I don't WANT to write a regex package. I've got a good idea how
much work that would be.

   The POSIX spec (at least as presented by the documents I have found on
the web) has several significant differences from JS. As an example, JS
allows things like "\w", which matches any "word" character. The closest
POSIX equivalent is [[:alnum:]_], since "\w" also matches the underscore
and [:alnum:] on its own does not.
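
   To give an idea of what the "preprocessor" in option 3 might look
like, here is a rough sketch in Python (just for brevity, KJS itself is
C++). The function name and the mapping table are made up for
illustration, and it only covers \w, \d and \s. Negated classes like \W,
a "]" at the start of a bracket expression, and everything else are
passed through untouched, so this is nowhere near a complete translator.

```python
# Hypothetical helper: rewrite Perl/JS-style class escapes as POSIX
# bracket expressions. The table below is an illustrative assumption,
# not a complete mapping.
POSIX_CLASSES = {
    "w": "[:alnum:]_",  # \w also matches underscore, unlike [:alnum:] alone
    "d": "[:digit:]",
    "s": "[:space:]",
}


def js_to_posix(pattern):
    """Return `pattern` with \\w, \\d and \\s replaced by POSIX classes."""
    out = []
    in_bracket = False  # True while inside a [...] bracket expression
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in POSIX_CLASSES:
                body = POSIX_CLASSES[nxt]
                # Inside [...] the class name goes in bare; outside it
                # needs its own pair of brackets.
                out.append(body if in_bracket else "[" + body + "]")
            else:
                out.append(ch + nxt)  # other escapes pass through as-is
            i += 2
            continue
        if ch == "[" and not in_bracket:
            in_bracket = True
        elif ch == "]" and in_bracket:
            in_bracket = False
        out.append(ch)
        i += 1
    return "".join(out)
```

   With this, js_to_posix(r"\w+") gives "[[:alnum:]_]+" and
js_to_posix(r"[\d.]+") gives "[[:digit:].]+".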

   So far, I have determined that at least one (evil) feature of JS just
can't be done with POSIX (at least, not at the same time as tons of more
interesting stuff): backreferences.
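
   For anyone not familiar with them, a backreference makes the pattern
match whatever text an earlier group captured. A small illustration,
using Python's re module only because it follows the same Perl-style
syntax (the function name is made up for the example):

```python
import re

# (\w+) captures a word; \1 must then match that exact same text again.
DOUBLED = re.compile(r"\b(\w+) \1\b")


def find_doubled_word(text):
    """Return the first word that appears twice in a row, or None."""
    m = DOUBLED.search(text)
    return m.group(1) if m else None
```

   So find_doubled_word("this is is a test") returns "is".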

   One additional option: GLIBC has a separate interface to its regular
expression stuff, and it may provide almost all of the constructs
required. Pros: easy. Cons: only works on GLIBC systems.

   Or, another (heavy) option is to link to libperl and use it, since JS
regex is taken pretty much word for word from perl.

     - Mike
