
List:       ast-developers
Subject:    [ast-developers] Re: Shell diagnostics for faulty pattern in s=${x//~(E)pattern/x} or [[ $x ==~(E)pa
From:       Glenn Fowler <gsf () research ! att ! com>
Date:       2012-06-24 8:26:15
Message-ID: 201206240826.q5O8QFtU004836 () terra ! research ! att ! com


On Sun, 24 Jun 2012 08:37:46 +0200 Roland Mainz wrote:
> On Sat, Jun 23, 2012 at 5:55 AM, Glenn Fowler <gsf@research.att.com> wrote:
> [snip]
> > if the pattern did have a syntax error regcomp() would report it to the caller
> > so reporting an error or not is not a regex issue

> I know... see below...

> > so it doesn't make sense to add something to ~(...) to check syntax
> > because regcomp() already does it by default (modulo the ast REG_LENIENT flag, which
> > is settable in ~(...))

> ... the point was that it would be very very useful if ksh93 would
> generate diagnostic messages like $ grep --strict ... # does for
> (half-)broken patterns in s=${x//~(E)pattern/x} or [[ $x == ~(E)pattern
> ]].
> One idea was to use an unused letter or symbol in ~(<modifier>) and use
> it from the shell side. Or have a global option (set -o patterndiag)
> or global variable...

> IMHO we need this kind of diagnostic message... as the case with
> the XML fragment parser showed, it can be a horrible pain to find bugs
> there and any help in form of error messages returned from the regex
> engine would've saved us *DAYS* of digging around.

it would be the shell's responsibility to check ~(...) for "don't ignore regcomp() errors"
the current strmatch()/strgrpmatch() apis don't report regcomp() errors
because it was never the shell's job to do that
strmatch() calls strgrpmatch() which calls regcache() which does report regcomp() errors
so it's not a simple coding change
and I think it would only make sense to check ~(...) at the front of the pattern
i.e., switching it on and off inside the pattern would require regex changes too

I would suggest a different approach:

define a new type Pattern_t with an assignment discipline function that does

	if	(( debug_patterns ))
	then	sgrep --strict "$pattern" < /dev/null	# after sgrep patch applied!
		(( $? > 1 )) && exit 1	# status 1 on empty input just means "no match"
	fi

then use it like

	Pattern_t really_complex_pattern='...'

	dummy=${foo//${really_complex_pattern}/dummy}

the sgrep overhead would be negligible if it were a builtin (even from -lcmdtst)
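
a minimal sketch of the same debug-time check in portable shell, assuming a stock
POSIX "grep -E" stands in for the patched "sgrep --strict" (the patch is not shown
in this thread, so its exact behaviour is not assumed here). check_pattern is a
hypothetical helper name, and the pattern is only an illustration. grep run on
empty input exits with 1 for "no match", so only a status greater than 1 is
treated as a pattern error:

```shell
# Hypothetical helper: succeeds if grep can compile the ERE, fails otherwise.
check_pattern() {
	# Empty input can never match, so a valid pattern yields status 1.
	# POSIX reserves statuses greater than 1 for errors such as a bad pattern.
	grep -E -e "$1" < /dev/null > /dev/null
	[ $? -le 1 ]
}

debug_patterns=1
really_complex_pattern='<([a-z]+)>'	# illustrative pattern only

# Validate once, up front, instead of inside every ${var//pattern/...} use.
if [ "$debug_patterns" -ne 0 ] && ! check_pattern "$really_complex_pattern"
then
	echo "bad pattern: $really_complex_pattern" >&2
	exit 1
fi
```

in ksh93 the call to check_pattern would move into the assignment discipline of
the pattern variable, so every assignment is validated while debug_patterns is set.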
