List: busybox
Subject: Re: [BusyBox] Re: [patch] Simplify the heck out of the sed newline
From: Rob Landley <rob () landley ! net>
Date: 2003-09-26 3:48:18
On Thursday 25 September 2003 20:34, Glenn McGrath wrote:
> On Thu, 25 Sep 2003 15:26:59 -0500
>
> Rob Landley <rob@landley.net> wrote:
> > Here's a test program to show that glibc regex is matching patterns
> > containing newlines just fine. (I haven't tested uclibc regex, but if
> > it doesn't work and glibc does, that's a bug and I'll fix it.)
> >
> > #include <stdio.h>
> > #include <regex.h>
> >
> > int main(int argc, char *argv[])
> > {
> >   regex_t regex;
> >   regmatch_t match;
> >   // char grepstr[]="fred[123]ish";
> >   // char *string="abcfred2ishdef";
> >   char *grepstr="oom\nping";
> >   char *string="thingoom\npingb";
> >
> >   printf("%d\n",regcomp(&regex, grepstr, REG_NEWLINE));
> >   printf("%d\n",regexec(&regex,string,1,&match,0));
> >   printf("%d %d\n",match.rm_so,match.rm_eo);
> > }
>
> Interesting, it's definitely a bug in sed.c then, and the current newline
> hack is the wrong approach.
>
>
> Glenn
I'm banging on it.
I started reading the spec again from the beginning, which resulted in me
writing up a list of tests, which resulted in me writing sedtests.py, which
is a little python script that runs sed tests. Here's what I have so far...
I've ripped out the newline hack entirely in my tree, and I've got a test for
newline behavior. Right now I'm getting test cases from the spec, and later
(since I have to re-read a lot of the code anyway) I hope to get test cases
from the code. THEN, I intend to get test cases from the binutils build, and
from previous emails you've sent me.
I just changed a lot of code (simplifying the heck out of the big sedding loop),
and I'll just about guarantee you I broke something. Time for a regression
test harness, then. :)
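
For anyone reading along without running the attachment: each entry in the attached tests list is a tuple of (stdin data, sed options, expected stdout, one or more sed scripts). Here is a minimal sketch of how one entry turns into a command line. The helper name build_command is made up for illustration and is not in sedtests.py; the format string is the one the script uses.

```python
def build_command(test):
    """Return the sed command line for one test tuple.

    A tuple is (stdin data, options, expected output, script, ...).
    build_command is a hypothetical helper name, used only in this sketch.
    """
    options = test[1]
    scripts = test[3:]
    # Every script becomes its own single-quoted -e argument.
    return "sed %s -e '%s'" % (options, "' -e '".join(scripts))


print(build_command(("a\nb\nc\n", "-n", "b\n", "2p")))
# prints: sed -n -e '2p'
```

Because each script is wrapped in single quotes for the shell, a test script that itself contains a single quote would need extra escaping. None of the current tests do.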
Rob
["sedtests.py" (text/x-python)]
#!/usr/bin/python
verbose=0
tests=(
# Testing address ranges
# Test one numeric address.
("a\nb\nc\n","-n", "b\n", "2p"),
# Test one regexp address (with two matches).
("a\nb\nc\nb\nd\n", "-n", "2\n4\n", "/b/="),
# Test $ address.
("a\nb\nc\n", "-n", "c\n", "$p"),
# Test numeric address range
("a\nb\nc\nd\ne\n", "-n", "a\nb\nc\n", "1,3p"),
# Test regexp pair address range
("a\nb\nc\nd\ne\n", "-n", "b\nc\nd\n", "/b/,/d/p"),
# Test regexp pair address range with two matches
("a\nb\nc\nd\nb\ne\nd\nf", "-n", "b\nc\nd\nb\ne\nd\n", "/b/,/d/p"),
# Test regexp pair whose second regexp only matches an earlier line (range runs to end of input)
("a\nb\nc\nd\ne\n", "-n", "b\nc\nd\ne\n", "/b/,/a/p"),
# Test regexp with numeric
("a\nb\nc\nd\ne\n", "-n", "b\nc\nd\n", "/b/,4p"),
# Test regexp with lower numeric
("a\nb\nc\nd\ne\n", "-n", "b\n", "/b/,1p"),
# Test reversed regexp pair (same as no second match)
("a\nb\nc\nd\ne\n", "-n", "c\nd\ne\n", "/c/,/b/p"),
# Test regexp and $
("a\nb\nc\n", "-n", "b\nc\n", "/b/,$p"),
# Test $ and regexp
("a\nb\nc\n", "-n", "c\n", "$,/b/p"),
# Test number and $
("a\nb\nc\n", "-n", "a\nb\nc\n", "1,$p"),
# Test $ and number
("a\nb\nc\n", "-n", "c\n", "$,2p"),
# Regular Expressions
# Regular expressions can show up in addr1, addr2, and subst.
# Test backslash-escape delimited regexp range
("a\nb\nc\nd\ne\n", "-n", "b\nc\nd\n", "\\,b,,\\ d p"),
# Test backslash newline in substitution regexp.
("a\nb\nc\nd\n", "-n", "bfooc\n", "2{N;s/\\n/foo/p;}"),
# Test backslash newline in substitution target
("a\nb\nc\n", "", "a\n123\n456\nc\n", "s/b/123\\n456/"),
# Gnuisms
# Test if the last line of input has no newline that the last line
# of output has no newline. (Violates the spec: "Whenever the pattern
# space is written to standard output or a named file, sed shall
# immediately follow it with a <newline>.")
("a\nb", "-n", "b", "2p")
# Notes:
# Edge case: you can't use an embedded newline as a backslash-escape
# delimiter. I.E. ("a\nb\nc\n", "-n", "b\n", "\\\\nb\\np") fails.
# (Because scan of pattern space parses \\ as escaped backslash.)
# No human being should ever care. Don't do that then.
)
import os, sys

def shell(command, stdin="", discardstderr=0):
    """Shell out, feed stdin to the command, and capture its stdout."""
    io=os.popen3(command)
    if len(stdin): io[0].write(stdin)
    io[0].close()
    retval=io[1].read()
    io[1].close()
    err=io[2].read()
    # Pass the command's stderr through unless asked to discard it.
    if not discardstderr: sys.stderr.write(err)
    return retval

count=0
for i in tests:
    count=count+1
    # i = (stdin data, sed options, expected stdout, script, ...)
    command="sed %s -e '%s'" % (i[1],"' -e '".join(i[3:]))
    result=shell(command,i[0])
    fail=(result!=i[2])
    sys.stdout.write("test %s: " % count)
    if verbose or fail:
        sys.stdout.write("\ndata:\n%s\ncommand: %s\n" % (i[0],command))
        sys.stdout.write("expected:\n%s" % i[2])
        sys.stdout.write("result:\n%s" % result)
    if fail: print "Fail"
    else: print "Pass"
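
The script above is Python 2 (os.popen3 and the print statement). For readers on Python 3, where os.popen3 no longer exists, here is a rough sketch of the same loop using subprocess.run. It is not part of the original attachment. The names run_filter and run_tests are made up for this sketch, and it passes an argument list instead of a shell string, which assumes the options field never contains quoted whitespace.

```python
import subprocess
import sys


def run_filter(argv, stdin_text=""):
    """Run argv, feed it stdin_text, and return what it wrote to stdout."""
    proc = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    sys.stderr.write(proc.stderr)  # pass the child's diagnostics through
    return proc.stdout


def run_tests(tests, program=("sed",), verbose=False):
    """Run every (stdin, options, expected, script...) tuple.

    Returns the number of failing tests. Both function names are
    hypothetical; only the tuple layout comes from sedtests.py.
    """
    failures = 0
    for number, (data, options, expected, *scripts) in enumerate(tests, 1):
        # Assumption: options can be split on whitespace (no shell quoting).
        argv = list(program) + options.split()
        for script in scripts:
            argv += ["-e", script]  # one -e per script, as in the original
        result = run_filter(argv, data)
        failed = result != expected
        if verbose or failed:
            print("test %d\ndata:\n%s\ncommand: %r" % (number, data, argv))
            print("expected:\n%sresult:\n%s" % (expected, result))
        print("test %d: %s" % (number, "Fail" if failed else "Pass"))
        if failed:
            failures += 1
    return failures
```

Calling run_tests(tests) with the tests tuple from the attachment should exercise whichever sed is first on the PATH. The program argument is there so that a different sed binary can be swapped in.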