
List:       lyx-devel
Subject:    Progress report on a "pseudo command line" parser
From:       Angus Leeming <leeming () lyx ! org>
Date:       2004-08-27 16:56:22
Message-ID: cgnp24$4am$1 () sea ! gmane ! org

I've been quiet recently because I've been slowly putting together a
"pseudo command line" parser that can be used to interpret the strings
that we use to define our converters from one format to another.

spawn_data const
parse_pseudo_command_line(std::string const & command_line,
                          std::string const & working_dir = std::string());

From the doxygen documentation:

/** @brief Parses a command line into an argument vector, in 
 *  much the same way as the shell.
 *
 *  @param command_line should conform to the shell-like syntax defined
 *  below.
 *
 *  @param working_dir is the current working directory that should be
 *  used when expanding any globs in @c command_line. If @c working_dir
 *  is an empty string, then the current working directory of the parent
 *  process is used.
 *
 *  @returns an appropriately-filled @c spawn_data variable if
 *  parsing is successful.
 *
 *  @throws invalid_command_line_syntax if parsing fails.
 *
 *  The syntax used is similar to that of a real shell and is defined
 *  formally below. Some, but not all, of the expansions that the
 *  shell would perform are supported. In particular, shell variables
 *  and globs are expanded. The major difference to a real shell is that
 *  this function requires the command line to be an invocation of
 *  a single process only. Command substitution and chaining of commands
 *  are not supported.
 *
 *  The parser handles quoted arguments:
 *
 *  @code
 *  "foo bar baz"  ->  [ "foo", "bar", "baz" ]
 *  "foo\ foo'bar bar' \"baz baz\""  ->  [ "foo foobar bar", "baz baz" ]
 *  @endcode
 *
 *  It will expand shell variables in the same way as a Unix shell.
 *  An undefined shell variable is expanded to an empty string.
 *  Variables are not expanded inside single-quoted blocks:
 *
 *  @code
 *  "$PATH '$PATH' \"$PATH\""  -> [ "/usr/bin", "$PATH", "/usr/bin" ]
 *  @endcode
 *
 *  It will expand globs in the same way as a Unix shell.
 *  I.e., globs are not expanded inside quoted blocks.
 *  A glob that does not match any file name is returned unchanged.
 *
 *  @code
 *  "*.pdf *.ps '*.ps' \"*.ps\"" ->
 *      [ "*.pdf", "foo.ps", "bar.ps", "*.ps", "*.ps" ]
 *  @endcode
 *
 *  It supports redirection of the standard streams:
 *
 *  @code
 *  "foo < in >> out 2>> err"
 *  "foo > out_and_err 2>&1"
 *  "foo 2> out_and_err 1>&2"
 *  @endcode
 *
 *  Following the example of GTK, the special file name "-" is used to
 *  indicate the null device (/dev/null on *nix, NUL: on Windows). Thus
 *  @code
 *  "foo 0<- 2>-"
 *  @endcode
 *  ensures that "foo" does not block awaiting input from the standard
 *  input stream, and that anything it writes to the standard error
 *  stream is redirected to the null device.
 *
 *  @b Grammar
 *
 [ ... snip grammar ... ]
 */

The only thing that it doesn't yet do that I'd like it to do is parse UTF-8
encoded strings. I append below the output from the regression tests to
show its abilities.

Angus

$ ../../../bin/boost/libs/child/test/parse_pseudo_command_line.test/gcc/debug/parse_pseudo_command_line

Input command line:  foo bar
Output argv: [ 'foo', 'bar' ]

Input command line:  foo\ bar baz
Output argv: [ 'foo bar', 'baz' ]

Input command line:  'foo bar' baz
Output argv: [ 'foo bar', 'baz' ]

Input command line:  'foo \'bar\'' baz
Output argv: [ 'foo 'bar'', 'baz' ]

Input command line:  "foo bar" baz
Output argv: [ 'foo bar', 'baz' ]

Input command line:  "foo \"bar\"" baz
Output argv: [ 'foo "bar"', 'baz' ]

Input command line:  "foo \$bar" baz
Output argv: [ 'foo $bar', 'baz' ]
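
The quoting cases above can be approximated with a very small splitter.
This is only a sketch, not the parser described in this post: the name
split_quoted is hypothetical, it knows nothing about variables, globs or
redirection, and it assumes that a backslash makes the next character
literal everywhere, including inside single quotes. That assumption is
consistent with the output above but differs from a POSIX shell.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits a command line into arguments, honouring
// backslash escapes and single/double quotes only.
std::vector<std::string> split_quoted(std::string const & line)
{
    std::vector<std::string> argv;
    std::string current;
    bool in_word = false;   // true once the current argument has started
    char quote = 0;         // 0 when unquoted, else the open quote character

    for (std::string::size_type i = 0; i < line.size(); ++i) {
        char const c = line[i];
        if (c == '\\') {
            // Assumption: a backslash escapes the next character anywhere.
            if (i + 1 == line.size())
                throw std::runtime_error("trailing backslash");
            current += line[++i];
            in_word = true;
        } else if (quote != 0) {
            if (c == quote)
                quote = 0;          // closing quote
            else
                current += c;       // literal text inside the quotes
        } else if (c == '\'' || c == '"') {
            quote = c;              // opening quote; "" yields an empty argument
            in_word = true;
        } else if (c == ' ' || c == '\t') {
            if (in_word) {          // unquoted blank ends the argument
                argv.push_back(current);
                current.clear();
                in_word = false;
            }
        } else {
            current += c;
            in_word = true;
        }
    }
    if (quote != 0)
        throw std::runtime_error("unterminated quote");
    if (in_word)
        argv.push_back(current);
    return argv;
}
```

Adjacent quoted and unquoted pieces are joined into one argument, which
is why foo\ foo'bar bar' comes out as the single argument "foo foobar bar".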

Input command line:  $HOME
Output argv: [ '/home/angus' ]

Input command line:  ${HOME}
Output argv: [ '/home/angus' ]

Input command line:  '$HOME'
Output argv: [ '$HOME' ]

Input command line:  "$HOME"
Output argv: [ '/home/angus' ]

Caught expected exception:

Error parsing:
${'HOME'}


Parsing failed at line 1, column 1.


$FOObar == Wow!
$BAR == bar

Input command line:  ${FOO${BAR}}
Output argv: [ 'Wow!' ]

Input command line:  "${FOO${BAR}}"
Output argv: [ 'Wow!' ]

Input command line:  '${FOO${BAR}}'
Output argv: [ '${FOO${BAR}}' ]
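
The nested case ${FOO${BAR}} is easiest to see as a recursion: the text
between the braces is itself expanded to obtain the variable name. The
following is a minimal sketch under several assumptions. The names
env_map and expand_variables are hypothetical, variables come from a map
rather than the real environment, and quoting is ignored entirely. Unlike
the real parser, it does not reject an ill-formed name such as ${'HOME'};
it simply looks it up and gets an empty string.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <stdexcept>
#include <string>

typedef std::map<std::string, std::string> env_map;

namespace {

bool is_name_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// An undefined variable expands to an empty string.
std::string lookup(env_map const & env, std::string const & name)
{
    env_map::const_iterator it = env.find(name);
    return it == env.end() ? std::string() : it->second;
}

// Expands text starting at pos. When inside_braces is true, stops at the
// matching '}' and leaves pos just past it.
std::string expand_from(std::string const & in, std::string::size_type & pos,
                        env_map const & env, bool inside_braces)
{
    std::string out;
    while (pos < in.size()) {
        char const c = in[pos];
        if (inside_braces && c == '}') {
            ++pos;
            return out;
        }
        if (c != '$') {
            out += c;
            ++pos;
            continue;
        }
        ++pos;  // skip '$'
        if (pos < in.size() && in[pos] == '{') {
            ++pos;  // skip '{'
            // The name itself may contain further references.
            std::string const name = expand_from(in, pos, env, true);
            out += lookup(env, name);
        } else {
            std::string name;
            while (pos < in.size() && is_name_char(in[pos]))
                name += in[pos++];
            if (name.empty())
                out += '$';     // a lone '$' is kept as-is
            else
                out += lookup(env, name);
        }
    }
    if (inside_braces)
        throw std::runtime_error("missing '}'");
    return out;
}

} // namespace

std::string expand_variables(std::string const & in, env_map const & env)
{
    std::string::size_type pos = 0;
    return expand_from(in, pos, env, false);
}
```

With FOObar set to "Wow!" and BAR set to "bar", the inner ${BAR} yields
"bar", the name becomes "FOObar", and the outer lookup yields "Wow!".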

Input command line:  ls *.cpp
Output argv: [ 'ls',
'/home/angus/boost/cvs/libs/child/test/make_command_line.cpp',
'/home/angus/boost/cvs/libs/child/test/parse_pseudo_command_line.cpp' ]

Input command line:  ls "*.cpp"
Output argv: [ 'ls', '*.cpp' ]

Input command line:  ls '*.cpp'
Output argv: [ 'ls', '*.cpp' ]

Input command line:  ls ../*/*.cpp
Output argv: [ 'ls',
'/home/angus/boost/cvs/libs/child/test/../src/make_command_line.cpp',
'/home/angus/boost/cvs/libs/child/test/../src/spawn_data.cpp',
'/home/angus/boost/cvs/libs/child/test/../src/parse_pseudo_command_line.cpp',
'/home/angus/boost/cvs/libs/child/test/../src/glob_expansion.cpp',
'/home/angus/boost/cvs/libs/child/test/../test/make_command_line.cpp',
'/home/angus/boost/cvs/libs/child/test/../test/parse_pseudo_command_line.cpp'
]
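
The rule that an unmatched glob is passed through unchanged can be
sketched without touching the file system. The code below is only an
illustration: wildcard_match and expand_glob are hypothetical names, the
candidate file names are supplied by the caller instead of being read
from a directory, only '*' and '?' are understood, and shell details such
as hidden files and path separators are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if name matches pattern, where '*' matches any run of
// characters (including none) and '?' matches exactly one character.
bool wildcard_match(std::string const & pattern, std::string const & name)
{
    std::string::size_type p = 0, n = 0;
    std::string::size_type star = std::string::npos;  // last '*' seen
    std::string::size_type mark = 0;                  // name position at that '*'

    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;         // remember the star, first try matching nothing
            mark = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == name[n])) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            p = star + 1;       // backtrack: let the star absorb one more char
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p;                    // trailing stars match the empty string
    return p == pattern.size();
}

// Returns the candidates that match pattern, in their original order.
// If nothing matches, the pattern itself is returned unchanged.
std::vector<std::string> expand_glob(std::string const & pattern,
                                     std::vector<std::string> const & candidates)
{
    std::vector<std::string> matches;
    for (std::vector<std::string>::size_type i = 0; i < candidates.size(); ++i)
        if (wildcard_match(pattern, candidates[i]))
            matches.push_back(candidates[i]);
    if (matches.empty())
        matches.push_back(pattern);
    return matches;
}
```

A quoted glob would simply never be handed to expand_glob, which is how
'*.cpp' and "*.cpp" survive as literal arguments in the output above.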

Caught expected exception:

Error parsing:
ls `rm -fr $HOME` *.cpp
boost::filesystem::directory_iterator constructor: "": Success

Parsing failed at line 1, column 1.

Input command line:  foo < in
Output argv: [ 'foo' ]
stdin from file "in"

Input command line:  foo 0< in
Output argv: [ 'foo' ]
stdin from file "in"

Input command line:  foo > out
Output argv: [ 'foo' ]
stdout to file "out"

Input command line:  foo 1> out
Output argv: [ 'foo' ]
stdout to file "out"

Input command line:  foo >> out
Output argv: [ 'foo' ]
stdout appended to file "out"

Input command line:  foo 2>err
Output argv: [ 'foo' ]
stderr to file "err"

Input command line:  foo 2>>err
Output argv: [ 'foo' ]
stderr appended to file "err"

Input command line:  foo 1>&2
Output argv: [ 'foo' ]
stdout to stderr

Input command line:  foo 2>&1
Output argv: [ 'foo' ]
stderr to stdout

Input command line:  foo <-
Output argv: [ 'foo' ]
stdin from null device

Input command line:  foo 1>-
Output argv: [ 'foo' ]
stdout to null device

Input command line:  foo 2>-
Output argv: [ 'foo' ]
stderr to null device

Input command line:  0< in foo 1> out 2>>err
Output argv: [ 'foo' ]
stdin from file "in"
stdout to file "out"
stderr appended to file "err"
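
For readers who want to see how one of these redirection specifications
breaks down, here is a rough sketch that parses a single specification
such as "2>>err", "1>&2" or "<-". It is not the grammar used by the
parser in this post. The redirection struct and parse_redirection are
hypothetical names, only the streams 0, 1 and 2 are accepted, and the
specification is assumed to have been isolated from the rest of the
command line already.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct redirection {
    enum kind_type { from_file, to_file, append_to_file, to_stream, null_device };
    int fd;             // 0 = stdin, 1 = stdout, 2 = stderr
    kind_type kind;
    std::string file;   // used by the three file kinds only
    int target_fd;      // used by to_stream only, otherwise -1
};

redirection parse_redirection(std::string const & spec)
{
    std::string::size_type i = 0;
    int fd = -1;
    if (i < spec.size() && spec[i] >= '0' && spec[i] <= '2')
        fd = spec[i++] - '0';                   // optional stream number
    if (i == spec.size() || (spec[i] != '<' && spec[i] != '>'))
        throw std::runtime_error("expected '<' or '>'");
    bool const input = spec[i++] == '<';
    if (fd == -1)
        fd = input ? 0 : 1;                     // defaults, as in the shell
    if (input != (fd == 0))
        throw std::runtime_error("direction does not suit this stream");

    redirection r;
    r.fd = fd;
    r.target_fd = -1;

    bool append = false;
    if (!input && i < spec.size() && spec[i] == '>') {
        append = true;                          // ">>"
        ++i;
    }
    if (!input && !append && i < spec.size() && spec[i] == '&') {
        ++i;                                    // ">&N": send to another stream
        if (i + 1 != spec.size() || (spec[i] != '1' && spec[i] != '2'))
            throw std::runtime_error("expected stream 1 or 2 after '&'");
        r.kind = redirection::to_stream;
        r.target_fd = spec[i] - '0';
        return r;
    }
    while (i < spec.size() && spec[i] == ' ')
        ++i;                                    // blanks before the target
    std::string const target = spec.substr(i);
    if (target.empty())
        throw std::runtime_error("missing redirection target");
    if (target == "-") {
        r.kind = redirection::null_device;      // "-" stands for the null device
    } else {
        r.kind = input ? redirection::from_file
               : append ? redirection::append_to_file
               : redirection::to_file;
        r.file = target;
    }
    return r;
}
```

The stream number is optional, so "< in" and "0< in" give the same
result, as do "> out" and "1> out".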


*** No errors detected

