
List:       perl-xml
Subject:    [ANNOUNCE] XML::Filter::Dispatcher 0.46
From:       Barrie Slaymaker <barries () slaysys ! com>
Date:       2003-01-10 14:40:27

...is out, featuring a correct, tested SYNOPSIS section and several
bugfixes in addition to major improvements in the API for building
hierarchies of objects / data structures (CHANGES file, below).

One thing that's always bugged me about coping with XML in Perl is that
my pretty XML data structures are hard to map into Perl structures
that don't exactly parallel the XML in structure or naming, for
instance when you're using objects for some things and plain structures
for others.

To pick a small nit, XML often uses intrinsic names (like "db", "table" and
"column") for things because each structural element is named.  Here's
a stretch of XML I use to describe databases:

  <dbml
    xmlns="http://slaysys.com/DBML/1.0"
  >
    <db name="junkfood_taste_test">
      <table name="food">
        <column name="food_key" primary_index="true" />
        <column name="name" indexed="true" />
        ...
        <row>1,Munchems</row>
        <row>2,Jelly Babies</row>
      </table>
      <table name="score">
        <column name="score_key"   primary_index="true"        />
        <column name="subject_key" foreign_key_table="subject" />
        <column name="food_key"    foreign_key_table="food"    />
        <column name="value"  type="int" />
        <row>1,27,1,10</row>
        <row>2,27,2,0</row>
        <row>3,28,1,0</row>
        <row>4,28,1,10</row>
      </table>
      ...
    </db>
  </dbml>

Those names don't translate well into the extrinsic names used in Perl
data structures (like "dbs", "tables", and "columns") because in Perl
it's common to leave the elements of a list unnamed and name the list:

    $dbs = [
        {
            name => "junkfood_taste_test",
            tables => [
                {
                    name => "food",
                    columns => [
                        {
                            name => "food_key",
                            type => "int",
                            ...
                        },
                        ...
                    ],
                    primary_key => [
                        {
                            name => "food_key",
                        },
                    ],
                    ...
                },
                ...
            ],
        },
        ...
    ];

Now, mind you, that's a trivial example, but I chose it to be
clear (so any lack of clarity is my fault, not the languages' ;).  Plus
there's the issue that a blindly converted XML document might be
missing some data fields, which means you need to write
somewhat more defensive (and thus less clear) code:

    print "$_->{name}\n" for @{ $dbs || [] };

instead of:

    print "$_->{name}\n" for @$dbs;
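
One way to keep the simpler loop safe is to build every record from a
template of defaults, so that list-valued fields always hold an ARRAY
ref.  A minimal sketch in plain Perl; the helper name new_db() and its
fields are made up for illustration and are not part of any module:

```perl
use strict;
use warnings;

# Hypothetical helper: build a db record whose list fields always exist.
sub new_db {
    my %fields = @_;
    return {
        name   => undef,
        tables => [],     # always an ARRAY ref, even if no tables were seen
        %fields,          # caller-supplied values override the defaults
    };
}

my $dbs = [ new_db( name => "junkfood_taste_test" ) ];

# No "|| []" guards needed: every record has a tables list.
for my $db (@$dbs) {
    my $count = @{ $db->{tables} };
    print "$db->{name}: $count table(s)\n";
}
```

The cost is that a missing list and an empty list become
indistinguishable, which is usually what the consuming code wants anyway.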

To tackle something like this with XML::Filter::Dispatcher, I
use code like this (code simplified for presentation,
POD stripped):

    package DBML::SAXBuilder;

    use strict;
    use vars qw( @ISA );

    use XML::Filter::Dispatcher qw( :xstack :general );
    @ISA = qw( XML::Filter::Dispatcher );

    use DBML::Constants qw( dbml_ns_1_0 );

    sub new {
        my $self = shift->SUPER::new(
            @_,
            Namespaces => { dbml => dbml_ns_1_0 },
            Rules => [
                ## String values in elts and attrs
                'dbml:*|dbml:*/@*'  => [ 'string()' => sub { xset } ],

                ## Booleans
                '@dbml:unique|@dbml:indexed' => sub {
                    my $v = xvalue->{Value};
                    if    ( $v eq "true"  ) { xset "true" }
                    elsif ( $v eq "false" ) {             }
                    else {
                        die xvalue->{Name}, " is '$v', not 'true' or 'false'\n";
                    }
                },

                ## Handle container elts.
                ##
                ## Pluralize names of containers that can occur more than once.
                ##
                'dbml:*[*]' => sub { warn "Unexpected <$_[1]->{Name}>\n"; xadd {} },
                'dbml:table' => sub { xadd tables => {
                    db_name      => xpeek->{name},
                    name         => undef,
                    columns      => [],
                    primary_key  => undef,
                    indexes      => [],
                    foreign_keys => [],
                    rows         => [],
                }; },

                'dbml:db' => sub { xadd dbs => {
                    name   => undef,
                    tables => [],
                }; },

                'dbml:column' => sub { xadd columns => {
                    table_name     => xpeek->{name},
                    name           => undef,
                    type           => undef,
                    length         => undef,
                    nullable       => undef,
                    auto_increment => undef,
                }; },

                'dbml:row' => sub { xadd rows => {
                    table_name => xpeek->{name},
                    values     => [],
                }; },

                ## A <row> may contain <value> elements
                ## (perhaps translated from CSV by an upstream filter)
                'dbml:value' => [ 'string()'   => sub { xadd values => {
                    name => undef,
                    data => xvalue,
                }; }, ],

                'dbml:null' => sub { 
                    my $d = xpeek->{data};
                    warn "Discarding data: '$d' containing <null/>\n"
                        if defined $d && length $d;
                    xpeek->{data} = undef
                },

                ## Need some sort of repository for all these objects.  Use an []
                'dbml:dbml'      => sub { xpush [] },
                'end::dbml:dbml' => sub { 
                    my $self = shift;
                    $self->{DBs} = xpop;

                },
            ],
        );

        return $self;
    }

Note that I'm using plain Perl data structures, but could just as easily
be new()ing up objects and calling accessors.
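
The <row> elements in the sample XML carry comma-separated text, and the
comment in the listing only hints that an upstream filter might turn them
into <value> elements.  As a rough sketch of that splitting step in plain
Perl (row_to_values() is a hypothetical helper, and the naive split
handles no quoting or escaping):

```perl
use strict;
use warnings;

# Hypothetical helper: turn the text of one <row> into value records shaped
# like the ones the 'dbml:value' rule builds.
sub row_to_values {
    my ($text) = @_;
    my @values;
    # A limit of -1 keeps trailing empty fields such as "a,,".
    for my $field ( split /,/, $text, -1 ) {
        push @values, { name => undef, data => $field };
    }
    return \@values;
}

my $values = row_to_values("1,27,1,10");
print join( "|", map { $_->{data} } @$values ), "\n";
```

Real CSV with quoted commas would need a proper CSV parser instead of
split.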

xpush(), xpeek(), xpop(), xadd(), and xset() all work on something called
the "xstack".  xadd() and xset() have default behaviors like populating
HASHes, ARRAYs and scalars, grabbing the LocalName of the current attr
or element as needed, etc.  See the docs :).
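
To give a feel for the idea, here is a toy model of such a stack in plain
Perl.  It is only a sketch: the toy_* names are invented, it is not the
module's implementation, and the pops are done by hand here rather than
being driven by SAX events.

```perl
use strict;
use warnings;

# A toy stand-in for the xstack (illustration only).
my @stack;

sub toy_push { push @stack, $_[0]; return $_[0] }
sub toy_peek { return $stack[-1] }
sub toy_pop  { return pop @stack }

# Attach $child to the container on top of the stack, then make $child the
# new top so that nested elements land inside it.
sub toy_add {
    my ( $key, $child ) = @_;
    my $top = toy_peek();
    if ( ref $top eq 'ARRAY' ) { push @$top, $child }
    else                       { push @{ $top->{$key} }, $child }
    return toy_push($child);
}

# Simulate the events for <dbml><db><table/><table/></db></dbml>.
toy_push( [] );                               # start of <dbml>
toy_add( dbs    => { name => "junkfood_taste_test", tables => [] } );
toy_add( tables => { name => "food" } );
toy_pop();                                    # end of first <table>
toy_add( tables => { name => "score" } );
toy_pop();                                    # end of second <table>
toy_pop();                                    # end of <db>
my $dbs = toy_pop();                          # end of <dbml>

print join( ", ", map { $_->{name} } @{ $dbs->[0]{tables} } ), "\n";
```

The pluralized key ("tables") names the list in the parent, while the
child itself stays unnamed, which is the intrinsic-to-extrinsic renaming
described earlier.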

This is slower than other converters, but yields the results *I* want.
YMMV.

- Barrie

==================================================================

0.46
    - Updated SYNOPSIS; it was badly out of date and tripped up
      t@tomacorp.com, as seen on perlmonks (thanks, Matt).
    - Get '@node:*' => [ 'string()' => sub {} ] working, along with
      other related expressions.

0.45 Fri Jan  3 14:56:23 EST 2003
    - Replace xset_fallthrough() with the much more flexible
      xrun_next_action().  The latter is a more informative name,
      too.
    - xset() now croaks when overwriting a defined value.
    - Added xoverwrite() to allow defined values to be overwritten.
    - Empty Rules lists now work (i.e. do nothing).
    - Unbuggered postponements a bit.  See t/postponements.t for a
      couple of known failures (commented out).

0.44 Tue Dec 31 10:40:20 EST 2002
    - Added bin/xfd_dump
    - xvalue now defaults to $_[1] (the sax data structure) if
      the rule was a matching expression.
    - added xvaluetype() (NEEDS TESTS!)
    - Handle default namespace more gracefully.  Added t/namespaces.t
    - implemented xset_fallthrough().

0.43 Tue Dec 31 00:48:16 EST 2002
    - Allow '@*' => [ 'string()' => \&foo ] rules to work.
    - Allow 'end::foo' rules to work
    - Add tracing support to xstack directives.
    - The xstack is now unwound after every non-start_ event.
    - start_element and end_element no longer accidentally hide
      events from the xstack maintenance code.

0.42
    - Add XML::SAX::EventMethodMaker to PREREQ_PM
    - Added xadd and xset.

0.41 Fri Dec 20 09:54:02 EST 2002
    - Fix attribute ordering sensitivity to perl's hash algorithm.  This
      gets the test suite to pass and might help somebody somewhere's
      production code to operate in a predictable fashion across perl
      versions.
    - string( * ) now compiles (and works :)
    - get xstack synced with the order events.
    - add t/builder.t

0.4 Thu Dec 12 06:32:24 EST 2002
    - Major rewrite.  Now supports most of XPath plus EventPath goodies.

_______________________________________________
Perl-XML mailing list
Perl-XML@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs