
List:       xml-dev
Subject:    Re: [xml-dev] More on taming SAX (was Re: [xml-dev] ANN: Amara XML Toolkit 0.9.0)
From:       Alan Gutierrez <alan-xml-dev () engrm ! com>
Date:       2004-12-26 20:00:08
Message-ID: 20041226200008.GA27205 () maribor ! izzy ! net

* Uche Ogbuji <uche.ogbuji@fourthought.com> [2004-12-26 13:12]:
> Alan Gutierrez wrote:
> 
> >* Jeff Rafter <lists@jeffrafter.com> [2004-12-23 13:43]:
> > 
> >
> >>>While on the topic of SAX taming features in Amara, there is also
> >>>amara.saxtools.xpattern_sax_state_machine, which I didn't even bother
> >>>mentioning in the announcement (too much to cram in).
> >>>     
> >>>
> >>Can you expand on your expansion? As I was reading this I was thinking 
> >>that in the Java/C# world an interesting approach would be to keep a 
> >>pseudo DOM stack for the event hierarchy. Maybe something where you keep 
> >>everything at an ancestral level intact while parsing
> >>
> >>
> >><foo>
> >> <bar1>
> >>   <baz1/>
> >>   <baz2/>
> >> </bar1>
> >> <bar2>
> >>   <baz1>
> >>     <sub/>
> >>   </baz1>
> >>   <baz2>text</baz2>
> >> </bar2>
> >></foo>
> >>
> >>So when the event stream reached /foo/bar2/baz2/text() you would have 
> >>the following in a DOM like structure:
> >>
> >> foo
> >>   \
> >>    bar1 (... no children)
> >>    bar2
> >>      \
> >>       baz1 (... no children, just the previous sibling and attrs)
> >>       baz2 (only the StartTag)
> >>
> >>I am not sure that the preceding siblings would be very useful and have 
> >>more chances for pathological cases but when I construct mini-trees this 
> >>is the subset I find handy. It is useful when working with an editor to
> >>understand the immediate context. Unfortunately by requiring the 
> >>previous siblings you have to maintain quite a bit more... the whole 
> >>preceding branch of the tree.
> >>   
> >>
> >
> >   I have a SAX library (in Java) that keeps the stack around, but
> >   not the preceding siblings. It is quite useful.
> >
> >   It is, actually, very useful to keep a stack around that has a
> >   hash table for each level of the stack; it allows for the
> >   development of strategies that are themselves stateless.
> >
> >   Adding the implied stack goes a long way to make SAX event
> >   processing a more practical solution for a lot of problems.
> > 
> >
> 
> Yes.  This is a useful technique I covered for Python in my article 
> "Location, Location, Location 
> <http://www.xml.com/pub/a/2004/11/24/py-xml.html>":

> I think that while useful this technique can still leave a lot of state 
> wrangling to the programmer, which is why Amara has several modules that 
> go further.

    Yes. A lot is still left to the programmer with my tool set, but
    it does pick up a lot of common SAX tasks.
    
    I've wondered about what more I could do.

    Hmm.. Read the article. I was talking about how I keep a stack
    of the elements around, and how a silly thing I did turns out to
    be very useful. In the stack of events, for each event, I keep a
    java.util.Map and tuck all sorts of things in there.

    Twice now I've created a little language in XML and used SAX to
    parse it. Once I understood what I could and could not do, it
    got pretty easy to express a chore as an XML event stream. It
    was easy to keep track of the chore by tucking state into the
    java.util.Map. Kinda Perlish, but that's me.
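
    To make the stack-plus-Map idea concrete, here is a minimal
    sketch. It is not the library described in this message; the
    names FrameStackHandler, Frame and childCounts are made up for
    illustration. Each stack level carries its own java.util.Map,
    and the handler tucks a running child count into the parent's
    Map, so no per-chore fields are needed on the handler itself.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library named in the thread.
class FrameStackHandler extends DefaultHandler {

    /** One stack level: the element name plus a scratch Map for state. */
    static final class Frame {
        final String name;
        final Map<String, Object> state = new HashMap<>();

        Frame(String name) {
            this.name = name;
        }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    /** Filled in document order of end tags, as "name=childCount". */
    final List<String> report = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        Frame parent = stack.peek();
        if (parent != null) {
            // The chore's state lives in the parent's Map, not in a field.
            int n = (Integer) parent.state.getOrDefault("children", 0);
            parent.state.put("children", n + 1);
        }
        stack.push(new Frame(qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Popping the frame discards its Map, so state cannot leak upward.
        Frame done = stack.pop();
        report.add(done.name + "=" + done.state.getOrDefault("children", 0));
    }

    /** Parses the given XML text and returns the per-element child counts. */
    static List<String> childCounts(String xml) throws Exception {
        FrameStackHandler handler = new FrameStackHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.report;
    }
}
```

    Because the Map goes away when its element ends, cleanup comes
    for free, which is most of what makes the trick handy.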

    I was wondering if I couldn't specify some of those individual
    chores within an XML Schema document. When a certain object is
    found in the event stream, according to XML Schema, Java source
    could be executed, perhaps as a generated class with member
    variables mapped to attributes or the values of children.

    I've thought about using an XPath tracker for error reporting in
    my library, which would be very simple to add at this point, and
    it's necessary, I think, because the document locator loses
    meaning when I chain together a bunch of SAX filters.
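
    A rough sketch of what such an XPath tracker could look like,
    assuming positional steps such as /foo[1]/bar[2] are good
    enough for an error message. The names XPathTracker,
    currentPath and paths are hypothetical, and namespaces are
    ignored (qualified names are used as-is).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library named in the thread.
class XPathTracker extends DefaultHandler {

    // One Map per open element: how many children of each name were seen.
    private final Deque<Map<String, Integer>> counts = new ArrayDeque<>();

    // One step per open element, e.g. "bar[2]"; innermost step is on top.
    private final Deque<String> steps = new ArrayDeque<>();

    /** Path of every element, recorded when its start tag is seen. */
    final List<String> seen = new ArrayList<>();

    XPathTracker() {
        counts.push(new HashMap<>()); // level that counts the root element
    }

    /** Builds the path to the current element, outermost step first. */
    String currentPath() {
        StringBuilder path = new StringBuilder();
        Iterator<String> it = steps.descendingIterator();
        while (it.hasNext()) {
            path.append('/').append(it.next());
        }
        return path.toString();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        // Position among same-named siblings, 1-based as in XPath.
        int index = counts.peek().merge(qName, 1, Integer::sum);
        steps.push(qName + "[" + index + "]");
        counts.push(new HashMap<>());
        seen.add(currentPath());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        counts.pop();
        steps.pop();
    }

    /** Parses the given XML text and returns the path of each element. */
    static List<String> paths(String xml) throws Exception {
        XPathTracker tracker = new XPathTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), tracker);
        return tracker.seen;
    }
}
```

    An error handler further down a filter chain could call
    currentPath() instead of relying on line and column numbers,
    since the path is rebuilt from the events themselves.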

    In any case, I'm reading through some of the other articles
    you've been posting. This is a very interesting discussion.

    Cheers.

--
Alan Gutierrez - alan@engrm.com

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://www.oasis-open.org/mlmanage/index.php>

