RE: XML Iterators
by Matt Sergeant
Jan 15 2002 9:08AM
> -----Original Message-----
> From: Adam Turoff [mailto:ziggy@[...].com]
>
> On Mon, Jan 14, 2002 at 06:24:29PM -0500, Barrie Slaymaker wrote:
> > At Matt's suggestion, there's a new machine that allows record oriented
> > processing of XML, XML::SAX::ByRecord, along with a supporting SAX
> > filter XML::Filter::DocSplitter. X::S::ByRecord is documented below,
> > feedback/testing/patches on any/all of this is quite welcome.
>
> I've been casually glancing at the list recently, and a couple of
> mentions of XML::RAX started me thinking.
>
> Is there a standard idiom for converting a SAX machine into an
> iterator? That is, something like this:
>
>     my $xmlp = XML::SAXIterator->new();
>
>     ## set iterator options
>     ## connect to a driver that produces SAX events
>     ## use $xmlp to drive the entire process
>
>     while (defined(my $result = $xmlp->next())) {
>         ## process the result object
>         ## could be a tree fragment, a la DOM or XML::Twig
>         ## could be some other type of data entirely
>     }
>
> Writing event-based parsing code for lo these many years, my gut
> feeling is that this isn't possible with XML::Parser or the SAX model
> as it stands today.
>
> However, it would be with the addition of a pause() method and a
> next() method. The pause() method would be called within a handler
> stack, and tell the parser to pause its dispatch loop; at the end
> of the current handler, it would exit from the dispatch loop,
> returning control to the main program. The next() method would
> start parsing if it hasn't begun yet, and would continue parsing
> from where the parser left off previously.
>
> A SAX parser would need to implement these methods, but SAX filters
> would be building up object instance data during the parse, which would
> be returned at the end of each iteration. Thus, the filter would be
> building up a twig, a Simple tree, a DOM-let, or some application
> defined structure (that's being unmarshalled from an XML stream).
>
> I *think* it's a simple hack to the SAX model. And I *think* there
> are some uses to iterator based processing, but I can't seem to
> come up with anything more than a variation of a record processor.

I think it would be good to discuss this on xml-dev (which I'm unsubscribed
from at the moment due to the noise level there), so that we can standardise
it for SAX across all languages. I'd certainly welcome a pause/resume system
(I don't think you'd need next() if you had resume()).

Matt.
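
For readers following the thread, the pause/resume control flow discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyDriver, RecordBuilder and next_record() are hypothetical names invented here, not part of XML::SAX or any other module, and the "driver" just replays a canned list of SAX-like events instead of parsing real XML. The handler calls pause() when a record is complete, and resume() re-enters the dispatch loop where it left off.

```perl
use strict;
use warnings;

# Hypothetical toy driver: replays canned SAX-like events so the
# pause/resume control flow can be shown without a real parser.
package ToyDriver;

sub new {
    my ($class, %args) = @_;
    my $self = {
        events  => $args{events},    # array ref of [method, data] pairs
        handler => $args{handler},
        pos     => 0,
        paused  => 0,
    };
    return bless $self, $class;
}

# Called from inside a handler; the dispatch loop exits once the
# current handler call returns.
sub pause { my ($self) = @_; $self->{paused} = 1; return }

# Starts dispatching, or continues from where the last pause left off.
sub resume {
    my ($self) = @_;
    $self->{paused} = 0;
    while (!$self->{paused} && $self->{pos} < @{ $self->{events} }) {
        my ($method, $data) = @{ $self->{events}[ $self->{pos}++ ] };
        $self->{handler}->$method($self, $data);
    }
    return;
}

# Hypothetical handler: builds one small hash per <record> element.
package RecordBuilder;

sub new { my ($class) = @_; return bless {}, $class }

sub start_element {
    my ($self, $driver, $name) = @_;
    $self->{current} = { name => $name, text => '' } if $name eq 'record';
    return;
}

sub characters {
    my ($self, $driver, $text) = @_;
    $self->{current}{text} .= $text if $self->{current};
    return;
}

sub end_element {
    my ($self, $driver, $name) = @_;
    return unless $name eq 'record';
    $self->{done} = delete $self->{current};   # record is complete
    $driver->pause;                            # hand control back to the caller
    return;
}

# Returns the finished record (or undef) and clears it.
sub take { my ($self) = @_; return delete $self->{done} }

package main;

my @events = (
    [ start_element => 'records' ],
    [ start_element => 'record' ], [ characters => 'first' ],  [ end_element => 'record' ],
    [ start_element => 'record' ], [ characters => 'second' ], [ end_element => 'record' ],
    [ end_element   => 'records' ],
);

my $builder = RecordBuilder->new;
my $driver  = ToyDriver->new(events => \@events, handler => $builder);

# next() expressed in terms of resume(): run until the next pause,
# then return whatever the handler finished (undef at end of stream).
sub next_record {
    $driver->resume;
    return $builder->take;
}

my @seen;
while (defined(my $rec = next_record())) {
    push @seen, $rec->{text};
}
print join(',', @seen), "\n";   # prints first,second
```

In this sketch next_record() is just a thin wrapper around resume(), which is in line with the remark that a separate next() may not be needed once pause/resume exist. A real SAX driver would have to keep its parser state alive between calls, which the toy event list sidesteps.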
_______________________________________________
Perl-XML mailing list
Perl-XML@[...].com
http://listserv.ActiveState.com/mailman/listinfo/perl-xml