perl-xml
RE: XML Iterators
by Matt Sergeant
Jan 15 2002 9:08AM
>  -----Original Message-----
>  From: Adam Turoff [mailto:ziggy@[...].com]
>  
>  On Mon, Jan 14, 2002 at 06:24:29PM -0500, Barrie Slaymaker wrote:
>  > At Matt's suggestion, there's a new machine that allows record oriented
>  > processing of XML, XML::SAX::ByRecord, along with a supporting SAX
>  > filter XML::Filter::DocSplitter.  X::S::ByRecord is documented below,
>  > feedback/testing/patches on any/all of this is quite welcome.
>  
>  I've been casually glancing at the list recently, and a couple of
>  mentions of XML::RAX started me thinking.
>  
>  Is there a standard idiom for converting a SAX machine into an
>  iterator?  That is, something like this:
>  
>  	my $xmlp = new XML::SAXIterator();
>  
>  	## set iterator options
>  	## connect to a driver that produces SAX events
>  	## use $xmlp to drive the entire process
>  
>  	while (defined ($result = $xmlp->next())) {
>  		## process the result object
>  		## could be a tree fragment, a la DOM or XML::Twig
>  		## could be some other type of data entirely
>  	}
>  
>  Writing event-based parsing code for lo these many years, my gut
>  feeling is that this isn't possible with XML::Parser or the SAX model
>  as it stands today.
>  
>  However, it would be with the addition of a pause() method and a
>  next() method.  The pause() method would be called within a handler
>  stack, and tell the parser to pause its dispatch loop; at the end
>  of the current handler, it would exit from the dispatch loop,
>  returning control to the main program.  The next() method would
>  start parsing if it hasn't begun yet, and would continue parsing
>  from where the parser left off previously.
>  
>  A SAX parser would need to implement these methods, but SAX filters
>  would be building up object instance data during the parse, which would
>  be returned at the end of each iteration.  Thus, the filter would be
>  building up a twig, a Simple tree, a DOM-let, or some application
>  defined structure (that's being unmarshalled from an XML stream).
>  
>  I *think* it's a simple hack to the SAX model.  And I *think* there
>  are some uses to iterator based processing, but I can't seem to 
>  come up with anything more than a variation of a record processor.

I think it would be good to discuss this on xml-dev (which I'm unsubscribed
from at the moment due to the noise level there), so that we can standardise
it for SAX for all languages. I'd certainly welcome a pause/resume system
(you wouldn't need next() if you had resume, I don't think).
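
To make the pause()/next() idea above concrete, here is a minimal sketch in
plain Perl. Everything in it is hypothetical: ToyPausableParser and
RecordBuilder are made-up names, the "parser" just replays a canned list of
SAX-like events rather than reading XML, and passing the parser to each
handler as an extra argument is an assumption of this sketch, not part of
any existing SAX API. It only illustrates how a dispatch loop could stop
after a handler asks it to, and pick up again where it left off.

```perl
use strict;
use warnings;

# Hypothetical toy "parser": replays canned SAX-like events.
# A real driver would generate them from an XML stream instead.
package ToyPausableParser;

sub new {
    my ($class, %args) = @_;
    return bless {
        events  => $args{events},    # array ref of [method_name, data]
        handler => $args{handler},
        pos     => 0,                # index of the next event to dispatch
        paused  => 0,
    }, $class;
}

# Called from inside a handler: the dispatch loop stops
# once the current handler has returned.
sub pause { $_[0]{paused} = 1 }

# Start parsing, or continue from where the last call left off.
# Returns the handler's finished record, or undef at end of input.
sub next {
    my ($self) = @_;
    $self->{paused} = 0;
    while (!$self->{paused} && $self->{pos} < @{ $self->{events} }) {
        my ($method, $data) = @{ $self->{events}[ $self->{pos}++ ] };
        # Assumption: the parser is passed along so handlers can pause it.
        $self->{handler}->$method($data, $self);
    }
    return $self->{paused} ? $self->{handler}->take_record : undef;
}

# Hypothetical filter: builds one hash per <record> element, then pauses.
package RecordBuilder;

sub new { my ($class) = @_; return bless {}, $class }

sub start_element {
    my ($self, $el, $parser) = @_;
    if    ($el->{Name} eq 'record') { $self->{current} = {} }
    elsif ($self->{current})        { $self->{field} = $el->{Name} }
}

sub characters {
    my ($self, $chars, $parser) = @_;
    $self->{current}{ $self->{field} } .= $chars->{Data}
        if $self->{current} && defined $self->{field};
}

sub end_element {
    my ($self, $el, $parser) = @_;
    if ($el->{Name} eq 'record') {
        $self->{done} = delete $self->{current};
        $parser->pause;              # hand control back to the caller
    }
    else {
        delete $self->{field};
    }
}

sub take_record { return delete $_[0]{done} }

package main;

my @events = (
    [ start_element => { Name => 'records' } ],
    ( map { ( [ start_element => { Name => 'record' } ],
              [ start_element => { Name => 'name' } ],
              [ characters    => { Data => $_ } ],
              [ end_element   => { Name => 'name' } ],
              [ end_element   => { Name => 'record' } ] ) } qw(Ada Grace) ),
    [ end_element => { Name => 'records' } ],
);

my $xmlp = ToyPausableParser->new(
    events  => \@events,
    handler => RecordBuilder->new,
);

my @names;
while (defined(my $result = $xmlp->next)) {
    push @names, $result->{name};    # process one record per iteration
}
print join(',', @names), "\n";       # prints "Ada,Grace"
```

In this sketch next() doubles as resume: each call clears the paused flag and
re-enters the same loop, so a separate resume() would only be a different
name for the same operation.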

Matt.
