perl-xml
XML::AutoWriter, XML::Filter::Validator?
by Barrie Slaymaker
Jul 25 2000 12:58PM

[[CCed back to perl-xml for general commentary]]

Laurent CAPRANI wrote, in part:
>  
>  1. SAX handler
>  
>  I don't need personally SAX handler methods. Only the startTag() / endTag()
>  interface may be useful for me, since I will surely override them.

I misunderstood, then: I thought you wanted to use your existing
SAX driver chain and bolt on an XML::ValidWriter::SAX handler
object.

>  2. SAX driver
>  
>  Well, I understand you may feel reluctant toward PerlSAX design. I would surely
>  agree with any attempt to improve PerlSAX performance.

I'm reluctant to force SAX where it's not a good fit, is all.  SAX is
a Good Thing (tm), and I also want to be SAX compliant, probably
by refactoring the code a bit into XML::Validator and XML::Validator::SAX
or XML::Filter::Validator.  Naming suggestions welcome.  It's not
going to happen immediately, though; there's too much else to do.

>  3. Specifying autotag attributes

The way this currently works is:

   # AutoWriter subclasses ValidWriter and provides autotagging
   $writer = XML::AutoWriter->new(
      DOCTYPE => new XML::Doctype( 'foo', SYSTEM_ID => 'fooml.dtd' )
   ) ;
   ## fooml.dtd contains <!ATTLIST foo a1 CDATA #REQUIRED >.  fooml.dtd
   ## may or may not have an <!ELEMENT foo ... > in it.
   $writer->getDoctype->element_decl('foo')->default_on_write('value') ;

Haven't done anything about callbacks.

>  4. Specifying the autotag path
>  
>  I suppose that pre-code() gets called when startTag(<c>) is called within an
>  <a>. It allows the user to specify the path (here <b>).

Yup.  It would be a pattern spec w/ callbacks that allow you to emit
preamble, alternative content, and/or postamble.

>  My idea was to specify more automation through the DTD side. For example a
>  special kind of attribute for path selection.
>  For example, the DTD allows a <P> inside <BODY>, inside a <TABLE><ROW><CELL> or
>  inside a <LIST><ITEM>.
>  The "extended" DTD would require an additional qualifier for <P> (similar to a
>  required attribute), to select the right path.

If I understand, you want to call something like

   $writer->startTag( 'P', NESTED_IN => 'TABLE' ) ;

Right now, you can hardwire path selection in the subclass, as above,
or by calling startTag() with the desired intermediate tag:

   $writer->startTag( 'TABLE' ) ;
   $writer->startTag( 'P' ) ;

or, if you're using the functional interface,

   TABLE ;
   P ;

An example lies below.  How does having nesting specified as an attribute
(if I've understood correctly) help?

>  5. SGML-ish things
>  
>  I thought that inhibiting some autotagging may facilitate debugging and provide
>  "default" path.

Would something like

   $writer->getDoctype->element_decl( 'foo' )->autoTagging( 0 ) ;

be OK to shut it off?  It doesn't 'extend' the DTD syntax, but it would work.

It would cut off the autotag search whenever an element with autotagging
turned off this way was encountered.
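
To make the cut-off concrete, here is a minimal sketch of an autotag search
with a per-element off switch.  It is not XML::AutoWriter's code: the
%children table, the %no_autotag set and find_path() are hypothetical names,
and the breadth-first search is an assumed strategy, shown on a small
TABLE/TR/TD/P content model.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical content-model table: element => elements it may contain.
my %children = (
    TABLE => [ 'TR' ],
    TR    => [ 'TD' ],
    TD    => [ 'P'  ],
    P     => [],
);

# Hypothetical per-element switch: elements listed here are never
# opened automatically, so the search does not pass through them.
my %no_autotag;

# Breadth-first search for the shortest chain of elements to open so
# that $target ends up legally nested under the already open $from.
# Returns the chain (ending in $target), or an empty list if none exists.
sub find_path {
    my ( $from, $target ) = @_;
    my @queue = ( [ $from ] );
    my %seen  = ( $from => 1 );
    while ( my $path = shift @queue ) {
        for my $child ( @{ $children{ $path->[-1] } || [] } ) {
            next if $seen{$child}++;
            my @next = ( @$path, $child );
            # Drop $from from the result: it is already open.
            return @next[ 1 .. $#next ] if $child eq $target;
            next if $no_autotag{$child};    # cut the search off here
            push @queue, \@next;
        }
    }
    return;
}

print join( ' ', find_path( 'TABLE', 'P' ) ), "\n";      # TR TD P
$no_autotag{TD} = 1;
print scalar( () = find_path( 'TABLE', 'P' ) ), "\n";    # 0
```

With TD switched off, the search finds no way from TABLE to P, so the caller
would have to open TD explicitly.  An element that is switched off can still
be the target of the search; it just is never opened as an intermediate step.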

>  The XSLTish thing looks promising. I should try it before requesting special
>  features on the DTD side.
>  
>  Just an idea: XSLT-ish patterns could be extended to trigger callbacks on
>  character data (inserting Perl regexps into patterns?).

Interesting thought.  >>TODO

>  6. Conditional tags
>  
>  Converters from flat formats need to say "open this element if it is not
>  already opened" and "close this element unless it is already closed" and they
>  use it a lot.

The autotagger does something like that, but does not allow you to explicitly
conditionally open a tag (example follows).  Here are some possible OO &
functional APIs; what do you think?

1)
   $writer->startTag( 'P' ) unless $writer->tagOpen( 'P' ) ;
   $writer->endTag( 'P' )   if     $writer->tagOpen( 'P' ) ;

   P unless tagIsOpen( 'P' ) ;
   end_P if tagIsOpen( 'P' ) ;

2)

   $writer->ensureStartTag( 'P' ) ;
   $writer->ensureEndTag(   'P' ) ;

   ensure_P ;
   ensure_end_P ;

3)

   $writer->assertStartTag( 'P' ) ;
   $writer->assertEndTag(   'P' ) ;

   assert_P ;
   assert_end_P ;

4)

   $writer->condStartTag( 'P' ) ;
   $writer->condEndTag(   'P' ) ;

   condStartTag( 'P' ) ;
   condEndTag(   'P' ) ;

Here's a toy example to help illustrate how it behaves now:

[barries@jester XML-DocType]$ make pure_all ; perl toy
<?xml version="1.0"?>
<HTML>0<TABLE><TR><TD><P>a</P><P>bc</P></TD></TR></TABLE></HTML>
######################################################################
#!/usr/local/bin/perl -w

use XML::Doctype    NAME => 'HTML', DTD_TEXT => <<TOHERE ;
   <!-- HTML is undefined and thus its content model is assumed to be 'ANY' -->
   <!ELEMENT TABLE ( TR )* >
   <!ELEMENT TR    ( TD )* >
   <!ELEMENT TD    ( P )*  >
   <!ELEMENT P     (#PCDATA) >
TOHERE

use XML::AutoWriter qw( :all :dtd_tags ) ;

xmlDecl ;
characters( '0' ) ;
   TABLE ;
         characters( 'a' ) ;
      P ;
         characters( 'b' ) ;
         characters( 'c' ) ;
   endAllTags ;

open( ME, "<$0" ) or die $! ;
print "\n", "#" x 70, "\n", <ME> ;

>  There must be some way to provide this. It might be a query on open elements.

It's pretty easy to do that, and will be easier if I factor XML::Validator out
of XML::ValidWriter.
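
As a rough illustration of how little the query needs, here is a minimal
sketch built on nothing but a stack of open elements.  ToyWriter is a
hypothetical stand-in, not XML::ValidWriter; the method names are borrowed
from options 1 and 2 above, and none of this is a committed API.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ToyWriter is a hypothetical stand-in, not XML::ValidWriter: it only
# tracks open elements on a stack and appends markup to a string.
package ToyWriter;

sub new {
    my $class = shift;
    return bless { stack => [], out => '' }, $class;
}

sub startTag {
    my ( $self, $name ) = @_;
    push @{ $self->{stack} }, $name;
    $self->{out} .= "<$name>";
}

# Close elements from the innermost outwards, up to and including $name.
sub endTag {
    my ( $self, $name ) = @_;
    die "no open <$name>\n" unless $self->tagOpen($name);
    while ( defined( my $open = pop @{ $self->{stack} } ) ) {
        $self->{out} .= "</$open>";
        last if $open eq $name;
    }
}

# The query on open elements: true if $name is anywhere on the stack.
sub tagOpen {
    my ( $self, $name ) = @_;
    return scalar grep { $_ eq $name } @{ $self->{stack} };
}

# Option 2 from the list above, built on the query.
sub ensureStartTag {
    my ( $self, $name ) = @_;
    $self->startTag($name) unless $self->tagOpen($name);
}

sub ensureEndTag {
    my ( $self, $name ) = @_;
    $self->endTag($name) if $self->tagOpen($name);
}

package main;

my $w = ToyWriter->new;
$w->ensureStartTag('P');
$w->ensureStartTag('P');    # no-op: P is already open
$w->ensureEndTag('P');
$w->ensureEndTag('P');      # no-op: P is already closed
print $w->{out}, "\n";      # <P></P>
```

Whichever spelling wins, all four options reduce to the same tagOpen() test;
the only difference is whether the caller or the writer performs it.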

- Barrie
