perl-xml
RE: What's So Great about SAX? (ie. Future Indecisions)
by Grant McLean
Oct 7 2002 11:03PM
From: Morbus Iff [mailto:morbus@[...].com]
>   >SAX is a huge advance over the XML::Parser Handler API for a
>
>  Thanks for the expose. I'll wax a bit.
>
>   > - pluggable - if your code is written to the SAX API you
>   >   can use any SAX parser without changing your code
>
>  Not immediately useful to me, since expat is the only library
>  currently ported (and bundle-able) to every OS I need it to be
>  in (my software is one of the rare few that turns into a
>  "don't need Perl installed" application for Mac and Windows).

Note also that XML::SAX comes with an extremely portable parser
written entirely in Perl (XML::SAX::PurePerl).  Unfortunately,
it needs Perl 5.8 to support encodings other than UTF-8.

>   > - flexible - your data source does not even need to be an
>   >   XML document (eg: you can drive your SAX pipeline from
>   >   a database query
>
>  That's kinda neat, although I do some pre-processing before
>  sending to XML::Simple - enough so that I'm always sending a
>  string of XML, not a file.
>
>   >I'm not entirely clear on what you're trying to do with
>   >namespaces.  Do you want your hashref keys to be in Clarkian
>   >notation eg: '{http://purl.org/dc/elements/1.1/}date' or
>   >do you want to normalise the prefixes used eg: 'dc:date'?
>
>  Nope, not really. The biggest problem is:
>
>    - I assume my data is going to be in one data structure, but
>      if someone prefixes the data with a namespace besides the
>      implied default, I get a different structure that breaks
>      my assumption:
>
>       assuming:
>        <item><dc:title>boo</dc:title></item>
>        == $item->{dc:title}
>
>       breaks my thingy:
>        <item><dublincore:title>boo</dublincore:title></item>
>        != $item->{dc:title} but rather $item->{dublincore:title}

That does look like the normalisation option I referred to.
So if you got a document like this:

<rdf:RDF
 xmlns="http://purl.org/rss/1.0/"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:theonetruedublincore="http://purl.org/dc/elements/1.1/">
  <theonetruedublincore:date>2002-10-08</theonetruedublincore:date>
</rdf:RDF>

then you want to treat it as if it were:

<rdf:RDF
 xmlns="http://purl.org/rss/1.0/"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:date>2002-10-08</dc:date>
</rdf:RDF>

and slurp it into a hash like this:

 {
   'xmlns'     => 'http://purl.org/rss/1.0/',
   'xmlns:rdf' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
   'xmlns:dc'  => 'http://purl.org/dc/elements/1.1/',
   'dc:date'   => '2002-10-08'
 };

The way I'd see that working with SAX is something like:

  use XML::SAX::Machines qw( :all );
  use XML::Filter::NSNormalise;
  use XML::Simple;

  my $p = Pipeline(
    XML::Filter::NSNormalise->new(
      map => {
        'http://purl.org/dc/elements/1.1/' => 'dc',
        'http://purl.org/rss/1.0/modules/syndication/' => 'syn'
      }
    )
    => XML::Simple->new(
      keyattr => {}
    )
  );

  my $ref = $p->parse_uri('./rss.xml');

An off-the-cuff version of XML::Filter::NSNormalise is attached.
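
The attachment is not reproduced in this archive. As a rough illustration of the remapping idea only (not the attached module), here is a minimal sketch in plain Perl. The helper name normalise_qname and its arguments are hypothetical; a real SAX filter would presumably apply something similar to the element names it sees in its start_element and end_element events.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT taken from the attached NSNormalise.pm.
# Rewrites a qualified name so its prefix is the preferred one for
# the namespace it belongs to.
#   $qname     - name as written in the document, e.g. 'dublincore:title'
#   $in_scope  - hashref of prefix => namespace URI declared in the
#                document (default namespace under the '' key)
#   $preferred - hashref of namespace URI => prefix we want to see
sub normalise_qname {
    my ($qname, $in_scope, $preferred) = @_;

    # Split 'prefix:local'; an unprefixed name gets the empty prefix.
    my ($prefix, $local) =
        $qname =~ /^([^:]+):(.+)$/ ? ($1, $2) : ('', $qname);

    my $uri = $in_scope->{$prefix};
    return $qname unless defined $uri;               # undeclared prefix
    return $qname unless exists $preferred->{$uri};  # namespace not mapped

    return "$preferred->{$uri}:$local";
}

# Same URI => prefix map as passed to the filter above.
my %preferred = (
    'http://purl.org/dc/elements/1.1/'             => 'dc',
    'http://purl.org/rss/1.0/modules/syndication/' => 'syn',
);

# Declarations from the example document above.
my %in_scope = (
    ''                     => 'http://purl.org/rss/1.0/',
    'rdf'                  => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'theonetruedublincore' => 'http://purl.org/dc/elements/1.1/',
);

# prints dc:date
print normalise_qname('theonetruedublincore:date', \%in_scope, \%preferred), "\n";
# prints rdf:RDF (namespace is not in the map, so the name is untouched)
print normalise_qname('rdf:RDF', \%in_scope, \%preferred), "\n";
```

Keying the map on the namespace URI rather than the prefix is the point: whatever prefix the document author picked, names in the same namespace end up with the same hash key.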

Cheers
Grant
Attachments:
NSNormalise.pm
