perl-xml
Re: Calling anyone working on a XML Schema module for Perl
by Robin Berjon
Oct 11 2002 9:40AM

Emmet Caulfield wrote:
>  Robin Berjon wrote:
>   > I do have the source to two XML
>   > Schema implementations (in Java), both of which I understand (even
>   > though neither are validators).
>  
>  What does an XML Schema implementation do if it isn't a validator?

[warning: low on Perl content, just answering the question]

There are many things you can do with any schema language. At present I 
am (alas) only concerned with XML Schema, but this applies similarly to 
RelaxNG and Schematron. (ok, and DTDs if you insist).

Both the implementations I have at hand serve the same purpose. There 
are two because they take radically different approaches to applying 
a schema to a document (this is for research purposes, we're looking 
for what works best).

Reading a schema will provide you with very specific knowledge of the 
class of documents that the schema describes. What you do with that 
information is up to you; no one forces you to validate :)

The minor things we use this for include generating documentation from a 
schema or making nice little graphics that represent automata. The more 
important part is generating files that contain the smallest possible 
amount of information needed to reproduce the original document, i.e. a 
content encoding that has strong compression capabilities.
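
As a rough illustration of that encoding idea (this is not how either 
implementation mentioned above works; the toy "schema", the element 
names and the NUL separator are all assumptions made up for the 
example), here is a Python sketch in which the schema already knows the 
element names and their order, so only the text values need to be stored:

```python
import xml.etree.ElementTree as ET

# Hypothetical "schema": a <point> always holds x, y, label, in this order.
SCHEMA = ("point", ["x", "y", "label"])

def encode(xml_text):
    """Keep only the text values; the schema supplies the element names."""
    root = ET.fromstring(xml_text)
    name, children = SCHEMA
    if root.tag != name or [c.tag for c in root] != children:
        raise ValueError("document does not match the schema")
    # NUL cannot occur in XML text, so it is a safe separator here.
    return "\x00".join(c.text or "" for c in root)

def decode(payload):
    """Rebuild the document from the stored values plus the schema."""
    name, children = SCHEMA
    root = ET.Element(name)
    for tag, value in zip(children, payload.split("\x00")):
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

doc = "<point><x>3</x><y>4</y><label>home</label></point>"
packed = encode(doc)
assert decode(packed) == doc   # round-trips
assert len(packed) < len(doc)  # and the payload is smaller
```

A real content encoding would deal with optional and repeated content, 
attributes and typed values; the sketch only shows why knowing the 
schema lets you leave the markup out.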

There are many other things that you can do. The type information 
(coming from the schema, and decorating the document tree) is a real 
PITA if you don't need it, but it can become really useful when you need 
it. For instance, you can use that as a way to analyse a document and 
find interesting properties about it. I'll soon be using that kind of 
approach to analyse SVG documents in order to find out information about 
their streamability, indexability, and fragmentability. Using a type 
hierarchy allows me to look for "all descendants of integer", as opposed 
to having to manually list all the elements that may have integer 
content (of course it's more complex than that for SVG, but it's the 
general idea).
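
To make the "all descendants of integer" idea concrete, here is a small 
Python sketch. The derivation table is a subset of the XML Schema 
built-in datatype hierarchy; the element-to-type mapping and the sample 
document are hypothetical stand-ins for what a schema processor would 
compute, and have nothing to do with the actual SVG schema:

```python
import xml.etree.ElementTree as ET

# Base type of each type (subset of the XML Schema built-in hierarchy).
BASE = {
    "anySimpleType": None,
    "string": "anySimpleType",
    "decimal": "anySimpleType",
    "integer": "decimal",
    "long": "integer",
    "int": "long",
    "short": "int",
    "nonNegativeInteger": "integer",
    "positiveInteger": "nonNegativeInteger",
}

# Hypothetical type decoration: which type the schema gives each element.
ELEMENT_TYPES = {
    "title": "string",
    "count": "positiveInteger",
    "offset": "int",
    "ratio": "decimal",
}

def derives_from(type_name, ancestor):
    """True if type_name is ancestor or is derived from it."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = BASE.get(type_name)
    return False

def elements_of_type(root, ancestor):
    """All elements whose declared type derives from ancestor."""
    return [e for e in root.iter()
            if derives_from(ELEMENT_TYPES.get(e.tag), ancestor)]

doc = ("<chart><title>Sales</title><count>3</count>"
       "<offset>-2</offset><ratio>0.5</ratio></chart>")
integer_like = [e.tag for e in elements_of_type(ET.fromstring(doc), "integer")]
assert integer_like == ["count", "offset"]
```

The query names one type and the hierarchy does the rest, so nobody has 
to maintain a list of every element that may hold an integer.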

There are also applications such as "data binding", which is mostly 
about using a schema to generate classes that will provide ways to 
access and manipulate XML data (it's mostly data-orientated) without 
ever seeing any XML. I'm not personally very interested in that, but it 
seems to be all the rage in certain circles. I guess there are many 
other things one could do.
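
For readers who haven't met data binding, a minimal Python sketch of the 
idea follows. It assumes a hypothetical schema summary (`FIELDS`) and a 
made-up `make_binding` helper; real data binding tools generate far 
richer classes from the schema itself:

```python
import xml.etree.ElementTree as ET

# Hypothetical schema summary: child element name -> Python converter.
FIELDS = {"name": str, "quantity": int, "price": float}

def make_binding(class_name, root_tag, fields):
    """Build a class whose attributes read and write child elements."""
    def make_property(tag, convert):
        def getter(self):
            return convert(self._node.findtext(tag))
        def setter(self, value):
            self._node.find(tag).text = str(convert(value))
        return property(getter, setter)

    def init(self, xml_text):
        self._node = ET.fromstring(xml_text)
        if self._node.tag != root_tag:
            raise ValueError("expected <%s>" % root_tag)

    def to_xml(self):
        return ET.tostring(self._node, encoding="unicode")

    namespace = {"__init__": init, "to_xml": to_xml}
    for tag, convert in fields.items():
        namespace[tag] = make_property(tag, convert)
    return type(class_name, (object,), namespace)

Item = make_binding("Item", "item", FIELDS)
item = Item("<item><name>bolt</name><quantity>4</quantity>"
            "<price>0.25</price></item>")
item.quantity += 1                   # plain attribute access, no XML in sight
total = item.quantity * item.price
assert total == 1.25
```

The calling code only ever touches `item.quantity` and `item.price`; the 
XML stays hidden behind the generated class.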

Of course, if you encode or type-decorate the document using a schema, 
in about 99% of cases a side effect is that you've validated it, because 
if it weren't valid you'd have failed to understand it. (I say 99% 
because if you optimise away certain things that you know amount to the 
same, there are constraints you end up not checking.)
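
A small Python sketch of both halves of that point. The type table, the 
element names and the `check_ranges` switch are hypothetical; the switch 
stands in for the kind of shortcut that treats a restricted type like 
its base type:

```python
import xml.etree.ElementTree as ET

# Hypothetical declarations: element name -> schema type.
TYPES = {"quantity": "positiveInteger", "offset": "integer"}

def decorate(xml_text, check_ranges=True):
    """Return {element name: typed value} for the children of the root."""
    typed = {}
    for child in ET.fromstring(xml_text):
        if child.tag not in TYPES:
            raise ValueError("undeclared element <%s>" % child.tag)
        value = int(child.text)  # raises ValueError on non-integer text
        # The range facet is the part a shortcut might skip.
        if (check_ranges and TYPES[child.tag] == "positiveInteger"
                and value < 1):
            raise ValueError("<%s> must be positive" % child.tag)
        typed[child.tag] = value
    return typed

good = "<order><quantity>3</quantity><offset>-2</offset></order>"
assert decorate(good) == {"quantity": 3, "offset": -2}

# Decoration fails on an invalid document: validation as a side effect.
try:
    decorate("<order><quantity>three</quantity></order>")
except ValueError:
    pass

# With the shortcut, an invalid value is typed without being caught.
sloppy = "<order><quantity>0</quantity></order>"
assert decorate(sloppy, check_ranges=False) == {"quantity": 0}
```

Python's `int()` is looser than the XML Schema integer lexical space, so 
this only approximates real type checking.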

>  I'm not attempting to be funny here

Don't worry, a lot of what sucks about XML Schema is that it tried to 
cover much more than simple validation, and bundled all of it into a 
huge and barely usable language. If all you want is validation, RelaxNG 
and Schematron 
are wildly superior solutions (if only because you can understand how to 
use them in under two hours).

>   The only one I've ever
>  looked at is W3C XML Schema, rather than RELAX-NG, Schematron or any of
>  the gazillion other things that turn up when you google for "XML Schema".

You might want to look into those two options. Schematron has a Perl 
implementation -- XML::Schematron -- from Kip, and Julien Quint is 
halfway through an XML::RelaxNG implementation. It looks like they'll 
both be available to Perl before XML::Schema is.

>  Wasn't there also a rumour of W3C XML Schema support finding its way
>  into libxml2 at some stage?

Yes, though it's still very early alpha (you can enable it at configure 
time if you want, but there's no Perl binding). XML::Xerces also has 
some support but I couldn't get it to work easily and I don't think it 
exposes more than a validation interface.

>  I must admit that I have been waiting for the appearance of a perl 
>  implementation to play with, and I'd probably be a good deal less 
>  ignorant if such a thing did exist.

Play with XML::Schematron right now then! It's an elegant and simple 
schema language, and pretty much perfect for documents imho.

-- 
Robin Berjon <robin.berjon@[...].fr> 
Research Engineer, Expway
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488
