xml-dev
Re: [xml-dev] Relax NG annoyances
by Robin Berjon
Jul 3 2003 10:11AM
Hi Jeni,

Jeni Tennison wrote:
>  Robin Berjon wrote:
> >A fair number of vocabularies created before XML Schema or RelaxNG
> >have comma or semicolon separated lists. Another example could be
> >the list of commands in SVG path data. But as tempting as it is to
> >want to fix this with lists, I think that having a nice way of
> >declaring compound types (à la Regular Fragmentation, but without
> >changing the tree) would be the most general and elegant solution to
> >this.
>  
>  At his RELAX NG tutorial at XML 2002, John Cowan mentioned the
>  possibility of extending RELAX NG patterns into text content, so, for
>  example, to get pairs of numbers in which the numbers in a pair were
>  separated by commas and the pairs were separated by whitespace, you
>  might use something like:
>  
>  <define name="path">
>    <ref name="numberPair" />
>    <zeroOrMore>
>      <whitespace />
>      <ref name="numberPair" />
>    </zeroOrMore>
>  </define>
>  
>  <define name="numberPair">
>    <data type="decimal" />
>    <value>,</value>
>    <data type="decimal" />
>  </define>
>  
>  I have no idea whether this is an idea that's being pursued?

I've been working on something similar myself, on and off. I think that Simon 
and Eric's work on RegFrags[0] can be considered to have similar goals as well.
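
As a rough illustration of what the quoted "path" and "numberPair" patterns 
would accept, here is a sketch that approximates them with an ordinary regex in 
Python. The DECIMAL expression is an assumed, simplified stand-in for the 
decimal lexical form, whitespace is taken to mean one or more whitespace 
characters, and parse_path is a hypothetical helper, not part of any RELAX NG 
tool.

```python
import re

# Assumption: a simplified regex for the lexical form of a decimal
# (optional sign, digits with an optional fractional part).
DECIMAL = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)"

# numberPair: decimal "," decimal
NUMBER_PAIR = rf"{DECIMAL},{DECIMAL}"

# path: numberPair (whitespace numberPair)*
PATH = re.compile(rf"{NUMBER_PAIR}(?:\s+{NUMBER_PAIR})*")


def parse_path(text):
    """Return a list of (x, y) string pairs, or None if text does not match."""
    if PATH.fullmatch(text) is None:
        return None
    # The match guarantees every whitespace-separated token is "x,y".
    return [tuple(pair.split(",")) for pair in text.split()]
```

Note that this only recovers the pairs after the fact; it does not change the 
tree, which is the point of the approach.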

What I like about RegFrags is that they make implicit structure explicit by 
adding to the XML tree. That's convenient because it means that all downstream 
processors need to deal with is XML. What I dislike about them is the exact same 
thing, since sometimes I don't want my tree to be touched.

I think it's unrealistic to believe that people will create vocabularies where 
all structure is to be made explicit. XPath, SVG path data, CSS values... 
examples abound.

>  The argument that there should be a separate way of defining datatype
>  libraries, with RELAX NG schemas (and other technologies) just
>  referencing an appropriate one, seems persuasive.

Very much so.

>  A combination of the
>  datatype-oriented definitions ala XML Schema and regex-based
>  definitions, like the one above, seems pretty powerful. Presumably
>  this is something that Part 5 (Datatypes) of DSDL is addressing?

I haven't had time to follow DSDL as much as I would like to, but combining 
typing and regexen -- if only for composability -- seems to me to be a powerful 
mix. What I have been playing with is conversions from WXS types to regexen 
which are composed into a single large regex which is then used to type 
subcomponents of a string. One advantage of this approach is that you can let 
the regex engine backtrack across types, which makes life easier. A disadvantage 
is that it's possible to do silly things, such as (using the RNG syntax from above):

   <define name='prefixedInt'> 
     <data type='string'/> 
     <data type='int'/> 
   </define> 

which is unlikely to do what the author wants unless the int is constrained 
to be in the 0-9 range or the string has a proper pattern ("foo1234" will yield 
{"foo123",4}, not {"foo", 1234}).
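
The effect is easy to reproduce with a small sketch in Python. The type-to-regex 
table below is an assumption made up for illustration (the int regex ignores the 
value range of the real type, and "alpha" is a hypothetical constrained string 
type); compose and split_typed are likewise hypothetical helpers, not the code I 
have been playing with.

```python
import re

# Assumption: a toy mapping from type names to regex fragments.
TYPE_REGEX = {
    "string": r".*",         # unconstrained, greedy
    "int": r"[+-]?\d+",      # lexical form only, no range check
    "alpha": r"[A-Za-z]*",   # hypothetical string type with a pattern
}


def compose(*parts):
    """Build one regex from (name, type) pairs, one named group per part."""
    source = "".join(f"(?P<{name}>{TYPE_REGEX[t]})" for name, t in parts)
    return re.compile(source, re.DOTALL)


def split_typed(pattern, text):
    """Return {name: substring} for a full match, or None otherwise."""
    match = pattern.fullmatch(text)
    return match.groupdict() if match else None


# The "silly" definition: an unconstrained string followed by an int.
PREFIXED_INT = compose(("prefix", "string"), ("number", "int"))

# The same definition with a pattern on the string part.
ALPHA_PREFIXED_INT = compose(("prefix", "alpha"), ("number", "int"))
```

With these definitions, split_typed(PREFIXED_INT, "foo1234") gives 
{"prefix": "foo123", "number": "4"}: the greedy string swallows everything and 
the engine backtracks only far enough to leave one digit for the int. The 
constrained variant, split_typed(ALPHA_PREFIXED_INT, "foo1234"), gives 
{"prefix": "foo", "number": "1234"}.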


[0]http://www.simonstl.com/projects/fragment/

-- 
Robin Berjon <robin.berjon@[...].fr> 
Research Engineer, Expway        http://expway.fr/
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488

