Re: [xml-dev] Separation of Concerns (was Re: [xml-dev] The XML 1.1 Candidate Recommendation is published)
by Karl Waclawek
Oct 16 2002 2:58PM
> From: "Karl Waclawek" <karl@[...].net>
>
> > I am sure there will be (or are) generic libraries for that kind of
> > Unicode processing. To me this looks as if there is no proper
> > "separation of concerns", i.e. an XML processor should not concern
> > itself with the issue of normalization.
>
> Two comments
>
> 1) Character, encoding and normalization issues are simply too
> hard for programmers to do.
That's why you don't do it yourself, but use libraries for
Unicode string comparison, etc. It is old hat, for instance,
that you can't always perform binary comparison of strings;
that was true even before Unicode.
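To illustrate the point about binary comparison, here is a minimal
sketch in Python using the standard unicodedata module. The helper name
canonically_equal and the choice of NFC as the comparison form are
assumptions made for this example, not something the thread specifies.

```python
import unicodedata

# Two spellings of the same text: a precomposed character versus a base
# letter followed by a combining mark.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC.

    Hypothetical helper: NFC is an assumption here; any single
    normalization form applied to both sides would serve.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain (binary) comparison treats the two spellings as different,
# while the library-backed comparison treats them as equal.
binary_equal = composed == decomposed
library_equal = canonically_equal(composed, decomposed)
```

The comparison logic lives entirely in the generic Unicode library, so
any application can use it, XML or not.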
> XML provides the only real
> gateway where these things can be handled transparently,
> to shield the programmer from having to be aware of them,
> (to a great extent.)
Only for XML applications. What about the other applications?
People are still writing non-XML applications...
And what if the definition of normalization changes?
Then you have to update it in two places: your generic
Unicode libraries, and all XML processors that have it.
> It is a spurious "separation of concerns"
> to rely on layers that don't exist, IYSWIM.
If Unicode layers don't exist yet (to some degree they do!),
then they sure will exist in the near future.
> 2) When I originally added normalization to opening XML
> files for a product, I found it slowed things down a lot
> (more than transcoding.) But I soon found that just by
> adding a small test to see if my data was all < U+300
> (and therefore I didn't need to use the bulkier normalization
> routines) it becomes insignificant for most Western documents.
> So even though checking for normalization may add
> slight complexity to parsers, it may not have any significant
> performance impact, except on documents containing characters
> where normalization may be important.
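For reference, the quick check described in point 2 might look roughly
like the sketch below, written in Python with the standard unicodedata
module. The function name normalize_fast is hypothetical, and NFC is
assumed as the target form since the message does not name one. The
shortcut relies on combining marks starting at U+0300, so it holds for
NFC but would not hold for NFD, where precomposed Latin-1 letters
decompose.

```python
import unicodedata


def normalize_fast(text: str) -> str:
    """Return text in NFC, skipping the full routine when possible.

    Hypothetical helper. Assumes NFC is the form wanted: text made up
    only of code points below U+0300 contains no combining marks, so the
    bulkier normalization call can be skipped.
    """
    # Cheap scan: is every code point below U+0300?
    if all(ord(ch) < 0x300 for ch in text):
        return text
    # Slow path: hand the text to the generic normalization routine.
    return unicodedata.normalize("NFC", text)


western = normalize_fast("caf\u00e9")     # fast path, returned unchanged
combining = normalize_fast("cafe\u0301")  # slow path, gets composed
```

The scan is a single pass over the input, so its cost relative to the
rest of parsing depends on how tight the parser's inner loop already is.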
What parser are you using?
In high performance parsers like Expat this sure makes a difference.
Just changing the calling convention in Expat can make a 10%
speed difference.
But performance isn't the main issue, IMO.
Karl
-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org> , an
initiative of OASIS <http://www.oasis-open.org>
The list archives are at http://lists.xml.org/archives/xml-dev/
To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>