Re: [xml-dev] The XML 1.1 Candidate Recommendation is published
by Amelia A Lewis
Oct 16 2002 12:59PM
Hmm.
On Wed, 2002-10-16 at 06:56, Elliotte Rusty Harold wrote:
> C0 control characters such as form feed, vertical tab, BEL, and DC1
> through DC4 (whatever those are) are now allowed in XML text. However, they
> must be escaped as character references. They cannot be included literally in
> data. Nulls, thankfully, are still forbidden.
I don't understand why this is. If you're allowing all sorts of control
characters, as long as they are encoded, what difference would it make to allow a
null? Either the things stay safely encoded, in which case null is no
different than the other controls, or they don't, in which case null is
no different than the other controls.
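To make the rule under discussion concrete, here is a minimal Python sketch of "controls must be escaped, null is still forbidden". The helper name escape_controls is hypothetical, and the character ranges are assumed from the RestrictedChar production of the XML 1.1 spec as I read it; the Candidate Recommendation text may differ in detail.

```python
import re

# Assumed RestrictedChar ranges: C0 controls other than tab, LF and CR,
# plus DEL and the C1 controls other than NEL (#x85).
_RESTRICTED = re.compile(r"[\x01-\x08\x0B\x0C\x0E-\x1F\x7F-\x84\x86-\x9F]")


def escape_controls(text: str) -> str:
    """Hypothetical helper: replace restricted controls with character references."""
    if "\x00" in text:
        # U+0000 is outside the Char production, so not even &#x0; is allowed.
        raise ValueError("U+0000 cannot appear in XML 1.1, even as a reference")
    return _RESTRICTED.sub(lambda m: "&#x%X;" % ord(m.group()), text)
```

Under these assumptions a form feed comes out as &#xC; and BEL as &#x7;, while tab, LF, CR and NEL pass through untouched. Null is the one control with no escaped spelling at all, which is the asymmetry being questioned here.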
> The C1 control characters such as BPH, IND, NBH, and PU1 are no longer
> allowed as literals in XML text. They too must now be escaped as character
I like this, in some ways. If controls are going to be allowed at all,
then they should be handled *somehow*, and encoding seems to be the
choice of the moment. I at least like the idea that C1 is to be treated
with the same disdain that C0 gets.
> references. For the first time this means that some well-formed XML 1.0
> documents are not well-formed XML 1.1 documents. The exception, of course, is
> IBM's holy grail of NEL, which will be allowed in literal XML text, just to
> make life difficult for every text editor on the planet except those from IBM
> mainframes.
Here, I get confused. I went and looked at the 1.1 spec. There's a
change to the discussion of line endings, which says that #xD #x85,
#x85, and #x2028 get normalized to #xA, just like #xD #xA or a #xD
followed by anything else.
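The line-end rule described above can be sketched in a few lines of Python. This is only an illustration of my reading of the 1.1 text, not a parser; the function name normalize_line_ends is made up for the example.

```python
import re

# Two-character sequences are listed first so that #xD #xA and #xD #x85
# each collapse to a single #xA rather than two.
_LINE_END = re.compile("\r\n|\r\x85|\r|\x85|\u2028")


def normalize_line_ends(text: str) -> str:
    """Translate every XML 1.1 line-end form to a single #xA."""
    return _LINE_END.sub("\n", text)
```

After this pass, literal input contains no #xD, #x85 or #x2028 at all, which is why the question of whether they count as space characters only arises for characters that arrive some other way, such as through character references.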
However, the production for S is not changed, so although these things
participate in line endings, they aren't space characters. Is that
correct?
If the answer is "it doesn't matter, line end processing happens before
checking for space," then the S production still ought to be changed
(for clarity), to remove #xD, which is as can't-appear in that situation
as any of the new bits. But it makes more sense to me that anything
considered to be part of a line ending ought to be listed in S, which
would become: #x9 #xA #xD #x20 #x85 #x2028. I don't understand the
inconsistency.
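To show the inconsistency in concrete terms, here is a small Python sketch comparing the unchanged S production, (#x20 | #x9 | #xD | #xA)+, with the extended version suggested above. Both function names are hypothetical, and the second pattern is only the proposal from this message, not anything in the spec.

```python
import re

# S as it stands in the spec: one or more of #x20, #x9, #xD, #xA.
_S_CURRENT = re.compile(r"[\x20\x09\x0D\x0A]+")

# S as proposed in this message: the same set plus #x85 and #x2028.
_S_PROPOSED = re.compile(r"[\x20\x09\x0D\x0A\x85\u2028]+")


def matches_s(text: str) -> bool:
    """True if the whole string matches the current S production."""
    return _S_CURRENT.fullmatch(text) is not None


def matches_s_proposed(text: str) -> bool:
    """True if the whole string matches the extended S suggested here."""
    return _S_PROPOSED.fullmatch(text) is not None
```

With the current production, matches_s("\x85") is false even though a literal #x85 is treated as a line end, while #xD still matches S although normalization removes every literal #xD before S is ever checked.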
But the whole thing seems to be nearly as weird as the Namespaces 1.1
rec, which seems to think that because the only way to have no namespace
is to allow undeclaration of the default namespace, then named prefixes
also ought to be undeclared. Pure hobgoblin: foolish consistency.
Amy!
--
Amelia A. Lewis amyzing@[...].com alicorn@[...].com
The law, in its majestic equality, forbids the rich as well as the poor
to sleep under bridges, to beg in the streets, and to steal bread.
-- Anatole France, "Le Lys Rouge"
-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org> , an
initiative of OASIS <http://www.oasis-open.org>
The list archives are at http://lists.xml.org/archives/xml-dev/
To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>