xml-dev
Re: Why the Infoset?
by Rick JELLIFFE
Aug 1 2000 12:45PM
Sean McGrath wrote:
>  
>  [Rick Jelliffe]
>  >
>  >That there can be several different lexical forms in XML for the same
>  >information item allows one to use text-based tools such as UNIX tools.
>  >(The one I recommend is always to keep markup and data for titles and
>  >searchable strings on a single line, so that greps will work.)
>  
>  But the fact that the "same item" can have so many different lexical
>  forms means that getting the right answer every time necessitates
>  parsing the XML.

Err, yes. If you parse some text as a series of lines you will get one
result; if you parse it as XML you will get another; and if you parse it
as C, yet another.  So what?
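
A minimal sketch of that point, in Python (an illustration added to the archived message, not something from the thread; the sample documents and the helper names grep_lines and title_info are hypothetical). It takes two lexical forms of the same title element and compares what a line-oriented search sees with what an XML parser reports:

```python
import xml.etree.ElementTree as ET

# Two lexical forms of the same element (hypothetical sample data).
# form_b splits the tags across lines, uses single quotes and spaces
# around "=", and writes the letter "I" as a character reference.
form_a = '<doc><title lang="en">Why the Infoset?</title></doc>'
form_b = """<doc><title
    lang = 'en'>Why the &#73;nfoset?</title
></doc>"""


def grep_lines(text, needle):
    """Line-oriented view: return the lines that contain needle, like grep."""
    return [line for line in text.splitlines() if needle in line]


def title_info(text):
    """Parsed-as-XML view: return (tag, attributes, text) of the title."""
    title = ET.fromstring(text).find("title")
    return (title.tag, dict(title.attrib), title.text)


# The line-based search finds the string in one form but not the other.
print(grep_lines(form_a, "Infoset"))
print(grep_lines(form_b, "Infoset"))

# Parsed as XML, both forms carry the same information.
print(title_info(form_a) == title_info(form_b))
```

Keeping the title on a single line, without character references, is what makes the grep in the first case reliable; the parsed comparison does not depend on that convention.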

>  >The infoset lets people know what information will be in the parsed XML,
>  >regardless of which lexical form was used.
>  >
>  
>  All forms of accurate XML data processing - even the dumbest
>  lexical processing - involve parsing of some form. The idea
>  that
>          non-parsed <--> parsed
>  are two ends of an extreme with clear blue water in
>  between does not seem right to me.

I would agree, but in discussing XML processing,
"parsed" is shorthand for "parsed-as-XML" and "non-parsed" is shorthand
for "not parsed-as-XML".  There is no need for me to write "what
information will be in the parsed-as-XML XML": in the context of talking
about XML processing, what is meant by "parsed XML" should be obvious.

The XML Infoset is not a set of categories determined by science or
nature; it is a policy document, derived by engineering and negotiation,
which identifies and grades the various kinds of information that a
parsed XML document has, for use in various W3C specs. Having an Infoset
spec gives spec-making groups an indication of what the mainstream
requirements are: "should the DOM report line numbers?" is the kind of
question the Infoset could help answer.
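
To make the line-number question concrete, here is a small sketch in Python (an added illustration, not from the thread; the class name LineReporter and the sample document are hypothetical). It uses the standard library's SAX interface, where the parser may hand the application a locator, to record the line on which each element starts. Whether an API exposes this kind of positional information is exactly the sort of choice a spec has to make:

```python
import xml.sax


class LineReporter(xml.sax.ContentHandler):
    """Record (element name, line number) for every start tag."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.lines = []

    def setDocumentLocator(self, locator):
        # Called by the parser before parsing starts, if it supports locators.
        self.locator = locator

    def startElement(self, name, attrs):
        self.lines.append((name, self.locator.getLineNumber()))


# Hypothetical three-line sample document.
sample = b"<doc>\n  <title>Why the Infoset?</title>\n</doc>"

handler = LineReporter()
xml.sax.parseString(sample, handler)
print(handler.lines)
```

An application that only sees a tree of elements, attributes, and text has no way to recover these numbers unless the interface it uses chooses to report them.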

Rick Jelliffe
