perl-xml
Re: Re: XML::Parser & "invalid character"
by Duncan Cameron
Mar 29 2002 10:21PM
On 2002-03-29 Jenda Krynicky wrote:
> From: Duncan Cameron <dcameron@[...].uk>
> > On 2002-03-29 Jenda Krynicky wrote:
> > >I'm using XML files to replicate some settings and other stuff 
> > >between several servers (thanks for your previous help!). Everything
> > >is cool except one thing.
> > >
> > >I use the character with code 2 as a separator or marker in several
> > >places in the database.
> > >The problem is that if I write a string containing a chr(2) into an
> > >XML file, XML::Parser (used via XML::Simple) will refuse to parse the
> > >file.
> 
> > XML doesn't allow such a character value, see the XML character
> > definition http://www.w3.org/TR/2000/REC-xml-20001006#charsets
> 
> Aaaagrrrrrrr. Someone thought they're clever ...
> 
> Who would it hurt if the parsers allowed &#anynumber; ?
> 
> > >And if not, what character would you recommend to be used 
> > >(escaped if necessary) to make XML::Parser happy, but still being
> > >reasonably safe that it will not be mistaken for "normal" data.
> > That depends on what your application defines as 'normal data'.
> > Not sure that I fully understand what you want to do so I can't really
> > suggest anything.
> 
> Basically all I want is to write some data from the database on one 
> computer to a file and read them in on another. I did not expect to 
> be restricted to "text only".
> 
> Anyway, thanks to a suggestion by Chris Strom I'll do it this way ... 
> 
> 	1) if the string doesn't contain any "forbidden" characters :
> 
> 		escape what is necessary and write "<TAG>$text</TAG>"
> 
> 	2) otherwise
> 
> 		escape & and > as usual, escape the forbidden ones as
> 		&#...;, write "<TAG><![CDATA[$text]]></TAG>" to keep
> 		the escapes away from XML::Parser and when reading
> 		unescape myself.
> 
> It's not nice, but it gets the job done.
> 
You're right, it's not nice, and you might find further problems downstream.
You appear to be building in too much 'magic'. Bear in mind that &#x02; is not a
valid XML character reference: a reference may only name a character that XML
itself allows. Every time you or someone else processes the parsed data, these
pseudo character entities will then have to be expanded by hand.
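
To make the restriction concrete, here is a minimal sketch of a check for
characters outside the XML 1.0 Char production (the definition linked earlier
in the thread). The helper name is_xml_safe is made up for illustration and
is not part of XML::Parser or XML::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: true if every character in $text matches the
# XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
sub is_xml_safe {
    my ($text) = @_;
    # The negated class matches any character NOT allowed by the production.
    return $text !~ /[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/;
}

my $plain  = "ordinary text\n";
my $marked = "left" . chr(2) . "right";   # chr(2) used as a separator

print "plain is ",  (is_xml_safe($plain)  ? "safe" : "not safe"), "\n";
print "marked is ", (is_xml_safe($marked) ? "safe" : "not safe"), "\n";
```

A check like this only tells you which fields need special treatment. It does
not make chr(2) representable, whether written literally or as &#x02;.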

If your database data can contain 'binary' values then you might consider encoding
all fields as base64, or at least those fields which may contain binary data.
base64 encoding increases the size of your data by about one third (four output
characters for every three input bytes) but otherwise is a clean way to do it.
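
A rough sketch of the base64 approach, using the core MIME::Base64 module.
The helper names encode_field and decode_field and the encoding="base64"
attribute are assumptions made for illustration, not an established
convention. The values are assumed to be byte strings; text containing wide
characters would need to be encoded to bytes (for example as UTF-8) first.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helpers wrapping MIME::Base64 for a single field value.
sub encode_field {
    my ($value) = @_;
    # Empty second argument: no line breaks are inserted into the output.
    return encode_base64($value, '');
}

sub decode_field {
    my ($encoded) = @_;
    return decode_base64($encoded);
}

my $raw     = "left" . chr(2) . "right";   # contains the forbidden chr(2)
my $encoded = encode_field($raw);

# The attribute is an assumed marker so the reader knows to decode the field.
my $xml = qq{<TAG encoding="base64">$encoded</TAG>};
print "$xml\n";

# On the receiving side, decode after parsing to recover the original bytes.
print "round trip ok\n" if decode_field($encoded) eq $raw;
```

Because the base64 alphabet is plain ASCII letters, digits, '+', '/' and '=',
the encoded value needs no further XML escaping.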

Regards,
Duncan Cameron




