perl-xml
Handling £ (pound sterling) symbols in content
by Neil Hughes
Jun 27 2009 2:24PM
I've hit a problem in XML::Twig trying to handle data exported from a 
legacy database, but I suspect this is an issue I need to get some 
advice on regardless of the parser...

The data contains '£' symbols which I'm struggling to format in XML for 
processing later on. The following code might help explain:

------------ BEGIN --------------

use strict;
use warnings;

use XML::Twig;

my $t= XML::Twig-> new();

# this is OK
#my $input = '<?xml version="1.0"?><root><item>one</item><item>two</item><item>three</item></root>';


# this is invalid
#my $input = '<?xml version="1.0"?><root><item>one £</item><item>two</item><item>three</item></root>';

# this is OK
#my $input = '<?xml version="1.0"?><root><item><![CDATA[one]]></item><item><![CDATA[two]]></item><item><![CDATA[three]]></item></root>';


# this is invalid
my $input = '<?xml version="1.0"?><root><item><![CDATA[one £]]></item><item><![CDATA[two]]></item><item><![CDATA[three]]></item></root>';


$t-> parse($input);
$t-> print;

------------ END --------------

Whether I wrap my text data in CDATA or not, as soon as I include a 
pound sterling symbol I get the following error:

not well-formed (invalid token) at line 1, column 46, byte 46 at 
/usr/local/ActivePerl-5.8/lib/XML/Parser.pm line 187
  at /Users/nkh/Documents/Dev/Perl/xml_twig/pound_test1.pl line 14

Byte 46 seems to align with the '£', so I'm wondering what I need to do 
to get this character not to break the parser.
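
One possible explanation, offered as a sketch rather than a confirmed diagnosis: with no encoding declaration the parser treats the document as UTF-8, and a pound sign stored as the single byte 0xA3 (as it is in ISO-8859-1) is not valid UTF-8. The script below assumes the legacy export really is ISO-8859-1, which is a guess on my part. It locates the offending byte, then tries two workarounds: declaring the encoding in the XML declaration, or transcoding to UTF-8 with the core Encode module. The helper is_valid_utf8 and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the export is ISO-8859-1, so the pound sign is the lone byte 0xA3.
my $input = '<?xml version="1.0"?><root><item><![CDATA[one ' . "\xA3"
          . ']]></item></root>';

# 0-based byte offset of the pound sign (46 for this string).
my $offset = index($input, "\xA3");

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # copy, because utf8::decode modifies its argument
    return utf8::decode($bytes) ? 1 : 0;
}

my $valid_before = is_valid_utf8($input);

# Option 1: leave the bytes alone and declare the encoding they are in.
(my $declared = $input) =~ s/\?>/ encoding="ISO-8859-1"?>/;

# Option 2: transcode the bytes to UTF-8, the default when nothing is declared.
my $utf8 = encode('UTF-8', decode('ISO-8859-1', $input));
my $valid_after = is_valid_utf8($utf8);

printf "pound sign at byte %d; valid UTF-8 before: %d, after: %d\n",
    $offset, $valid_before, $valid_after;

# Parse both variants if XML::Twig is installed (loaded at run time so the
# rest of the sketch still runs without it).
if (eval { require XML::Twig; 1 }) {
    for my $doc ($declared, $utf8) {
        my $t = XML::Twig->new();
        $t->parse($doc);    # dies if the document is not well-formed
        print "parsed OK\n";
    }
}
```

If the data turns out to be in some other legacy encoding, the same approach should apply with that encoding's name passed to decode. A third route that avoids the encoding question outside CDATA sections is to write the character reference &#163; in place of the raw byte.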

-- 
Neil Hughes