Handling £ (pound sterling) symbols in content
by Neil Hughes
Jun 27 2009 2:24PM
I've hit a problem in XML::Twig trying to handle data exported from a
legacy database, but I suspect this is an issue I need to get some
advice on regardless of the parser...
The data contains '£' symbols which I'm struggling to format in XML for
processing later on. The following code might help explain:
------------ BEGIN --------------
use strict;
use warnings;
use XML::Twig;

my $t = XML::Twig->new();
# this is OK
#my $input = '<?xml version="1.0"?><root><item>one</item><item>two</item><item>three</item></root>';
# this is invalid
#my $input = '<?xml version="1.0"?><root><item>one £</item><item>two</item><item>three</item></root>';
# this is OK
#my $input = '<?xml version="1.0"?><root><item><![CDATA[one]]></item><item><![CDATA[two]]></item><item><![CDATA[three]]></item></root>';
# this is invalid
my $input = '<?xml version="1.0"?><root><item><![CDATA[one £]]></item><item><![CDATA[two]]></item><item><![CDATA[three]]></item></root>';
$t->parse($input);
$t->print;
------------ END --------------
Whether I wrap my text data in CDATA or not, as soon as I include a
pound sterling symbol I get the following error:
not well-formed (invalid token) at line 1, column 46, byte 46 at
/usr/local/ActivePerl-5.8/lib/XML/Parser.pm line 187
at /Users/nkh/Documents/Dev/Perl/xml_twig/pound_test1.pl line 14
Byte 46 seems to align with the '£', so I'm wondering what I need to do
to get this character not to break the parser.
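
One way to test that suspicion is to look at the raw bytes. The following is only a minimal sketch, and it rests on an assumption not confirmed anywhere above: that the exported data (and the script file) are ISO-8859-1, where '£' is the single byte 0xA3, while the document has no encoding declaration and so is read as UTF-8. It uses only the core Encode module; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the legacy data is ISO-8859-1, so '£' arrives as the byte 0xA3.
my $bytes = '<?xml version="1.0"?><root><item><![CDATA[one '
          . "\xA3"
          . ']]></item></root>';

# Find the first non-ASCII byte and report its 0-based offset.
my $offset = $bytes =~ /[^\x00-\x7F]/ ? $-[0] : -1;
printf "first non-ASCII byte: 0x%02X at offset %d\n",
    ord(substr($bytes, $offset, 1)), $offset;

# A strict UTF-8 decode croaks on a lone 0xA3, much as a parser that
# assumes UTF-8 input would reject it.
my $is_utf8 = eval {
    my $copy = $bytes;    # decode() may modify its input when CHECK is set
    decode('UTF-8', $copy, Encode::FB_CROAK);
    1;
} ? 1 : 0;
print "valid UTF-8: ", ($is_utf8 ? "yes" : "no"), "\n";

# Re-encoding from Latin-1 to UTF-8 turns 0xA3 into the pair 0xC2 0xA3.
my $utf8 = encode('UTF-8', decode('ISO-8859-1', $bytes));
printf "pound sign as UTF-8: %s\n",
    join ' ', map { sprintf '0x%02X', ord } split //, substr($utf8, $offset, 2);
```

If the assumption holds, the reported offset should line up with the byte named in the error message, and the candidates to try would be converting the data to UTF-8 before parsing or declaring the real encoding in the XML declaration. Neither has been tried against XML::Twig here.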
--
Neil Hughes