perl-xml
Perl, XML and UTF-8
by Claude Paroz
May 29 2006 3:22PM

Hi,

I have some Perl (5.8.7) code that reads XML (UTF-8 encoded) with
XML::Simple or XML::LibXML, and writes the content back to an HTML page
through CGI.

Snippet :

use XML::LibXML;
use CGI qw/:standard/;
use Locale::gettext;

my $q = CGI->new;

my $xml   = XML::LibXML->new();
my $data  = $xml->parse_file($xmlfile);
my $root  = $data->getDocumentElement;
my @lines = $root->getElementsByTagName('sometag');

print $q->header(-type => 'text/html', -charset => 'UTF-8');
print $q->start_html(-title => gettext("My title"),
	-encoding => "UTF-8");
# Text of the first 'subtag' under the first 'sometag' element
print $q->h1($lines[0]->getElementsByTagName('subtag')->item(0)->textContent);
print $q->end_html;

************* End of Code ***************

My problem is that special characters (accented letters) aren't
encoded correctly when passed to the HTML output. Each special char is
displayed as a question mark inside a square. However, the
utf8::is_utf8 function returns 1 for these strings.

I also noticed that when certain other special characters are present in
a string in the XML file (e.g. ™ (trademark)), the encoding is OK in the
resulting HTML. Weird...
What could be the problem?
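
The difference between the two cases may come from how Perl prints character strings to a filehandle that has no encoding layer. This is offered as a possible explanation, not a confirmed diagnosis of the code above. In that situation Perl writes a string as Latin-1 bytes when every character is below U+0100, and falls back to UTF-8 bytes (with a "Wide character" warning) when any character is higher. The sketch below tries to reproduce this with an in-memory filehandle. The sample strings and the helpers bytes_written and hex_dump are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;

# Hypothetical sample strings, not taken from the original XML file.
my $accented  = "caf\x{e9}";       # e-acute: every character is below U+0100
my $trademark = "Perl\x{2122}";    # trademark sign: U+2122 is above U+00FF

# Mimic strings returned by an XML parser: stored internally as UTF-8,
# so utf8::is_utf8() is true for both of them.
utf8::upgrade($accented);
utf8::upgrade($trademark);

# Print $string to an in-memory filehandle opened with the given PerlIO
# layer ('' for none) and return the bytes that were actually written.
sub bytes_written {
    my ($string, $layer) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer
        or die "cannot open in-memory handle: $!";
    {
        no warnings 'utf8';        # silence "Wide character in print"
        print {$fh} $string;
    }
    close $fh or die "cannot close in-memory handle: $!";
    return $buffer;
}

# Render a byte string as space-separated hex for display.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

for my $layer ('', ':encoding(UTF-8)') {
    printf "layer '%s'\n", $layer;
    printf "  accented : %s\n", hex_dump(bytes_written($accented,  $layer));
    printf "  trademark: %s\n", hex_dump(bytes_written($trademark, $layer));
}
```

Without a layer, the accented string should come out with a single E9 byte, which is not valid UTF-8 on a page declared as UTF-8, while the trademark string should come out as valid UTF-8. If that matches the behaviour seen in the CGI script, calling binmode(STDOUT, ':encoding(UTF-8)') before the first print is one commonly suggested remedy.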

Regards.

Claude
