perl-xml
Re: utf-8 (or not) encoding question
by Dominic Mitchell
Dec 11 2004 8:22AM
On Fri, Dec 10, 2004 at 12:42:50PM -0800, Joshua Santelli wrote:
>  OK, another quick question.  It looks like my IO was
>  the problem.  LibXML knew it was UTF-8 (at least
>  $source_xml->encoding said so) but this character came
>  in as UTF-8 and out as Latin-1 here:
>  
>    print $fh $source_xml->toString();
>  
>  When I used LibXML's toFH: 
>  
>    my $rc = $source_xml->toFH($fh);
>  
>  that got it right (or maybe I got lucky).  I opened
>  the file handle with:
>  
>    my $fh = new FileHandle ">$xmlFile";
>  
>  Do I really need to specify the UTF-8 encoding for
>  each file handle something like this?
>  
>    my $fh = new FileHandle ">:encoding(utf-8) $xmlFile";
>  
>  Can I trust that toFH() will do the right thing?  What
>  about XML::LibXML's toFile()?  I don't see much about
>  this in the perldoc.

I'm pretty sure that using the LibXML functions directly will work as
expected.  This is because they are implemented internally in libxml2
rather than using Perl's IO layer.

For any IO that's done with Perl, you have to specify that it is in
UTF-8 mode explicitly, either like you did above, or like this:

    open( my $fh, '> :utf8', $xmlFile )
      or die "open(> $xmlFile): $!\n";

or if the file is already open,

    binmode( $fh, ':utf8' );
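
To see the difference on disk, here is a minimal sketch using only core modules. The helper name bytes_written and the sample string are made up for illustration; it writes the same character string once with no layer and once with ':utf8', then reads the raw bytes back.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to a fresh temp file, optionally
# applying an IO layer first, and return the raw bytes found on disk.
sub bytes_written {
    my ( $text, $layer ) = @_;
    my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
    close $tmp_fh;

    open( my $out, '>', $path ) or die "open(> $path): $!\n";
    binmode( $out, $layer ) if defined $layer;
    print {$out} $text;
    close $out or die "close($path): $!\n";

    # Read back with :raw so no decoding happens on the way in.
    open( my $in, '<:raw', $path ) or die "open(< $path): $!\n";
    local $/;
    my $bytes = <$in>;
    close $in;
    return $bytes;
}

my $text = "caf\x{e9}";    # character string; U+00E9 is e-acute

my $plain = bytes_written($text);             # no layer on the handle
my $utf8  = bytes_written( $text, ':utf8' );  # handle marked as UTF-8

printf "no layer: %d bytes\n", length $plain;    # 4 bytes, e-acute as 0xE9
printf ":utf8:    %d bytes\n", length $utf8;     # 5 bytes, e-acute as 0xC3 0xA9
```

Without the layer the character goes out as the single Latin-1 byte 0xE9, which matches the symptom you saw with print; with ':utf8' it goes out as the two-byte UTF-8 sequence.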

-Dom