Re: utf-8 (or not) encoding question
by Dominic Mitchell
Dec 11 2004 8:22AM
On Fri, Dec 10, 2004 at 12:42:50PM -0800, Joshua Santelli wrote:
> OK, another quick question. It looks like my IO was
> the problem. LibXML knew it was UTF-8 (at least
> $source_xml->encoding said so) but this character came
> in as UTF-8 and out as Latin-1 here:
>
> print $fh $source_xml->toString();
>
> When I used LibXML's toFH:
>
> my $rc = $source_xml->toFH($fh);
>
> that got it right (or maybe I got lucky). I opened
> the file handle with:
>
> my $fh = new FileHandle ">$xmlFile";
>
> Do I really need to specify the UTF-8 encoding for
> each file handle something like this?
>
> my $fh = new FileHandle ">:encoding(utf-8) $xmlFile";
>
> Can I trust that toFH() will do the right thing? What
> about XML::LibXML's toFile()? I don't see much about
> this in the perldoc.

I'm pretty sure that using the LibXML output functions (toFH() and
toFile()) directly will work as expected. This is because they are
implemented internally in libxml2 rather than using Perl's IO layer.

For any IO that's done with Perl, you have to specify that it is in
UTF-8 mode explicitly, either like you did above, or like this:

    open( my $fh, '> :utf8', $xmlFile )
        or die "open(> $xmlFile): $!\n";

or, if the file is already open:

    binmode( $fh, ':utf8' );
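
To see what the explicit layer does on disk, here is a minimal sketch using only core Perl. The temporary file and the sample string are placeholders chosen for illustration (they are not from the original thread), and it uses the stricter ':encoding(UTF-8)' spelling of the layer rather than ':utf8'.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for $xmlFile.
my ( undef, $xmlFile ) = tempfile( UNLINK => 1 );

# A character string holding one non-ASCII character (U+00E9).
my $text = "caf\x{e9}\n";

# Write through an explicit encoding layer so Perl emits UTF-8 bytes.
open( my $out, '>:encoding(UTF-8)', $xmlFile )
    or die "open(> $xmlFile): $!\n";
print {$out} $text;
close($out) or die "close($xmlFile): $!\n";

# Read the file back as raw bytes to see what actually reached the disk.
open( my $in, '<:raw', $xmlFile )
    or die "open(< $xmlFile): $!\n";
my $bytes = do { local $/; <$in> };
close($in);

# U+00E9 becomes the two bytes c3 a9, so five characters become six bytes.
printf "%d bytes: %s\n", length($bytes), unpack( 'H*', $bytes );
# prints "6 bytes: 636166c3a90a"
```

Without the layer, the same print would write the single Latin-1 byte e9 for that character, which matches the symptom described above.
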
-Dom
_______________________________________________
Perl-XML mailing list
Perl-XML@[...].com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs