Encode and emitting the little endian form of UTF-16 (not UTF-16LE)
by Demerphq
May 23 2007 8:53AM
Hi Dan,
I was wondering if there is some way to get Encode to emit the
little-endian version of UTF-16 (with BOM), as a typical Win32 app on
Intel would do. It seems to me that currently
  my $octets = encode('UTF-16', $string);
will only emit the big-endian form.
Of course well-behaved apps shouldn't care, but some do. I also know I
can emit the BOM by hand like so:
  my $octets = encode('UTF-16LE', chr(0xFEFF) . $string);
but this struck me as a bit convoluted, and it makes this a bit tricky
to do with IO layers. If there isn't a way to do it currently, maybe the
name 'UTF-16:le' or something similar could be used for this?
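For reference, here is a minimal sketch of the behaviour and workaround described above. It is only an illustration: the sample string, the `hexdump` helper, and the temporary file are assumptions made up for this example, and the IO-layer part is one possible way to do it, not a built-in Encode feature. `:raw` is used so the octets reach the file unmodified.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical helper for this example: render octets as hex pairs.
sub hexdump { join ' ', map { sprintf '%02x', $_ } unpack 'C*', shift }

my $string = "Hi";    # sample text, chosen only for illustration

# Plain 'UTF-16' prepends a BOM and writes big-endian code units.
my $be = encode('UTF-16', $string);

# 'UTF-16LE' writes no BOM, so U+FEFF is prepended by hand.
my $le = encode('UTF-16LE', chr(0xFEFF) . $string);

print hexdump($be), "\n";    # fe ff 00 48 00 69
print hexdump($le), "\n";    # ff fe 48 00 69 00

# Same idea with an IO layer: print the BOM character first,
# then the text, through an encoding(UTF-16LE) layer.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>:raw:encoding(UTF-16LE)', $path or die "open: $!";
print {$out} chr(0xFEFF), $string;
close $out or die "close: $!";

# Read the file back as raw octets to compare with the encode() result.
open my $in, '<:raw', $path or die "open: $!";
my $file_octets = do { local $/; <$in> };
close $in;

print hexdump($file_octets), "\n";    # ff fe 48 00 69 00
```

The drawback in the IO-layer case is that the caller has to remember to print the BOM exactly once, at the start of the stream, which is the awkwardness a dedicated encoding name would avoid.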
Also it looks like there is a typo in the quick reference table of
Encode::Unicode:
Quick Reference
                Decodes from ord(N)           Encodes chr(N) to...
       octet/char BOM S.P d800-dfff  ord > 0xffff     \x{1abcd} ==
  ---------------+-----------------+------------------------------
  UCS-2BE       2   N   N  is bogus                  Not Available
  UCS-2LE       2   N   N     bogus                  Not Available
  UTF-16      2/4   Y   Y  is   S.P           S.P            BE/LE
  UTF-16BE    2/4   N   Y       S.P           S.P    0xd82a,0xdfcd
  UTF-16LE      2   N   Y       S.P           S.P    0x2ad8,0xcddf
  UTF-32        4   Y   -  is bogus         As is            BE/LE
  UTF-32BE      4   N   -     bogus         As is       0x0001abcd
  UTF-32LE      4   N   -     bogus         As is       0xcdab0100
  UTF-8       1-4   -   -     bogus   >= 4 octets   \xf0\x9a\af\8d
  ---------------+-----------------+------------------------------
Shouldn't UTF-16LE also be 2/4 like the other UTF-16 variants?
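A quick check along these lines, using the same code point as the table's last column, shows a character above 0xffff taking four octets (a surrogate pair) in UTF-16LE. This is just an illustrative sketch; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+1ABCD lies above 0xffff, so UTF-16 needs a surrogate pair for it.
my $octets = encode('UTF-16LE', "\x{1abcd}");

# Render the octets as hex pairs for comparison with the table.
my $hex = join ' ', map { sprintf '%02x', $_ } unpack 'C*', $octets;

print length($octets), "\n";    # 4
print "$hex\n";                 # 2a d8 cd df
```

The four octets match the 0x2ad8,0xcddf entry in the UTF-16LE row, which is consistent with that row reading 2/4 octets per character.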
cheers,
yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"