Re: Getting encoding declaration with XML::SAX
by A. Pagaltzis
Mar 1 2006 12:53PM
* Timothy Appnel <tappnel@[...].com> [2006-03-01 19:30]:
> Nevertheless, I ended up rectifying the problem by converting
> anything that isn't UTF-8 before parsing. I do a quick regex to
> check if encoding has been declared to know if I need to do a
> conversion and if so, what I'm converting from. So all
> documents get run through the parser as UTF-8 and output as
> UTF-8.
I'm not sure that hack is robust. Though I'm not sure it's not.

Is it necessary tho? The parser is supposed to resolve numeric
character references before handing you character data. That's
only possible for arbitrary codepoints if the strings it gives
you are Unicode.
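
To illustrate the parser side, here is a minimal sketch. It uses Python's
standard-library xml.sax rather than Perl's XML::SAX, purely so the example
is self-contained; the names TextCollector and parse_text are made up for
illustration. The same text is fed in as Latin-1 bytes and as UTF-8 bytes,
and in both cases the handler should receive identical Unicode strings, with
the numeric character reference already resolved.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data it is given."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces; keep them all.
        self.chunks.append(content)


def parse_text(xml_bytes):
    """Parse a byte string and return the character data as one str."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.chunks)


# Same content, two encodings. The e-acute is one byte in Latin-1 and two
# bytes in UTF-8; the euro sign is written as a numeric character reference.
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\xe9 &#x20AC;</p>'
utf8_doc = b'<?xml version="1.0" encoding="UTF-8"?><p>caf\xc3\xa9 &#x20AC;</p>'

latin1_text = parse_text(latin1_doc)
utf8_text = parse_text(utf8_doc)
```

Neither call needed the input converted beforehand: the parser reads the
encoding declaration itself and decodes accordingly.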
Likewise, if you want to output documents in a particular
encoding, standard operation is that you declare your intent to
the serialiser and then hand it data in Unicode, which it then
converts as necessary, including automatic serialisation of
characters as numeric character references when the codepoint in
question has no equivalent in the target encoding.
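
The output side can be sketched the same way. Again this is Python's
standard library (xml.etree.ElementTree), not any particular Perl
serialiser, and the helper name serialise is made up for illustration. The
serialiser is handed Unicode plus a target encoding, and characters the
target encoding cannot represent come out as numeric character references.

```python
import xml.etree.ElementTree as ET


def serialise(text, encoding):
    """Build a one-element document holding `text`; return it as bytes."""
    elem = ET.Element("p")
    elem.text = text  # plain Unicode string, no manual encoding
    return ET.tostring(elem, encoding=encoding)


unicode_text = "caf\u00e9 \u20ac"  # e-acute and euro sign

# Latin-1 has a byte for e-acute but none for the euro sign, so the euro
# sign is written as a numeric character reference instead.
latin1_out = serialise(unicode_text, "iso-8859-1")

# UTF-8 can represent both characters directly.
utf8_out = serialise(unicode_text, "utf-8")

# Parsing the Latin-1 output again recovers the original Unicode string.
round_tripped = ET.fromstring(latin1_out).text
```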
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>