Re: Getting encoding declaration with XML::SAX
by A. Pagaltzis
Mar 1 2006 12:53PM
* Timothy Appnel <tappnel@[...].com> [2006-03-01 19:30]:
> Nevertheless, I ended up rectifying the problem by converting
> anything that isn't UTF-8 before parsing. I do a quick regex to
> check if encoding has been declared to know if I need to do a
> conversion and if so, what I'm converting from. So all
> documents get run through the parser as UTF-8 and output as
> UTF-8.

I'm not sure that hack is robust. Though I'm not sure it's not.

Is it necessary though? The parser is supposed to resolve numeric
character references before handing you character data. That's
only possible for arbitrary codepoints if the strings it gives
you are Unicode.
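
To illustrate the point about character references, here is a minimal
sketch. It is written in Python rather than Perl and uses the standard
library's xml.etree.ElementTree as a stand-in parser, so it is an
assumption that XML::SAX drivers behave the same way, not a
demonstration of them. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a Latin-1 encoded document holding a raw e-acute
# byte (0xE9) plus a numeric character reference to U+20AC, a code
# point that Latin-1 itself cannot represent.
doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\xe9 &#x20AC;5</p>'

root = ET.fromstring(doc)

# The parser decodes the input bytes and resolves the reference, so the
# application receives one Unicode string regardless of input encoding.
text = root.text
print(repr(text))
```

If a parser behaves like this, the pre-parse conversion step only
changes the bytes that go in, not the strings that come out.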

Likewise, if you want to output documents in a particular
encoding, standard operation is that you declare your intent to
the serialiser and then hand it data in Unicode, which it then
converts as necessary, including automatic serialisation of
characters as numeric character references when the codepoint in
question has no equivalent in the target encoding.
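
The output side can be sketched the same way. Again this is Python's
xml.etree.ElementTree standing in for a Perl serialiser, so treat it as
an illustration of the general behaviour under that assumption; the
element and its content are invented for the example.

```python
import xml.etree.ElementTree as ET

# Build a tree whose text is plain Unicode: e-acute exists in Latin-1,
# the euro sign (U+20AC) does not.
root = ET.Element("p")
root.text = "caf\u00e9 \u20ac5"

# Declare the target encoding to the serialiser and let it convert.
out = ET.tostring(root, encoding="iso-8859-1")
print(out)

# Characters with a Latin-1 equivalent are written as single bytes;
# the euro sign is emitted as a numeric character reference instead.
roundtrip = ET.fromstring(out).text
```

Parsing the serialised bytes again yields the original Unicode string,
which is why the application never needs to handle the target encoding
itself.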

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/> 