perl5-porters
Re: UTF8 in 5.8.1
by Gisle Aas
Feb 28 2005 11:23AM
Aaron Sherman <ajs@[...].com>  writes:

>  Is anyone aware of any limitations in 5.8.1 that would lead to a problem
>  using substr on utf8 strings? I'm getting lots of:
>  
>  	Malformed UTF-8 character (unexpected end of string)
>  
>  errors in a function that's dealing only with strings that are read from
>  a file that was written to a file, and is being read back using the
>  :utf8 encoding layer. I had thought that substr was always safe on such
>  strings, but it's starting to look like that was a vain hope....

The :utf8 layer just slaps on the UTF8 flag, trusting the data it reads
to be well-formed UTF-8. It does no validation of its own.

You can use the :encoding(UTF-8) layer if you don't trust the file
content to be valid UTF-8, and then set $PerlIO::encoding::fallback to
specify what to do with bad sequences.
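
A minimal sketch of that second approach, assuming a reasonably recent perl
where Encode::STOP_AT_PARTIAL is available. The helper name
read_checked_utf8 and the temp-file demo are made up for illustration and are
not from the original report. PERLQQ is only one possible choice of fallback;
here it is used without WARN_ON_ERR so bad bytes are escaped silently.

```perl
use strict;
use warnings;
use Encode ();          # CHECK constants: PERLQQ, STOP_AT_PARTIAL, ...
use PerlIO::encoding;   # provides $PerlIO::encoding::fallback
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): read a file through the
# validating :encoding(UTF-8) layer and return its lines.
sub read_checked_utf8 {
    my ($path) = @_;

    # Set the fallback before open() pushes the layer, and keep it in
    # effect while reading. PERLQQ turns each undecodable byte into a
    # literal \xHH escape; leaving out WARN_ON_ERR keeps it quiet.
    local $PerlIO::encoding::fallback =
        Encode::PERLQQ() | Encode::STOP_AT_PARTIAL();

    open(my $fh, '<:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Demo: one valid line and one line containing a stray 0xFF byte.
my ($out, $file) = tempfile(UNLINK => 1);
binmode($out);                          # write raw bytes, no layer
print {$out} "caf\xC3\xA9\n", "abc\xFFdef\n";
close($out);

binmode(STDOUT, ':encoding(UTF-8)');
for my $line (read_checked_utf8($file)) {
    # substr now works on characters of a string that has been validated
    print substr($line, 0, 4), "\n";
}
```

Because the bad byte has already been replaced by the time the string
reaches your code, substr and friends never see a truncated sequence.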

Regards,
Gisle