perl5-porters
Re: UTF8 in 5.8.1
by Aaron Sherman
Mar 1 2005 6:11AM
On Mon, 2005-02-28 at 11:23 -0800, Gisle Aas wrote:
> Aaron Sherman <ajs@[...].com> writes:
>
> > errors in a function that's dealing only with strings that are read from
> > a file that was written to a file, and is being read back using the
> > :utf8 encoding layer. I had thought that substr was always safe on such
> > strings, but it's starting to look like that was a vain hope....
>
> The :utf8 layer just slaps on the UTF8 flag trusting the data it reads
> to be well formed utf8.

It was well-formed. I was doing something like this:

script 1:

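# Transcode the Windows-1252 name list to UTF-8.
open(IN, "<:encoding(windows-1252)", "name-list") or die "name-list: $!";
open(OUT, "> :utf8", "name-list.utf8") or die "name-list.utf8: $!";
while (<IN>) { print OUT $_ }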

I then verified, using od -c, that name-list.utf8 contained valid UTF-8
sequences.
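
A check like the od -c inspection above could also be done programmatically. The following is only a minimal sketch, assuming a reasonably recent Encode in which "UTF-8" names the strict decoder; the helper name is_valid_utf8 is hypothetical and is not something used in the scripts here.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # decode() may modify its input when a CHECK value is given,
    # so hand it a copy rather than the caller's string.
    my $copy = $bytes;
    # FB_CROAK makes decode() die on the first malformed sequence.
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

# Two in-memory samples (assumed data, not taken from name-list.utf8).
for my $sample ("caf\xc3\xa9", "abc\xff") {
    printf "%s\n", is_valid_utf8($sample) ? "well-formed" : "malformed";
}
```

To apply it to a file, the file would need to be read as raw bytes (for example through a handle opened with "<:raw") so that no layer has touched the data before the check.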

script 2:

my %parts;
my %bits;
open(IN, "<:utf8", "name-list.utf8") or die "name-list.utf8: $!";
while (<IN>) {
	chomp;
	# Tally the last character under the two characters that precede it.
	$parts{substr($_,-3,2)}{substr($_,-1)}++ if length($_) > 2;
}
foreach my $part2 (keys %parts) {
	my $part = substr($part2,-1); # UTF-8 error here
	foreach my $part1 (keys %{$parts{$part2}}) {
		$bits{$part.$part1}++;
	}
}

This is a contrived simplification, and fails to reproduce the problem,
but it gives you an idea of what I was doing. 5.8.3 does not produce the
same warning.
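
For comparison, here is a minimal sketch of the tallying loop from script 2 reading through :encoding(UTF-8), which decodes and checks its input instead of trusting it the way :utf8 does. The in-memory sample data and the lexical filehandle are assumptions for illustration; this is not the original code, and it is not claimed to reproduce or cure the 5.8.1 warning.

```perl
use strict;
use warnings;

# Assumed sample data: UTF-8 encoded bytes, one name per line.
my $bytes = "Ren\xc3\xa9e\nZo\xc3\xab\nAnna\n";

# Read from an in-memory file so the sketch needs no files on disk.
open(my $in, '<:encoding(UTF-8)', \$bytes)
    or die "cannot open in-memory file: $!";

my %parts;
while (my $line = <$in>) {
    chomp $line;
    next unless length($line) > 2;
    # Same tally as script 2: last character, keyed under the
    # two characters that precede it.
    $parts{ substr($line, -3, 2) }{ substr($line, -1) }++;
}
close($in);

# utf8::valid() reports whether a string's internal representation
# is consistent; stop if any key fails that check.
for my $key (keys %parts) {
    die "malformed key in tally\n" unless utf8::valid($key);
}
```

Because the strings are decoded, substr counts characters rather than bytes, so a key such as the last two characters before the final letter never splits a multi-byte sequence.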