perl5-porters
[perl #57040] pos() function doesn't handle unicode well
by Marcela Maslanova
Jul 17 2008 6:49AM
# New Ticket Created by  Marcela Maslanova 
# Please include the string:  [perl #57040]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=57040 > 


generated with the help of perlbug 1.36 running under perl 5.10.0.


-----------------------------------------------------------------
[Please enter your report here]

The pos() function doesn't return correct values for Unicode strings.
For example:
perl -e '$string = "Ä?ščÅ?žýáíéÅ?";while ($string =~ /š/gi) {printf "Found š at %d\n", pos($string)-1;}';

In this case it can be solved with 'use utf8'. But the problem remains in
other functions that use pos(), for example expand from Text::Tabs:
perl -e'chop($ustr="\taa\t..\t\x{100}");for my $s("\t\x{010a}\x{010a}\t..\t","\taa\t..\t",$ustr){ $_=$s;s/\t/print(pos(),$");"\t"/ge; print "\n"}'
Here all three lines of output should show the same numbers.
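
The effect of 'use utf8' in the first example can be sketched as follows. This is only an illustration, not the code from the report: the sample text and the variable names ($chars, $bytes, $char_pos, $byte_pos) are hypothetical, and the Encode module stands in for 'use utf8' so that the source file stays pure ASCII. It shows that pos() counts characters on a decoded string and bytes on the same text held as raw UTF-8 octets, which is what a string literal amounts to without 'use utf8'.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample text: four characters, each of which takes
# two bytes in UTF-8. The last one (U+0161) is the one we search for.
my $chars  = "\x{11b}\x{10d}\x{159}\x{161}";
my $needle = "\x{161}";

# The same text and needle as raw UTF-8 octets (no character semantics).
my $bytes        = encode('UTF-8', $chars);
my $needle_bytes = encode('UTF-8', $needle);

# On the character string, pos() is a character offset.
my $char_pos;
if ($chars =~ /$needle/g) {
    $char_pos = pos($chars) - 1;
}

# On the byte string, pos() is a byte offset.
my $byte_pos;
if ($bytes =~ /$needle_bytes/g) {
    $byte_pos = pos($bytes) - 1;
}

print "character string: found at $char_pos\n";   # prints 3
print "byte string:      found at $byte_pos\n";   # prints 7
```

The two results differ because every character before the match occupies two bytes, so the reported position depends on whether the string was decoded before matching.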

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
This perlbug was built using Perl 5.10.0 in the Fedora build system.
It is being executed now by Perl 5.10.0 - Wed Jul  2 05:13:09 EDT 2008.

Site configuration information for perl 5.10.0:

Configured by Red Hat, Inc. at Wed Jul  2 05:13:09 EDT 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.18-92.1.6.el5, archname=i386-linux-thread-multi
    uname='linux x86-6 2.6.18-92.1.6.el5 #1 smp fri jun 20 02:36:06 edt 
2008 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic 
-fasynchronous-unwind-tables -DPERL_USE_SAFE_PUTENV -Dversion=5.10.0 
-Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red 
Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr 
-Dprivlib=/usr/lib/perl5/5.10.0 
-Dsitelib=/usr/local/lib/perl5/site_perl/5.10.0 
-Dvendorlib=/usr/lib/perl5/vendor_perl/5.10.0 
-Darchlib=/usr/lib/perl5/5.10.0/i386-linux-thread-multi 
-Dsitearch=/usr/local/lib/perl5/site_perl/5.10.0/i386-linux-thread-multi 
-Dvendorarch=/usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi 
-Darchname=i386-linux-thread-multi 
-Dotherlibdirs=/usr/lib/perl5/site_perl/5.10.0 -Dvendorprefix=/usr 
-Dsiteprefix=/usr/local -Duseshrplib -Dusethreads -Duseithreads 
-Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm 
-Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n 
-Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr 
-Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto 
-Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto 
-Ud_setservent_r_proto -Dscriptdir=/usr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING 
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE 
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
-fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 
-mtune=generic -fasynchronous-unwind-tables -DPERL_USE_SAFE_PUTENV',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING 
-fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.3.0 20080428 (Red Hat 4.3.0-8)', 
gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.8.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.8'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E 
-Wl,-rpath,/usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic 
-fasynchronous-unwind-tables -DPERL_USE_SAFE_PUTENV -L/usr/local/lib'

Locally applied patches:
   

---
@INC for perl 5.10.0:
    /usr/lib/perl5/5.10.0/i386-linux-thread-multi
    /usr/lib/perl5/5.10.0
    /usr/local/lib/perl5/site_perl/5.10.0/i386-linux-thread-multi
    /usr/local/lib/perl5/site_perl/5.10.0
    /usr/lib/perl5/vendor_perl/5.10.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.10.0
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/site_perl/5.10.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.10.0
    .

---
Environment for perl 5.10.0:
    HOME=/home/marca
    LANG=en_US.UTF-8
    LANGUAGE=
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/lib/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/home/marca/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash