perl-ai
AI::Categorizer and Umlauts?
by Robert Barta
Jun 4 2007 7:25PM
Hi,

I seem to have problems with umlauts, as in words like

   Präsentation

When a document is added with

   return AI::Categorizer::Document->new(name    =>  $filename,
                                         content =>  $content);

to the collection, after loading and finish, the feature vector
contains only fragments of these words, such as

    pr         =>  1
    sentation  =>  1
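
A minimal sketch of how such fragments can arise, assuming the tokenizer lowercases the text and keeps runs of \w characters. The tokenize helper below is a hypothetical stand-in, not the actual AI::Categorizer code. Without locale or Unicode rules in effect, \w matches only ASCII word characters in a byte string, so the umlaut bytes act as separators:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's tokenizer: lowercase the text and
# keep runs of \w characters. The real AI::Categorizer code may differ.
sub tokenize {
    my ($text) = @_;
    return [ lc($text) =~ /(\w+)/g ];
}

# "Präsentation" as raw bytes, the way it arrives from an undecoded file read.
my $latin1 = "Pr\xE4sentation";       # ISO-8859-1: one byte for the umlaut
my $utf8   = "Pr\xC3\xA4sentation";   # UTF-8: two bytes for the umlaut

for my $bytes ($latin1, $utf8) {
    my %features;
    $features{$_}++ for @{ tokenize($bytes) };
    # Both encodings yield the same fragments: pr => 1, sentation => 1
    print join(', ', map { "$_ => $features{$_}" } sort keys %features), "\n";
}
```

The result is the same for both encodings because none of the bytes that make up the umlaut counts as a word character under the default rules.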

Setting the locale in the shell, or in Perl with

    use locale;

has no effect, not even when de_AT is turned on explicitly.

--

Aaaaaah, lib/AI/Categorizer/Document.pm is NOT using locale and use locale
is very, uhm, local %-)

Patching the file to add use locale does not seem to break the test cases.
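
To illustrate why a use locale in the calling script does not reach the module: the pragma is lexically scoped, so it only affects code compiled inside the block or file where it appears. The sketch below is an illustration, not the actual patch. The function names are hypothetical, and the locale name de_AT.ISO-8859-1 is an assumption that varies between systems:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Compiled outside any "use locale" scope, like code in an unpatched module.
sub words_plain {
    my ($text) = @_;
    return [ $text =~ /(\w+)/g ];
}

{
    use locale;   # affects only code compiled inside this block

    # Compiled under "use locale", like code in a patched module.
    sub words_locale {
        my ($text) = @_;
        return [ $text =~ /(\w+)/g ];
    }
}

# Assumption: a Latin-1 de_AT locale is installed under this name.
my $have_locale = setlocale(LC_CTYPE, 'de_AT.ISO-8859-1');

my $word = "Pr\xE4sentation";   # Latin-1 bytes
print "plain:  @{ words_plain($word) }\n";    # split at the umlaut in any locale
print "locale: @{ words_locale($word) }\n";   # depends on the active locale
print "(de_AT locale not available; the second line may be split too)\n"
    unless $have_locale;
```

words_plain splits the word regardless of the process locale, which matches the behaviour described above. Only the function compiled under use locale can pick up the locale's notion of a word character.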

\rho
