perl-ai
Re: text categorization with SVM and NaiveBayes
by Ken Williams
Jan 9 2007 4:52AM
On Jan 8, 2007, at 10:51 AM, Tom Fawcett wrote:

>  Just to add a note here: Ken is correct -- both NB and SVMs are  
>  known to be rather poor at providing accurate probabilities.  Their  
>  scores tend to be too extreme.  Producing good probabilities from  
>  these scores is called calibrating the classifier, and it's more  
>  complex than just taking a root of the score.  There are several  
>  methods for calibrating scores.  The good news is that there's an  
>  effective one called isotonic regression (or Pool Adjacent  
>  Violators) which is pretty easy and fast.  The bad news is that  
>  there's no plug-in (ie, CPAN-ready) perl implementation of it (I've  
>  got a simple implementation which I should convert and contribute  
>  someday).
> 
>  If you want to read about classifier calibration, google one of  
>  these titles:
> 
>  "Transforming classifier scores into accurate multiclass  
>  probability estimates"
>  by Bianca Zadrozny and Charles Elkan
> 
>  "Predicting Good Probabilities With Supervised Learning"
>  by A. Niculescu-Mizil and R. Caruana


Cool, thanks for the references.  It might be nice to add somesuch  
scheme to Algorithm::NaiveBayes (and friends), so that the user has a  
choice of several normalization schemes, including "none".  If I get  
a surplus of tuits I'll add it, or if you feel like contributing your  
stuff that would be great too.
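
For anyone curious what the isotonic regression (Pool Adjacent Violators) step mentioned above might look like, here is a minimal sketch. It is in Python rather than Perl, it is not Tom's implementation, and it is not part of Algorithm::NaiveBayes; the names pav_fit and pav_predict are hypothetical. It assumes binary labels (0 or 1) and raw classifier scores where a higher score means "more likely positive". Tied scores get no special handling.

```python
from bisect import bisect_right


def pav_fit(scores, labels):
    """Fit a non-decreasing step function from scores to probabilities.

    Hypothetical helper: pools adjacent blocks whenever an earlier block
    has a higher mean label than the block that follows it.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [label_sum, count, lowest_score_in_block]
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Merge backwards while the monotonicity constraint is violated.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            label_sum, count, _ = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
    thresholds = [b[2] for b in blocks]
    probs = [b[0] / b[1] for b in blocks]
    return thresholds, probs


def pav_predict(model, score):
    """Look up the calibrated probability for a new raw score."""
    thresholds, probs = model
    i = bisect_right(thresholds, score) - 1
    return probs[max(i, 0)]  # scores below the first block use the first block


if __name__ == "__main__":
    raw = [0.1, 0.4, 0.35, 0.8, 0.9, 0.95]   # made-up classifier scores
    truth = [0, 0, 1, 1, 0, 1]                # made-up held-out labels
    model = pav_fit(raw, truth)
    for s in (0.05, 0.5, 0.85, 0.99):
        print(s, pav_predict(model, s))
```

The fit should be done on held-out data, not on the documents the classifier was trained on, or the calibrated values will likely still be too extreme.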

  -Ken