squeak
Re: Finding and indexing 'similar' string
by Julian Fitzell
Aug 26 2003 4:54PM
Jim Menard wrote:
>  Chris,
>  
>  On Tuesday, August 26, 2003, at 12:01  PM, Chris Muller wrote:
>  
> > Jim Menard wrote:
> >
> >> How about using the Soundex algorithm? A quick Google search found this
> >> brief explanation <http://www.frontiernet.net/~rjacob/soundex.htm>
> >
> >
> > Ohhh!  Thank you Jim!  What a simple, well-explained method for a 
> > sounds-like
> > index.  This would be a great new index type for MagmaCollections..
> >
> > Do you know whether it works for other keywords?  Or just surnames?  I
> > would think it would, since some people's surnames are regular words
> > anyway..
>  
>  
>  It works for any words because it is based on how they sound. I have 
>  read about one problem with the algorithm, though: you need different 
>  sets of characters and weightings for different languages. For example, 
>  I think you would want "j" and "h" to map to the same sound in Mexican 
>  Spanish. (Forgive me if that's a bad example. The only Spanish I've ever 
>  learned was "May I have another beer, please?" and "Where is the 
>  bathroom?")
>  
>  Jim

The other problem with it, as I recall, is that the first letter 
needs to be the same.  So a name/word that starts with 'ph' won't ever 
match a word that starts with 'f', for example, even if they sound the 
same.  Other than that, though, it works great: we used it for a sales 
system and it allowed users to stop asking people to spell their names 
over the phone.  I've tried typing in every convoluted spelling of my 
name I can think of and it always finds me :)

Julian
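
To make the first-letter limitation concrete, here is a minimal sketch of the common American Soundex variant in Python. It is not taken from the page linked above or from the sales system mentioned in this message; the function name `soundex`, the `CODES` table, and the sample spellings are assumptions chosen for illustration.

```python
# Letter groups for the common American Soundex variant.
# Vowels and Y get no code; H and W are skipped entirely.
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return a 4-character Soundex key, or "" if word has no A-Z letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    result = first                 # the first letter is kept verbatim
    prev = CODES.get(first, "")    # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue               # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                # a vowel resets prev, so repeats count again
    return (result + "000")[:4]    # pad with zeros, truncate to 4 characters


if __name__ == "__main__":
    # Variant spellings with the same first letter share a key ...
    print(soundex("Fitzell"), soundex("Fitzel"), soundex("Fitsell"))
    # ... but a 'ph' spelling never matches an 'f' spelling.
    print(soundex("Philip"), soundex("Filip"))
```

Because the first letter is copied into the key unchanged, "Philip" and "Filip" differ only in that first character. An index that needs to match them would have to normalize leading digraphs such as "ph" before encoding, which this sketch does not attempt.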