ASPN ActiveState Programmer Network
squeak
Re: Finding and indexing 'similar' string
by Jim Menard
Aug 26 2003 4:10PM
Chris,

On Tuesday, August 26, 2003, at 12:01  PM, Chris Muller wrote:

> Jim Menard wrote:
>
> > How about using the Soundex algorithm? A quick Google search found this
> > brief explanation <http://www.frontiernet.net/~rjacob/soundex.htm>
>
> Ohhh!  Thank you Jim!  What a simple, well-explained method for a
> sounds-like index.  This would be a great new index type for
> MagmaCollections.
>
> Do you know whether it works for other keywords?  Or just surnames?  I
> would think it would, since some people's surnames are regular words
> anyway.

It works for any words because it is based on how they sound. I have 
read about one problem with the algorithm, though: you need different 
sets of characters and weightings for different languages. For example, 
I think you would want "j" and "h" to map to the same sound in Mexican 
Spanish. (Forgive me if that's a bad example. The only Spanish I've 
ever learned was "May I have another beer, please?" and "Where is the 
bathroom?")
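
To make that concrete, here is a rough Python sketch of the usual American
Soundex rules with the letter groups passed in as a parameter, so a
different language could supply its own table. The names soundex,
build_table, ENGLISH_GROUPS and SPANISH_LIKE_GROUPS are mine, and the
"Spanish-like" table is only an illustration of the "j"/"h" idea above, not
a validated phonetic scheme for Spanish.

```python
# American Soundex letter groups (English-oriented).
ENGLISH_GROUPS = {
    "BFPV": "1",
    "CGJKQSXZ": "2",
    "DT": "3",
    "L": "4",
    "MN": "5",
    "R": "6",
}

# Hypothetical variant for illustration only: H is coded together with J.
SPANISH_LIKE_GROUPS = {
    "BFPV": "1",
    "CGHJKQSXZ": "2",
    "DT": "3",
    "L": "4",
    "MN": "5",
    "R": "6",
}


def build_table(groups):
    """Flatten {letters: digit} into a per-letter lookup table."""
    return {letter: digit
            for letters, digit in groups.items()
            for letter in letters}


def soundex(word, groups=ENGLISH_GROUPS, length=4):
    """Return the first letter followed by digits, zero-padded to `length`."""
    table = build_table(groups)
    letters = [ch for ch in word.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    previous = table.get(letters[0], "")
    for ch in letters[1:]:
        digit = table.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if digit and digit != previous:
            code += digit
        # Uncoded H and W do not separate equal codes; vowels do.
        if digit or ch not in "HW":
            previous = digit
        if len(code) == length:
            break
    return code.ljust(length, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Smith", "Smyth"):
        print(name, soundex(name))
    # Same words, different tables: the groups decide what "sounds alike".
    for name in ("Mejia", "Mehia"):
        print(name, soundex(name), soundex(name, SPANISH_LIKE_GROUPS))
```

With the English table, "Mejia" and "Mehia" get different codes because "h"
is ignored; with the hypothetical table they collapse to the same code.
Note that the first letter is always kept as-is, so this trick only helps
for letters after the first.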

Jim
-- 
Jim Menard, jimm@[...].com, http://www.io.com/~jimm/
"333: Eric the Half A Beast" -- Tim Allen in rec.humor.oracle.d