Re: Finding and indexing 'similar' string
Aug 26 2003 6:00PM
Chris,
There is also an algorithm called 'Metaphone' that was originally
published in Computer Language in 1990. It does a somewhat better job
of matching similar-sounding words (at least in English). The principal
weakness of Soundex is that it always keeps the first letter of the word
as-is, and that letter is often spelled differently in words that sound alike.
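
To make the first-letter weakness concrete, here is a minimal Python sketch of the classic (American) Soundex coding. It is only an illustration under my own assumptions: the function name soundex and the table CODES are made up for this example, and it is not the code from any of the pages linked in this thread.

```python
# Hypothetical helper table: consonant -> Soundex digit.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a 4-character Soundex code, or "" if word has no ASCII letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()          # the first letter is always kept literally
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue                # h and w do not break a run of equal codes
        code = CODES.get(ch, "")    # vowels (and y) have no code
        if code and code != prev:
            result += code
        prev = code                 # a vowel resets the run
    return (result + "000")[:4]     # pad with zeros, keep 4 characters


print(soundex("Robert"), soundex("Rupert"))        # same code
print(soundex("Catherine"), soundex("Katherine"))  # differ only in the first letter
```

Robert and Rupert land on the same code, while Catherine and Katherine do not, purely because the leading letter is copied through unchanged.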
You might also try searches on 'agrep' ("approximate grep"),
'string similarity', 'approximate string matching', or
'approximate pattern matching' for other references.
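
As a taste of what those searches turn up, one common measure of string similarity is edit (Levenshtein) distance: the number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a plain dynamic-programming version in Python. The function name edit_distance is mine, and this is not how agrep itself is implemented; it only illustrates the general idea of approximate matching.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, keeping one table row at a time."""
    prev = list(range(len(b) + 1))      # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                      # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match for free
        prev = curr
    return prev[-1]


print(edit_distance("Muller", "Mueller"))
print(edit_distance("kitten", "sitting"))
```

A small distance relative to the word length suggests a likely match. Unlike a Soundex code, a distance cannot be used directly as an index key, so it would more likely serve to rank candidates than to look them up.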
Here are a few fairly good references:
http://www.bitmechanic.com/mail-archives/mysql/Jan1998/0666.html
http://aspell.net/metaphone/metaphone-kuhn.txt
http://www.dcc.ufmg.br/~ghuiban/paa/tp3/node18.html
-Dean
Chris Muller <afunkyobject@[...].com>
Sent by: squeak-dev-bounces@[...].org
08/26/03 12:01 PM
Please respond to chris; Please respond to The general-purpose Squeak
developers list
To: Squeak List <squeak-dev@[...].org>
cc:
Subject: Re: Finding and indexing 'similar' string
Jim Menard wrote:
> How about using the Soundex algorithm? A quick Google search found this
> brief explanation <http://www.frontiernet.net/~rjacob/soundex.htm>
Ohhh! Thank you Jim! What a simple, well-explained method for a
sounds-like index. This would be a great new index type for
MagmaCollections.
Do you know whether it works for other keywords, or just surnames? I
would think it would, since some people's surnames are regular words
anyway.
- Chris