SoundEx Key
Algorithm creator(s)
Robert C. Russell, Margaret King
PB author(s)
Don Dickinson
Description
Soundex keys have the property that words pronounced similarly produce the same soundex key, and can thus be used to simplify searches in databases where you know the pronunciation but not the spelling. This soundex function returns a string 4 characters long, starting with a letter.
Note
Please note that there are several variations of Soundex, due to Soundex being "(natural) language-biased".
Source
https://forum.powerbasic.com/forum/user-to-user-discussions/source-code/23715-soundex?t=23090
See also
Source Code
Download source code file soundex.bas (Right-click -> "Save as ...")
' pb_sndx.bas
'
' Soundex routine for PB
'
' By Don Dickinson
' Hereby Public Domain
' Happily submitted to the public domain by the author;
' Use at your own risk.
'
' Based on a news group thread discussing soundex.
' Thanks to Errol Cheverie for sending the thread to me.
'
#if not %def(%PB_SNDX_BAS)
%PB_SNDX_BAS = 1
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
' SoundEx
' Returns the 4 byte soundex string for the passed string.
'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Function SoundEx Alias "SoundEx" _
( ByVal incoming as String ) Export as String
Dim i as Long
Dim sResult as String
Dim sChar as String
'- Clean up the string
'
' 1. Remove all non-alphabetic characters (e.g., commas, spaces).
' 2. Convert all lower-case characters to upper-case characters.
' 3. Move the first letter in the source to the SOUNDEX output buffer.
' 4. Remove the vowels (A, E, I, O, U and Y) and the consonents H and W.
' 5. Make the following substitutions:
' Labials (B,F,P,V) ==> 1
' Gutterals, sibilants (C,G,J,K,Q,S,X,Z) ==> 2
' Dentals (D,T) ==> 3
' Long liquid (L) ==> 4
' Nasals (M,N) ==> 5
' Short liquid (R) ==> 6
' 6. Combine any adjacent identical digits (i.e., eliminate contiguous
' matching digits: for example, 44 becomes just 4).
'
incoming = ucase$(incoming)
sResult = ""
For i = 1 to len(incoming)
'- Stop when we get 4 characters
if len(sResult) >= 4 then exit for
sChar = mid$(incoming, i, 1)
'- Only A-Z are counted
if (sChar <= "Z") and (sChar >= "A") then
if sResult = "" then
sResult = sChar
else
'- No vowels allowed
If instr("AEIOUYHW", sChar) < 1 then
'- What digit corresponds to this letter
if instr("BFPV", sChar) > 0 then
sChar = "1"
elseif instr("CGJKQSXZ", sChar) > 0 then
sChar = "2"
elseif instr("DT", sChar) > 0 then
sChar = "3"
elseif sChar = "L" then
sChar = "4"
elseif instr("MN", sChar) > 0 then
sChar = "5"
elseif sChar = "R" then
sChar = "6"
'- The "else" shouldn't happen
else
sChar = ""
end if
'- If it's the first digit, then just add it.
if len(sResult) < 2 then
sResult = sResult + sChar
'- Make sure not to add a repeating digit.
elseif right$(sResult, 1) <> sChar then
sResult = sResult + sChar
end if
end if
end if
end if
Next i
' 7. Concatenate the first three resulting digits to the SOUNDEX output
' buffer (if there are fewer than 3 resulting digits, pad with 0).
'
Function = Left$(sResult + "0000", 4)
End Function
#endif