public class Soundex
extends Object
implements StringEncoder
java.lang.Object
   ↳ org.apache.commons.codec.language.Soundex

Class Overview

Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but it can also be used as a general-purpose scheme to find words with similar phonemes.

Summary

Constants
String US_ENGLISH_MAPPING_STRING This is a default mapping of the 26 letters used in US English.
Fields
public static final Soundex US_ENGLISH An instance of Soundex using the US_ENGLISH_MAPPING mapping.
public static final char[] US_ENGLISH_MAPPING This is a default mapping of the 26 letters used in US English.
Public Constructors
Soundex()
Creates an instance using US_ENGLISH_MAPPING.
Soundex(char[] mapping)
Creates a soundex instance using the given mapping.
Soundex(String mapping)
Creates a soundex instance using a custom mapping.
Public Methods
int difference(String s1, String s2)
Encodes the Strings and returns the number of characters in the two encoded Strings that are the same.
Object encode(Object pObject)
Encodes an Object using the soundex algorithm.
String encode(String pString)
Encodes a String using the soundex algorithm.
int getMaxLength()
This method is deprecated. This feature is not needed since the encoding size must be constant. Will be removed in 2.0.
void setMaxLength(int maxLength)
This method is deprecated. This feature is not needed since the encoding size must be constant. Will be removed in 2.0.
String soundex(String str)
Retrieves the Soundex code for a given String object.
Inherited Methods
From class java.lang.Object
From interface org.apache.commons.codec.Encoder
From interface org.apache.commons.codec.StringEncoder

Constants

public static final String US_ENGLISH_MAPPING_STRING

This is a default mapping of the 26 letters used in US English. A value of 0 for a letter position means do not encode.

(This constant is provided as both an implementation convenience and to allow Javadoc to pick up the value for the constant values page.)

Constant Value: "01230120022455012623010202"
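
To illustrate how the mapping string is read, the following minimal sketch looks up the code for a letter by its position in the alphabet (A is index 0, Z is index 25). The class name `MappingLookup` and the method `codeFor` are hypothetical and are not part of the library; only the constant value is taken from this page.

```java
public class MappingLookup {
    // Copied from the constant value documented above.
    public static final String US_ENGLISH_MAPPING_STRING = "01230120022455012623010202";

    // Hypothetical helper: returns the mapped digit for a letter A-Z (case-insensitive).
    // A result of '0' means the letter is not encoded.
    public static char codeFor(char letter) {
        int index = Character.toUpperCase(letter) - 'A';
        if (index < 0 || index >= US_ENGLISH_MAPPING_STRING.length()) {
            throw new IllegalArgumentException("The character is not mapped: " + letter);
        }
        return US_ENGLISH_MAPPING_STRING.charAt(index);
    }

    public static void main(String[] args) {
        System.out.println(codeFor('B')); // index 1 in the mapping string
        System.out.println(codeFor('R')); // index 17 in the mapping string
        System.out.println(codeFor('A')); // vowels map to '0' (not encoded)
    }
}
```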

Fields

public static final Soundex US_ENGLISH

An instance of Soundex using the US_ENGLISH_MAPPING mapping.

public static final char[] US_ENGLISH_MAPPING

This is a default mapping of the 26 letters used in US English. A value of 0 for a letter position means do not encode.

Public Constructors

public Soundex ()

Creates an instance using US_ENGLISH_MAPPING.

public Soundex (char[] mapping)

Creates a soundex instance using the given mapping. This constructor can be used to provide an internationalized mapping for a non-Western character set. Every letter of the alphabet is "mapped" to a numerical value, and this char array holds the value to which each letter is mapped. This implementation contains a default map for US_ENGLISH.

Parameters
mapping Mapping array to use when finding the corresponding code for a given character

public Soundex (String mapping)

Creates a soundex instance using a custom mapping. This constructor can be used to customize the mapping, and/or possibly provide an internationalized mapping for a non-Western character set.

Parameters
mapping Mapping string to use when finding the corresponding code for a given character

Public Methods

public int difference (String s1, String s2)

Encodes the Strings and returns the number of characters in the two encoded Strings that are the same. This return value ranges from 0 through 4: 0 indicates little or no similarity, and 4 indicates strong similarity or identical values.

Parameters
s1 A String that will be encoded and compared.
s2 A String that will be encoded and compared.
Returns
  • The number of characters that are the same in the two encoded Strings, from 0 to 4.
Throws
EncoderException if an error occurs encoding one of the strings
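
The counting step described above can be sketched as a comparison of two already-encoded codes. This is an illustration only: the class `DifferenceSketch` and method `countMatches` are hypothetical, and comparing the codes position by position is an assumption about how "the same" characters are counted, not a statement of the library's implementation.

```java
public class DifferenceSketch {
    // Hypothetical helper: counts the positions at which two already-encoded
    // Soundex codes hold the same character. Assumes a position-wise comparison.
    public static int countMatches(String code1, String code2) {
        if (code1 == null || code2 == null) {
            return 0;
        }
        int length = Math.min(code1.length(), code2.length());
        int matches = 0;
        for (int i = 0; i < length; i++) {
            if (code1.charAt(i) == code2.charAt(i)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("R163", "R163")); // identical codes: 4
        System.out.println(countMatches("S530", "S560")); // differ in one position: 3
        System.out.println(countMatches("A123", "B456")); // no position matches: 0
    }
}
```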

public Object encode (Object pObject)

Encodes an Object using the soundex algorithm. This method is provided in order to satisfy the requirements of the Encoder interface, and will throw an EncoderException if the supplied object is not of type java.lang.String.

Parameters
pObject Object to encode
Returns
  • An object (of type java.lang.String) containing the soundex code which corresponds to the String supplied.
Throws
EncoderException if the parameter supplied is not of type java.lang.String
IllegalArgumentException if a character is not mapped

public String encode (String pString)

Encodes a String using the soundex algorithm.

Parameters
pString A String object to encode
Returns
  • A Soundex code corresponding to the String supplied
Throws
IllegalArgumentException if a character is not mapped
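
As a rough illustration of what a Soundex encoding involves, the sketch below implements a simplified version of the algorithm using the default US English mapping: keep the first letter, map the remaining letters to digits, skip letters mapped to 0, collapse adjacent repeats of the same digit, and pad the result to four characters. The class `SoundexSketch` is hypothetical and is not the library's code; in particular, it treats H and W like vowels, which is an assumption and may differ from the library for some inputs.

```java
public class SoundexSketch {
    // Default US English mapping, as documented for US_ENGLISH_MAPPING_STRING.
    private static final String MAPPING = "01230120022455012623010202";

    // Maps an upper-case letter A-Z to its digit; rejects anything else.
    private static char map(char ch) {
        int index = ch - 'A';
        if (index < 0 || index >= MAPPING.length()) {
            throw new IllegalArgumentException("The character is not mapped: " + ch);
        }
        return MAPPING.charAt(index);
    }

    // Simplified encoding: assumes a fixed code length of four characters.
    public static String encode(String input) {
        if (input == null) {
            return null;
        }
        // Keep letters only and normalize to upper case.
        StringBuilder letters = new StringBuilder();
        for (char ch : input.toCharArray()) {
            if (Character.isLetter(ch)) {
                letters.append(Character.toUpperCase(ch));
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        char[] out = {'0', '0', '0', '0'};
        out[0] = letters.charAt(0);
        char last = map(letters.charAt(0));
        int count = 1;
        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char code = map(letters.charAt(i));
            // Emit a digit only if the letter is encoded and differs from the previous code.
            if (code != '0' && code != last) {
                out[count++] = code;
            }
            last = code;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode("Robert"));  // R163
        System.out.println(encode("Rupert"));  // R163
        System.out.println(encode("Tymczak")); // T522
    }
}
```

With the library on the classpath, the corresponding call would be `new Soundex().encode("Robert")`, using the constructor and method documented on this page.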

public int getMaxLength ()

This method is deprecated.
This feature is not needed since the encoding size must be constant. Will be removed in 2.0.

Returns the maxLength. Standard Soundex codes are four characters long.

Returns
  • int

public void setMaxLength (int maxLength)

This method is deprecated.
This feature is not needed since the encoding size must be constant. Will be removed in 2.0.

Sets the maxLength.

Parameters
maxLength The maxLength to set

public String soundex (String str)

Retrieves the Soundex code for a given String object.

Parameters
str String to encode using the Soundex algorithm
Returns
  • A soundex code for the String supplied
Throws
IllegalArgumentException if a character is not mapped