editdistance
Enum class
- class symspellpy.editdistance.DistanceAlgorithm(value)
Supported edit distance algorithms.
- DAMERAU_OSA = 1
Damerau optimal string alignment algorithm.
- DAMERAU_OSA_FAST = 3
Fast Damerau optimal string alignment algorithm.
- LEVENSHTEIN = 0
Levenshtein algorithm.
- LEVENSHTEIN_FAST = 2
Fast Levenshtein algorithm.
- USER_PROVIDED = 4
User-provided custom edit distance algorithm.
EditDistance class
- class symspellpy.editdistance.EditDistance(algorithm, comparer=None)
Edit distance algorithms.
- Parameters
algorithm (DistanceAlgorithm) – The distance algorithm to use.
- _algorithm
The edit distance algorithm to use.
- Type
DistanceAlgorithm
- _distance_comparer
An object to compute the relative distance between two strings. The concrete object is chosen based on the value of _algorithm.
- Type
AbstractDistanceComparer
- Raises
ValueError – If algorithm specifies an invalid distance algorithm.
- compare(string_1, string_2, max_distance)
Compares a string to the base string to determine the edit distance, using the previously selected algorithm.
- Parameters
string_1 (str) – Base string.
string_2 (str) – The string to compare.
max_distance (int) – The maximum distance allowed.
- Return type
int
- Returns
The edit distance, or -1 if it exceeds max_distance.
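The dispatch described above, where an enum value selects a comparer object and compare() delegates to it, can be sketched in a few lines. This is a self-contained illustration, not the library's code: Algorithm, PlainLevenshtein, SketchEditDistance and CaseInsensitive are hypothetical stand-ins, and rejecting USER_PROVIDED without a comparer is an assumption of this sketch.

```python
from enum import Enum


class Algorithm(Enum):
    """Hypothetical stand-in mirroring the documented DistanceAlgorithm values."""
    LEVENSHTEIN = 0
    DAMERAU_OSA = 1
    LEVENSHTEIN_FAST = 2
    DAMERAU_OSA_FAST = 3
    USER_PROVIDED = 4


class PlainLevenshtein:
    """Textbook Levenshtein comparer with the documented distance() contract."""

    def distance(self, string_1, string_2, max_distance):
        previous = list(range(len(string_2) + 1))
        for i, char_1 in enumerate(string_1, start=1):
            current = [i]
            for j, char_2 in enumerate(string_2, start=1):
                cost = 0 if char_1 == char_2 else 1
                current.append(min(previous[j] + 1,          # deletion
                                   current[j - 1] + 1,       # insertion
                                   previous[j - 1] + cost))  # substitution
            previous = current
        result = previous[-1]
        return result if result <= max_distance else -1


class SketchEditDistance:
    """Picks a comparer from the enum value, then delegates compare() to it."""

    def __init__(self, algorithm, comparer=None):
        if algorithm is Algorithm.LEVENSHTEIN:
            self._distance_comparer = PlainLevenshtein()
        elif algorithm is Algorithm.USER_PROVIDED and comparer is not None:
            # Assumption: a custom comparer only needs a distance() method.
            self._distance_comparer = comparer
        else:
            # Other built-in algorithms are left out of this sketch.
            raise ValueError("unsupported distance algorithm in this sketch")
        self._algorithm = algorithm

    def compare(self, string_1, string_2, max_distance):
        return self._distance_comparer.distance(string_1, string_2, max_distance)


class CaseInsensitive:
    """Hypothetical user-provided comparer: Levenshtein on lowercased input."""

    def distance(self, string_1, string_2, max_distance):
        return PlainLevenshtein().distance(
            string_1.lower(), string_2.lower(), max_distance)


builtin = SketchEditDistance(Algorithm.LEVENSHTEIN)
custom = SketchEditDistance(Algorithm.USER_PROVIDED, CaseInsensitive())
print(builtin.compare("Hello", "hello", 2))  # 1
print(custom.compare("Hello", "hello", 2))   # 0
```

Keeping the comparer behind a single distance() method is what lets a custom metric be swapped in without touching the calling code.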
Distance comparer classes
- class symspellpy.editdistance.DamerauOsa
Provides optimized methods for computing Damerau-Levenshtein Optimal String Alignment (OSA) comparisons between two strings.
- _base_char_1_costs
- Type
List[int]
- _base_prev_char_1_costs
- Type
List[int]
- distance(string_1, string_2, max_distance)
Computes the Damerau-Levenshtein optimal string alignment edit distance between two strings.
- Parameters
string_1 (str) – One of the strings to compare.
string_2 (str) – The other string to compare.
max_distance (int) – The maximum distance that is of interest.
- Return type
int
- Returns
-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
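For reference, the optimal string alignment distance can be written as a plain dynamic program. The sketch below is an unoptimized textbook version that follows the documented return contract. It is not the DamerauOsa implementation, and the name osa_distance is hypothetical. OSA extends Levenshtein with one extra move: swapping two adjacent characters costs 1, but a swapped pair cannot be edited again afterwards.

```python
def osa_distance(string_1, string_2, max_distance):
    """Textbook OSA distance; returns -1 when the result exceeds max_distance."""
    len_1, len_2 = len(string_1), len(string_2)
    # table[i][j] holds the OSA distance between string_1[:i] and string_2[:j].
    table = [[0] * (len_2 + 1) for _ in range(len_1 + 1)]
    for i in range(len_1 + 1):
        table[i][0] = i
    for j in range(len_2 + 1):
        table[0][j] = j
    for i in range(1, len_1 + 1):
        for j in range(1, len_2 + 1):
            cost = 0 if string_1[i - 1] == string_2[j - 1] else 1
            best = min(table[i - 1][j] + 1,         # deletion
                       table[i][j - 1] + 1,         # insertion
                       table[i - 1][j - 1] + cost)  # substitution or match
            # Adjacent transposition: "ab" -> "ba" counts as a single edit.
            if (i > 1 and j > 1
                    and string_1[i - 1] == string_2[j - 2]
                    and string_1[i - 2] == string_2[j - 1]):
                best = min(best, table[i - 2][j - 2] + 1)
            table[i][j] = best
    result = table[len_1][len_2]
    return result if result <= max_distance else -1


print(osa_distance("ab", "ba", 2))   # 1
print(osa_distance("ca", "abc", 3))  # 3
print(osa_distance("abcd", "", 2))   # -1
```

The second call shows the restriction that separates OSA from the unrestricted Damerau-Levenshtein distance, which would give 2 for the same pair.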
- class symspellpy.editdistance.Levenshtein
Provides the Levenshtein algorithm for computing the edit distance between two strings.
- _base_char_1_costs
- Type
List[int]
- distance(string_1, string_2, max_distance)
Computes the Levenshtein edit distance between two strings.
- Parameters
string_1 (str) – One of the strings to compare.
string_2 (str) – The other string to compare.
max_distance (int) – The maximum distance that is of interest.
- Return type
int
- Returns
-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
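The max_distance argument does more than filter the result: it also lets an implementation stop early. The sketch below shows one way to do that in a two-row Levenshtein computation. It is an illustration under assumptions, not the code of the Levenshtein class, and the name levenshtein_distance is hypothetical.

```python
def levenshtein_distance(string_1, string_2, max_distance):
    """Two-row Levenshtein with early exit; -1 when max_distance is exceeded."""
    # The length difference is a lower bound on the distance.
    if abs(len(string_1) - len(string_2)) > max_distance:
        return -1
    previous = list(range(len(string_2) + 1))
    for i, char_1 in enumerate(string_1, start=1):
        current = [i]
        for j, char_2 in enumerate(string_2, start=1):
            cost = 0 if char_1 == char_2 else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        # Every later cell is at least the minimum of this row, so the final
        # distance cannot drop back below max_distance once the row exceeds it.
        if min(current) > max_distance:
            return -1
        previous = current
    result = previous[-1]
    return result if result <= max_distance else -1


print(levenshtein_distance("kitten", "sitting", 3))  # 3
print(levenshtein_distance("kitten", "sitting", 2))  # -1
print(levenshtein_distance("ab", "ba", 2))           # 2
```

Unlike the OSA variant, plain Levenshtein has no transposition move, so swapping two adjacent characters costs 2.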
- class symspellpy.editdistance.DamerauOsaFast
Provides an interface for computing the edit distance between two strings using the fast Damerau-Levenshtein Optimal String Alignment (OSA) algorithm.
- distance(string_1, string_2, max_distance)
Computes the Damerau-Levenshtein optimal string alignment edit distance between two strings.
- Parameters
string_1 (str) – One of the strings to compare.
string_2 (str) – The other string to compare.
max_distance (int) – The maximum distance that is of interest.
- Return type
int
- Returns
-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
- class symspellpy.editdistance.LevenshteinFast
Provides an interface for computing the edit distance between two strings using the fast Levenshtein algorithm.
- distance(string_1, string_2, max_distance)
Computes the Levenshtein edit distance between two strings.
- Parameters
string_1 (str) – One of the strings to compare.
string_2 (str) – The other string to compare.
max_distance (int) – The maximum distance that is of interest.
- Return type
int
- Returns
-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
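This section does not say how any of the comparers reduce their work. One preparation step that bounded edit-distance routines commonly apply is stripping the characters both strings share at the start and at the end, since doing so leaves the Levenshtein distance unchanged while shrinking the table to fill. The helper below is a hypothetical sketch of that idea, with trim_common_affixes as an invented name, and is not taken from the library.

```python
def trim_common_affixes(string_1, string_2):
    """Return both strings with their shared prefix and suffix removed."""
    start = 0
    limit = min(len(string_1), len(string_2))
    # Skip the shared prefix.
    while start < limit and string_1[start] == string_2[start]:
        start += 1
    end_1, end_2 = len(string_1), len(string_2)
    # Skip the shared suffix without crossing into the prefix already removed.
    while (end_1 > start and end_2 > start
           and string_1[end_1 - 1] == string_2[end_2 - 1]):
        end_1 -= 1
        end_2 -= 1
    return string_1[start:end_1], string_2[start:end_2]


print(trim_common_affixes("misspelled", "mispelled"))  # ('s', '')
print(trim_common_affixes("flaw", "lawn"))             # ('flaw', 'lawn')
```

In the first call the remaining work is a comparison of "s" against the empty string, which makes the distance of 1 immediate.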