editdistance

Enum class

class symspellpy.editdistance.DistanceAlgorithm(value)[source]

Supported edit distance algorithms.

DAMERAU_OSA = 1

Damerau optimal string alignment algorithm.

DAMERAU_OSA_FAST = 3

Fast Damerau optimal string alignment algorithm.

LEVENSHTEIN = 0

Levenshtein algorithm.

LEVENSHTEIN_FAST = 2

Fast Levenshtein algorithm.

USER_PROVIDED = 4

User-provided custom edit distance algorithm.
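The member values listed above can be mirrored in a small standalone enum, which is handy for seeing how lookup by value and by name behaves. This is a sketch only: DistanceAlgorithmMirror is a hypothetical local stand-in, not the library class, and it assumes the library type behaves like a standard Python Enum.

```python
from enum import Enum


class DistanceAlgorithmMirror(Enum):
    """Hypothetical local copy of the documented member values."""

    LEVENSHTEIN = 0
    DAMERAU_OSA = 1
    LEVENSHTEIN_FAST = 2
    DAMERAU_OSA_FAST = 3
    USER_PROVIDED = 4


# Look a member up by its integer value or by its name.
by_value = DistanceAlgorithmMirror(3)
by_name = DistanceAlgorithmMirror["LEVENSHTEIN"]
```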

EditDistance class

class symspellpy.editdistance.EditDistance(algorithm, comparer=None)[source]

Edit distance algorithms.

Parameters

algorithm (DistanceAlgorithm) – The distance algorithm to use.

_algorithm

The edit distance algorithm to use.

Type

DistanceAlgorithm

_distance_comparer

An object to compute the relative distance between two strings. The concrete object will be chosen based on the value of _algorithm.

Type

AbstractDistanceComparer

Raises

ValueError – If algorithm specifies an invalid distance algorithm.

compare(string_1, string_2, max_distance)[source]

Compares a string to the base string to determine the edit distance, using the previously selected algorithm.

Parameters
  • string_1 (str) – Base string.

  • string_2 (str) – The string to compare.

  • max_distance (int) – The maximum distance allowed.

Return type

int

Returns

The edit distance, or -1 if max_distance is exceeded.
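The select-then-delegate pattern described for EditDistance can be sketched in a few lines. Everything here is illustrative and assumed: Algo, LengthGapComparer, and EditDistanceSketch are hypothetical names, the toy comparer is not a real edit distance, and only the user-provided case is wired up. The point is the shape: the comparer is chosen once in the constructor, an unsupported algorithm raises ValueError, and compare() forwards to the comparer's distance().

```python
from enum import Enum


class Algo(Enum):
    """Hypothetical stand-in for DistanceAlgorithm (two members only)."""

    LEVENSHTEIN = 0
    USER_PROVIDED = 4


class LengthGapComparer:
    """Toy comparer: treats the length difference as the distance."""

    def distance(self, string_1, string_2, max_distance):
        gap = abs(len(string_1) - len(string_2))
        return gap if gap <= max_distance else -1


class EditDistanceSketch:
    """Picks a comparer once, then delegates every compare() call to it."""

    def __init__(self, algorithm, comparer=None):
        if algorithm is Algo.USER_PROVIDED and comparer is not None:
            self._distance_comparer = comparer
        else:
            # This sketch wires up only the user-provided case.
            raise ValueError(f"unsupported algorithm: {algorithm!r}")
        self._algorithm = algorithm

    def compare(self, string_1, string_2, max_distance):
        return self._distance_comparer.distance(string_1, string_2, max_distance)


checker = EditDistanceSketch(Algo.USER_PROVIDED, LengthGapComparer())
within = checker.compare("cat", "cats", 2)
too_far = checker.compare("cat", "caterpillar", 2)
```

Callers should treat -1 as "further apart than max_distance" rather than as a numeric distance.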

Distance comparer classes

class symspellpy.editdistance.DamerauOsa[source]

Provides optimized methods for computing Damerau-Levenshtein Optimal String Alignment (OSA) comparisons between two strings.

_base_char_1_costs

Type

List[int]

_base_prev_char_1_costs

Type

List[int]

distance(string_1, string_2, max_distance)[source]

Computes the Damerau-Levenshtein optimal string alignment edit distance between two strings.

Parameters
  • string_1 (str) – One of the strings to compare.

  • string_2 (str) – The other string to compare.

  • max_distance (int) – The maximum distance that is of interest.

Return type

int

Returns

-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
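To make the return contract concrete, here is a plain, unoptimized sketch of optimal string alignment distance with a max_distance cutoff. It is not the library's implementation: osa_distance is a hypothetical name, and the code uses a full matrix rather than the cached cost rows the class keeps. OSA counts an adjacent transposition as one edit but never edits the same substring twice.

```python
def osa_distance(string_1, string_2, max_distance):
    """Textbook OSA distance; returns -1 when it exceeds max_distance."""
    len_1, len_2 = len(string_1), len(string_2)
    # The length difference is a lower bound on the distance.
    if abs(len_1 - len_2) > max_distance:
        return -1
    # d[i][j] holds the distance between string_1[:i] and string_2[:j].
    d = [[0] * (len_2 + 1) for _ in range(len_1 + 1)]
    for i in range(len_1 + 1):
        d[i][0] = i
    for j in range(len_2 + 1):
        d[0][j] = j
    for i in range(1, len_1 + 1):
        for j in range(1, len_2 + 1):
            cost = 0 if string_1[i - 1] == string_2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if (
                i > 1
                and j > 1
                and string_1[i - 1] == string_2[j - 2]
                and string_1[i - 2] == string_2[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    result = d[len_1][len_2]
    return result if result <= max_distance else -1
```

For example, osa_distance("ab", "ba", 1) is 1 because the swap counts as a single edit, while osa_distance("abc", "xyz", 2) is -1 because three substitutions exceed the limit.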

class symspellpy.editdistance.Levenshtein[source]

Provides the Levenshtein algorithm for computing the edit distance between two strings.

_base_char_1_costs

Type

List[int]

distance(string_1, string_2, max_distance)[source]

Computes the Levenshtein edit distance between two strings.

Parameters
  • string_1 (str) – One of the strings to compare.

  • string_2 (str) – The other string to compare.

  • max_distance (int) – The maximum distance that is of interest.

Return type

int

Returns

-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.
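The same contract for plain Levenshtein distance can be sketched with a two-row dynamic program. This is an illustrative version under stated assumptions, not the library's code: levenshtein_distance is a hypothetical name, and the early exit relies on the fact that the minimum of each row never decreases from one row to the next.

```python
def levenshtein_distance(string_1, string_2, max_distance):
    """Two-row Levenshtein distance; returns -1 when it exceeds max_distance."""
    if string_1 == string_2:
        return 0
    # The length difference is a lower bound on the distance.
    if abs(len(string_1) - len(string_2)) > max_distance:
        return -1
    # previous[j] is the distance between the prefix of string_1
    # processed so far and string_2[:j].
    previous = list(range(len(string_2) + 1))
    for i, char_1 in enumerate(string_1, start=1):
        current = [i]
        for j, char_2 in enumerate(string_2, start=1):
            cost = 0 if char_1 == char_2 else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        # Once every cell in a row exceeds the limit, the result must too.
        if min(current) > max_distance:
            return -1
        previous = current
    return previous[-1] if previous[-1] <= max_distance else -1
```

With the classic pair, levenshtein_distance("kitten", "sitting", 3) is 3, and lowering the limit to 2 yields -1.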

class symspellpy.editdistance.DamerauOsaFast[source]

Provides an interface for computing the edit distance between two strings using the fast Damerau-Levenshtein Optimal String Alignment (OSA) algorithm.

distance(string_1, string_2, max_distance)[source]

Computes the Damerau-Levenshtein optimal string alignment edit distance between two strings.

Parameters
  • string_1 (str) – One of the strings to compare.

  • string_2 (str) – The other string to compare.

  • max_distance (int) – The maximum distance that is of interest.

Return type

int

Returns

-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.

class symspellpy.editdistance.LevenshteinFast[source]

Provides an interface for computing the edit distance between two strings using the fast Levenshtein algorithm.

distance(string_1, string_2, max_distance)[source]

Computes the Levenshtein edit distance between two strings.

Parameters
  • string_1 (str) – One of the strings to compare.

  • string_2 (str) – The other string to compare.

  • max_distance (int) – The maximum distance that is of interest.

Return type

int

Returns

-1 if the distance is greater than max_distance, 0 if the strings are equivalent, otherwise a positive number whose magnitude increases as the difference between the strings increases.