hipdf.core.column.string.StringMethods.edit_distance_matrix
Applies to Linux
- StringMethods.edit_distance_matrix() → SeriesOrIndex
Computes the pairwise edit distance between all strings in the series.
The series must contain more than 2 strings and must not contain nulls.
Edit distance is measured based on the Levenshtein edit distance algorithm.
Returns
- Series of ListDtype(int64)
Assume N is the length of this series. The returned series contains N lists of size N, where the j-th number in the i-th row gives the edit distance between the i-th string and the j-th string of this series. The matrix is symmetric. Diagonal elements are 0.
Examples
>>> import cudf
>>> s = cudf.Series(['abc', 'bc', 'cba'])
>>> s.str.edit_distance_matrix()
0    [0, 1, 2]
1    [1, 0, 2]
2    [2, 2, 0]
dtype: list
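To make the layout of the result concrete, the following is a minimal pure-Python sketch of the same computation on the CPU. It is not the library's implementation: the helper names `levenshtein` and `edit_distance_matrix_reference` are hypothetical, and the sketch assumes that distances are counted per character with unit cost for insertion, deletion, and substitution. It can be handy for cross-checking small inputs.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping two rows.

    Hypothetical helper for illustration; assumes unit cost per edit.
    """
    # prev[j] holds the distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def edit_distance_matrix_reference(strings):
    """Return an N x N list of lists of pairwise distances (hypothetical helper)."""
    return [[levenshtein(a, b) for b in strings] for a in strings]


matrix = edit_distance_matrix_reference(['abc', 'bc', 'cba'])
print(matrix)  # [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
```

Row i of the nested list corresponds to the list stored at index i of the returned series, so the symmetry and the zero diagonal described under Returns are visible directly.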