NLTK's edit_distance is a function that computes the edit distance (Levenshtein distance) between two strings. It returns the minimum number of operations (insertions, deletions, and substitutions) needed to transform the source string into the target string.
NLTK edit_distance Python Implementation –
Let’s look at the syntax first, then work through some examples in detail.
The call has the form nltk.edit_distance(s1, s2), where s1 is the source string and s2 is the target string. It returns the distance between the two strings as an integer: the minimum number of operations needed to convert the source string into the target string.
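To make "minimum number of operations" concrete, here is a minimal pure-Python sketch of the classic dynamic-programming approach to Levenshtein distance. It is only an illustration, not NLTK's actual source code. The function name levenshtein is made up for this example, and it assumes every insertion, deletion, and substitution costs 1.

```python
def levenshtein(source, target):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn source into target."""
    rows, cols = len(source) + 1, len(target) + 1

    # table[i][j] holds the distance between source[:i] and target[:j]
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete all i characters of the source prefix
    for j in range(cols):
        table[0][j] = j  # insert all j characters of the target prefix

    for i in range(1, rows):
        for j in range(1, cols):
            # Assumption: a substitution costs 1, a match costs 0
            cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # substitution or match
            )
    return table[-1][-1]


print(levenshtein('kitten', 'sitting'))  # 3
print(levenshtein('flaw', 'lawn'))       # 2
```

Each cell of the table only depends on its three neighbours (above, left, and diagonal), which is why the whole distance can be filled in with two nested loops.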
NLTK edit_distance (Example 1):
Let’s define two strings –
Here the source and target strings are almost the same. The only difference is the extra character “s” at the end of the target, so a single insertion is enough and we expect a distance of 1. Now let’s run the code below and check the output.
    import nltk

    # string declaration
    source = 'Data Science Learner'
    target = 'Data Science Learners'

    # distance calculation
    distance = nltk.edit_distance(source, target)
    print(distance)
NLTK edit_distance (Example 2):
Now let’s calculate the edit distance between two strings that differ by more than one character.
    import nltk

    # string declaration
    source = 'Data Science Learner'
    target = 'Data Learner'

    # distance calculation
    distance = nltk.edit_distance(source, target)
    print(distance)
Here we get a distance of 8. The seven letters of “Science” are missing from the target string (“Data Learner”), and so is one of the two spaces around that word. Deleting those eight characters turns the source into the target, and no shorter sequence of edits can do it because the two strings differ in length by 8. Hence edit_distance returns 8 for these two strings.
edit_distance is useful in many NLP use cases, especially for measuring how similar documents are to a query. It also helps in clustering similar documents, and it can be used in text recommenders such as news recommenders. Keep in mind, however, that it does not capture the semantic meaning of words. It only measures how similar two strings are character by character.
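As a rough illustration of the query-matching use case, the sketch below ranks a few candidate titles by their edit distance to a misspelled query. The helper rank_by_edit_distance and the list of titles are made up for this example. The fallback function is an assumption added only so the snippet still runs when NLTK is not installed, and it uses plain Levenshtein distance with unit costs.

```python
try:
    from nltk import edit_distance  # the function covered in this article
except ImportError:
    # Fallback (assumption: unit-cost Levenshtein) for machines without NLTK
    def edit_distance(s1, s2):
        previous = list(range(len(s2) + 1))
        for i, c1 in enumerate(s1, start=1):
            current = [i]
            for j, c2 in enumerate(s2, start=1):
                current.append(min(
                    previous[j] + 1,               # deletion
                    current[j - 1] + 1,            # insertion
                    previous[j - 1] + (c1 != c2),  # substitution or match
                ))
            previous = current
        return previous[-1]


def rank_by_edit_distance(query, candidates):
    """Return (candidate, distance) pairs, closest first (hypothetical helper)."""
    scored = [(c, edit_distance(query.lower(), c.lower())) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up titles, only for illustration
titles = ['Data Science Learner', 'Data Structures', 'Deep Learning']
ranked = rank_by_edit_distance('data sciense learner', titles)
print(ranked[0])  # ('Data Science Learner', 1)
```

Lowercasing both strings before comparing is a design choice here, so that differences in capitalization do not count as edits. The same ranking would place “cat” right next to “car”, which shows why this kind of similarity should not be mistaken for similarity in meaning.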
I hope this article has cleared up your understanding of the edit_distance() function. If you have any doubts about this topic, please leave a comment in the comment box below.
Data Science Learner Team