spaCy Lemmatization Implementation in Python: 4 Steps Only

spaCy is a free, open-source library for advanced Natural Language Processing (NLP) in Python. It is designed for production use and helps you build applications that process and understand large volumes of text. In this tutorial, I will show you how to implement spaCy lemmatization in Python, step by step.

What is spaCy Lemmatization?

Lemmatization is a common text pre-processing task in NLP that reduces a word to its base or dictionary form, called the lemma. For example, cars and car’s are both lemmatized to car. In the same way, are, is, and am are lemmatized to be.
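To make the idea concrete, here is a toy sketch that maps inflected forms to a lemma with a small hand-written lookup table. The names TOY_LEMMAS and toy_lemmatize are made up for this illustration. spaCy's real lemmatizer relies on its language pipeline and rules, not on a tiny dictionary like this one.

```python
# Toy illustration of lemmatization: NOT how spaCy works internally.
# TOY_LEMMAS and toy_lemmatize are hypothetical names for this sketch.
TOY_LEMMAS = {
    "cars": "car",
    "car's": "car",
    "are": "be",
    "is": "be",
    "am": "be",
}


def toy_lemmatize(word: str) -> str:
    """Return the lemma from the toy table, or the lowercased word if unknown."""
    key = word.lower()
    return TOY_LEMMAS.get(key, key)


if __name__ == "__main__":
    for word in ["cars", "car's", "are", "is", "am", "science"]:
        print(word, "->", toy_lemmatize(word))
```

Words that are missing from the table are returned unchanged (in lowercase), which is why a real library is needed for anything beyond a demonstration.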

Steps to Implement Lemmatization

In this section, you will learn all the steps required to implement spaCy lemmatization. Make sure spaCy is installed on your system before you begin (for example, with pip install spacy). Follow the steps in order, since each one builds on the previous one.

Step 1: Import required package

The first step is to import the necessary libraries. This example only needs spaCy, so import it with the import statement.

import spacy

Step 2: Load your language model

spaCy supports lemmatization for many languages; you can find the available models in the spaCy documentation. In my example, I am using the small English model, so let’s load it with the spacy.load() method. Make sure you have downloaded the model to your system first.

Download the English model

python -m spacy download en_core_web_sm

Load the English model

nlp = spacy.load("en_core_web_sm")
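If the model has not been downloaded, spacy.load() raises an OSError. Below is a minimal sketch of a guarded load. The helper name load_model is my own, not part of spaCy, and the ImportError guard is only there so the snippet also fails gracefully when spaCy itself is missing.

```python
from typing import Any, Optional


def load_model(name: str = "en_core_web_sm") -> Optional[Any]:
    """Return the loaded spaCy pipeline, or None if it is unavailable.

    load_model is a hypothetical helper name used only in this sketch.
    """
    try:
        import spacy  # imported here so a missing install is reported, not fatal
    except ImportError:
        print("spaCy is not installed. Try: pip install spacy")
        return None

    try:
        return spacy.load(name)
    except OSError:
        # spacy.load() raises OSError when the model package cannot be found
        print(f"Model '{name}' not found. Try: python -m spacy download {name}")
        return None


if __name__ == "__main__":
    nlp = load_model()
    if nlp is not None:
        print("Loaded pipeline components:", nlp.pipe_names)
```

Printing nlp.pipe_names is a quick way to confirm which components are in the pipeline you loaded.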

 

Step 3: Make a Sample Document

Before lemmatizing, first create a spaCy document (a Doc object). To do so, call the nlp object you loaded in the previous step on your text. Add the line of code below.

doc = nlp("Welcome to the Data Science Learner! Here you will learn all things about data science, machine learning, artificial intelligence and more.")
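A Doc behaves like a sequence of tokens, so it can help to check how the text was split before you lemmatize it. The sketch below assumes the en_core_web_sm model is installed and uses a shortened sample sentence. The helper name token_texts is hypothetical.

```python
def token_texts(doc):
    """Return the text of every token in a spaCy Doc (or any iterable of tokens)."""
    return [token.text for token in doc]


if __name__ == "__main__":
    try:
        import spacy

        nlp = spacy.load("en_core_web_sm")
    except (ImportError, OSError):
        # Either spaCy or the model is missing; nothing to demonstrate
        print("spaCy or the en_core_web_sm model is not available.")
    else:
        doc = nlp("Here you will learn all things about data science.")
        # len(doc) is the number of tokens, including punctuation
        print(len(doc), "tokens:", token_texts(doc))
```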

Step 4: Implement spaCy lemmatization on the document

The last step is to lemmatize the document you have created. Loop over the tokens in the document with a for loop, append each token’s lemma (token.lemma_) to a list, and then join the list into a single string. Execute the complete code given below.

import spacy

# Load the small English pipeline (download it first; see Step 2)
nlp = spacy.load("en_core_web_sm")
doc = nlp("Welcome to the Data Science Learner! Here you will learn all things about data science, machine learning, artificial intelligence and more.")

# Collect the lemma of every token in the document
lemmas = []
for token in doc:
    lemmas.append(token.lemma_)

# token.lemma_ is already a string, so the lemmas can be joined directly
final_string = " ".join(lemmas)
print(final_string)

Output

(Image: spaCy lemmatization output)
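To see which words the lemmatizer actually changed, you can compare each token’s text with its lemma. This is a minimal sketch rather than part of the original steps. The helper name changed_lemmas is hypothetical, and the sample sentence is my own.

```python
def changed_lemmas(doc):
    """Return (text, lemma) pairs for non-punctuation tokens whose lemma differs."""
    return [
        (token.text, token.lemma_)
        for token in doc
        if not token.is_punct and token.lemma_ != token.text
    ]


if __name__ == "__main__":
    try:
        import spacy

        nlp = spacy.load("en_core_web_sm")
    except (ImportError, OSError):
        print("spaCy or the en_core_web_sm model is not available.")
    else:
        doc = nlp("The cars are parked outside.")
        for text, lemma in changed_lemmas(doc):
            print(text, "->", lemma)
```

Tokens whose lemma matches their text, and punctuation tokens, are skipped so that only the real changes are listed.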

These are the steps for lemmatizing any document with spaCy. Here I have used a short text, but you can apply the same code to large documents. I hope you liked this tutorial. If you have any queries, you can contact us for more help.
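For many or large documents, spaCy provides nlp.pipe(), which processes texts in batches instead of one call at a time. The sketch below is one possible way to use it. The helper name lemmatize_texts and the sample texts are hypothetical, and disabling the parser and ner components is an assumption that they are not needed when you only want lemmas.

```python
def lemmatize_texts(nlp, texts, batch_size=50):
    """Lemmatize many texts with nlp.pipe, which processes them in batches."""
    results = []
    for doc in nlp.pipe(texts, batch_size=batch_size):
        results.append(" ".join(token.lemma_ for token in doc))
    return results


if __name__ == "__main__":
    try:
        import spacy

        # Assumption: parsing and entity recognition are not needed for lemmas,
        # so they are disabled here to save time.
        nlp = spacy.load("en_core_web_sm", disable=["parser", "ner"])
    except (ImportError, OSError):
        print("spaCy or the en_core_web_sm model is not available.")
    else:
        texts = ["The cars are parked outside.", "She was reading two books."]
        for line in lemmatize_texts(nlp, texts):
            print(line)
```

The batch_size value of 50 is arbitrary; a suitable value depends on the length of your texts and the memory you have available.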

You may also want to read about spaCy tokenization.

Source:

Spacy Language Model

Spacy Documentation

 


Meet Sukesh ( Chief Editor ), a passionate and skilled Python programmer with a deep fascination for data science, NumPy, and Pandas. His journey in the world of coding began as a curious explorer and has evolved into a seasoned data enthusiast.
 