Let’s see how to get the unique values in a pandas column. The pandas library has two built-in functions, unique() and drop_duplicates(), that provide this feature. This article gives a step-by-step overview.
Pandas unique values in a column:
To demonstrate this, the first thing to do is create a pandas dataframe.
Step 1:
First, create a dummy dataframe for the demonstration.
import pandas as pd
# Dataframe object creation using python dict
sample_dict = {
'Country':['India', 'America', 'Africa', 'China', 'UK'],
'Industry':['Technology', 'Farming', 'Technology', 'Technology', 'Farming'],
'Avg':['199', '192', '199', '182', '108'],
'Month':['JAN', 'FEB', 'MAR', 'JAN', 'JAN'],
'Result':['P', 'P', 'P', 'P', 'P'] }
dataframe = pd.DataFrame(sample_dict)
print(dataframe)
Here we can see that most of the columns contain duplicate values. In the next section, we will see how to extract the unique values from them.
Step 2:
Secondly, as already mentioned, we can use either unique() or drop_duplicates() to achieve this.
Method 1: Unique values in pandas using the unique() function
As we can see, the above dataframe has the column “Industry”, which contains some duplicate values. Using the unique() function, we can get its unique values.
dataframe.Industry.unique()
The output of the above code contains the two distinct values, 'Technology' and 'Farming', in order of first appearance.
In case you only need the count of unique values, you may use the nunique() function.
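As a small self-contained sketch of what unique() gives back, the snippet below rebuilds only the Industry column from the sample data. The variable names industry and unique_industries are illustrative and not part of the original example.

```python
import pandas as pd

# Rebuild only the Industry column from the sample dataframe
industry = pd.Series(
    ['Technology', 'Farming', 'Technology', 'Technology', 'Farming'],
    name='Industry',
)

# unique() keeps each distinct value once, in order of first appearance
unique_industries = industry.unique()

# Convert to a plain list so the printed form is easy to read
print(list(unique_industries))  # ['Technology', 'Farming']
```

Note that unique() does not sort the values; if sorted output is needed, sorted(unique_industries) can be applied afterwards.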
dataframe.Industry.nunique(dropna = True)
Once we run the above piece of code, we get the output 2, as there are only two unique values in the column.
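The dropna argument only matters when the column contains missing values, which the sample dataframe does not. The sketch below uses a hypothetical column with one missing entry (not part of the original data) to show the difference between dropna=True, which is the default, and dropna=False.

```python
import pandas as pd

# Hypothetical column with one missing value, for illustration only
month = pd.Series(['JAN', 'FEB', None, 'JAN'])

# dropna=True (the default) ignores the missing value
count_without_nan = month.nunique(dropna=True)

# dropna=False counts the missing value as one more distinct value
count_with_nan = month.nunique(dropna=False)

print(count_without_nan, count_with_nan)  # 2 3
```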
Method 2: Using drop_duplicates()
Let’s see how to use drop_duplicates(). Here is an example showing the syntax of the pandas drop_duplicates() function.
dataframe.Industry.drop_duplicates()
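The output of drop_duplicates() is a Series holding the first occurrence of each value, 'Technology' and 'Farming', together with their original index labels 0 and 1.
As we have already seen, the Industry column has only two unique values. Hence we get only two values after using the drop_duplicates() function.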
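Unlike unique(), drop_duplicates() returns a Series that keeps the index labels of the rows it retains. The following sketch, which again rebuilds only the Industry column under illustrative variable names, shows which rows are kept by default and with keep='last', and how reset_index(drop=True) can renumber the result.

```python
import pandas as pd

# Rebuild only the Industry column from the sample dataframe
industry = pd.Series(
    ['Technology', 'Farming', 'Technology', 'Technology', 'Farming'],
    name='Industry',
)

# Default keep='first': the first occurrence of each value is retained
first_seen = industry.drop_duplicates()

# keep='last': the last occurrence of each value is retained instead
last_seen = industry.drop_duplicates(keep='last')

# Discard the original index labels and renumber from 0
renumbered = first_seen.reset_index(drop=True)

print(list(first_seen.index))  # [0, 1]
print(list(last_seen.index))   # [3, 4]
print(list(renumbered.index))  # [0, 1]
```

Keeping the index can be useful when the unique values need to be traced back to the rows they came from.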
We can also convert them into a list by typecasting the object.
uniquevalue=list(dataframe.Industry.drop_duplicates())
type(uniquevalue)
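As an alternative to wrapping the result in list(), a Series also offers a tolist() method. This is a minimal sketch on the same Industry values, with the name unique_list chosen only for illustration.

```python
import pandas as pd

# Rebuild only the Industry column from the sample dataframe
industry = pd.Series(
    ['Technology', 'Farming', 'Technology', 'Technology', 'Farming'],
    name='Industry',
)

# Series.tolist() converts the de-duplicated Series to a Python list
unique_list = industry.drop_duplicates().tolist()

print(unique_list)        # ['Technology', 'Farming']
print(type(unique_list))  # <class 'list'>
```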
Conclusion –
The built-in pandas functions drop_duplicates() and unique() are helpful in the above scenario. I hope you like this simple explanation with coding examples. If you would like to see similar topics on pandas or NumPy, please comment below.
Thanks
Data Science Learner Team