Scipy Stats Pearsonr Implementation in Python


The Pearson correlation coefficient, computed by scipy.stats.pearsonr, measures the linear correlation between two datasets. Its value always lies in the range [-1, 1]. In this article, we will compute the Pearson coefficient in Python.

 

Scipy stats pearsonr Implementation-

We will compute the Pearson coefficient step by step and then interpret the result.

Step 1: Import –

First, we import the stats module from the scipy package. Here is the way to do it.

from scipy import stats
import numpy as np

We also import numpy because it is required in the dataset creation step.

 

Step 2: Dataset creation-

Next, we create two numpy arrays as our datasets. These arrays will be the input parameters in the next step.

array_1 = np.array([0, 0, 0, 1, 1, 1, 1])
array_2 = np.array([1, 1, 1, 0, 0, 0, 0])

We could use larger arrays with more elements, but small arrays are enough to demonstrate the function.

 

Step 3: Calculating pearsonr coefficient –

The stats.pearsonr() function takes two arrays of equal length as arguments. It returns the Pearson correlation coefficient together with a p-value. A coefficient near -1 indicates a strong negative correlation, and a coefficient near 1 indicates a strong positive correlation. A negative coefficient means that as values in one dataset increase, values in the other tend to decrease.

stats.pearsonr(array_1,array_2)
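Since the call returns two values, it can help to unpack them into separate names. The following is a minimal sketch using the same arrays; the variable names r and p_value are our own choice for illustration.

```python
from scipy import stats
import numpy as np

array_1 = np.array([0, 0, 0, 1, 1, 1, 1])
array_2 = np.array([1, 1, 1, 0, 0, 0, 0])

# pearsonr returns two values: the coefficient and a p-value.
# r and p_value are illustrative names, not part of the scipy API.
r, p_value = stats.pearsonr(array_1, array_2)

print("coefficient:", r)
print("p-value:", p_value)
```

A small p-value suggests that a correlation this strong would be unlikely if the two datasets were actually uncorrelated.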

A positive coefficient means that as values in one dataset increase, values in the other tend to increase as well.
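To see the positive case, here is a small sketch with a made-up third array. The name array_3 and its values are assumptions for illustration only; it is built so that it rises whenever array_1 rises.

```python
from scipy import stats
import numpy as np

array_1 = np.array([0, 0, 0, 1, 1, 1, 1])

# Hypothetical dataset: a scaled and shifted copy of array_1,
# so it increases exactly when array_1 increases.
array_3 = 2 * array_1 + 3

r_pos, p_pos = stats.pearsonr(array_1, array_3)
print("coefficient:", r_pos)
```

Because array_3 is an exact increasing linear function of array_1, the coefficient should come out at (or extremely close to) 1.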

Complete code-

Here is the code from all three steps combined.

from scipy import stats
import numpy as np

# Two small example datasets of equal length
array_1 = np.array([0, 0, 0, 1, 1, 1, 1])
array_2 = np.array([1, 1, 1, 0, 0, 0, 0])

# Returns the correlation coefficient and a p-value
print(stats.pearsonr(array_1, array_2))

In the output, the first value is the coefficient. It is -1 (up to floating-point rounding), a perfect negative correlation, because every increase in the first array is matched by a decrease in the second array.
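As a cross-check on this result, the coefficient can also be computed directly from the textbook definition of Pearson's r with plain numpy. This is only a sketch for comparison, not how scipy computes it internally; dev_1, dev_2 and r_manual are illustrative names.

```python
import numpy as np
from scipy import stats

array_1 = np.array([0, 0, 0, 1, 1, 1, 1])
array_2 = np.array([1, 1, 1, 0, 0, 0, 0])

# Deviation of each value from the mean of its own array
dev_1 = array_1 - array_1.mean()
dev_2 = array_2 - array_2.mean()

# Pearson's r: sum of the products of deviations, divided by the
# square root of the product of the summed squared deviations
r_manual = np.sum(dev_1 * dev_2) / np.sqrt(np.sum(dev_1**2) * np.sum(dev_2**2))

r_scipy, _ = stats.pearsonr(array_1, array_2)
print("manual:", r_manual)
print("scipy: ", r_scipy)
```

The two printed values should agree up to floating-point rounding.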

Conclusion-

The scipy stats pearsonr coefficient helps in identifying linear trends between datasets. Keep in mind that a strong correlation does not by itself establish cause and effect between events. I hope this article has cleared your doubts related to scipy stats pearsonr. If you want to contribute more detail on this topic, please comment below.

 

Thanks
Data Science Learner
