NumPy is a great Python module for mathematical computation. It lets you perform numerical calculations quickly and easily. In this article, I will show you how to compute correlation in NumPy using the numpy.corrcoef() method.
What is Correlation?
Before moving to the coding demonstration, let's first understand what correlation is.
Correlation measures how one variable changes in relation to another variable.
A correlation has two components: its magnitude and its sign. The larger the magnitude, the stronger the linear relationship between the two variables. Its value ranges from -1 to +1.
If the result is less than 0, the variables are negatively correlated: one tends to decrease as the other increases. If it is greater than 0, they are positively correlated. A value close to 0 indicates little or no linear relationship.
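To make the magnitude and sign concrete, here is a minimal sketch that computes the Pearson correlation coefficient directly from its definition and compares it with numpy.corrcoef(). The helper name pearson_r and the small hours/score data set are made up for illustration; they are not part of NumPy or of the examples in this article.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient computed from its definition (illustrative helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Center both arrays around their means
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    # Sum of co-deviations divided by the product of the two spreads
    return (a_centered * b_centered).sum() / np.sqrt(
        (a_centered ** 2).sum() * (b_centered ** 2).sum()
    )

# Made-up illustrative data
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r_manual = pearson_r(hours, score)
r_numpy = np.corrcoef(hours, score)[0, 1]
print(r_manual, r_numpy)
```

Both values should agree up to floating-point rounding. The sign comes from the numerator (whether the two variables deviate from their means in the same direction), while dividing by the spreads keeps the result between -1 and +1.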
Step by Step Guide to NumPy Correlation
Here is the coding part for finding the correlation between two variables. Follow the steps in order for a better understanding.
Step 1: Import all the necessary libraries.
Only one library is needed for the entire coding demonstration: NumPy.
import numpy as np
Step 2: Create two arrays or vectors.
The next step is to create two arrays, x and y, so that you can find the correlation between them. Both arrays hold 500 random integers created with the randint() method, whose upper bound is exclusive.
np.random.seed(5)
x = np.random.randint(0,100,500)
y = x + np.random.randint(0,50,500)
First, the seed value 5 is passed so that the random numbers are reproducible from run to run. Then the two arrays x and y are created. Because y is built from x plus random noise, y depends on x, and the two arrays should be positively correlated.
Step 3: Calculate the Numpy Correlation.
You will get the correlation matrix using the numpy.corrcoef() method. Below is the full code with the output.
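import numpy as np

def main():
    # Fix the seed so the random arrays are reproducible
    np.random.seed(5)
    x = np.random.randint(0, 100, 500)
    # y depends on x, plus random noise
    y = x + np.random.randint(0, 50, 500)
    # 2x2 correlation matrix of x and y
    corr = np.corrcoef(x, y)
    print(corr)

if __name__ == '__main__':
    main()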
Output
The correlation between the two arrays is 0.88. It shows that these two variables are highly positively correlated.
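Since numpy.corrcoef() returns a matrix rather than a single number, it may help to see how the one value of interest can be read from it. The sketch below reuses the arrays from this step; the variable name r is just an illustrative choice.

```python
import numpy as np

np.random.seed(5)
x = np.random.randint(0, 100, 500)
y = x + np.random.randint(0, 50, 500)

corr = np.corrcoef(x, y)

# For two 1-D inputs the result is a 2x2 symmetric matrix:
#   corr[0, 0] and corr[1, 1] are each array correlated with itself (1.0)
#   corr[0, 1] and corr[1, 0] are the correlation between x and y
r = corr[0, 1]

print(corr.shape)
print(round(r, 2))
```

Indexing with [0, 1] (or equivalently [1, 0]) gives the scalar correlation between the two arrays, which is usually the number you want to report.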
Negative Numpy correlation between two vectors or arrays
The above example calculated a positive correlation. Now let's create two vectors that are negatively correlated.
np.random.seed(5)
x = np.random.randint(0, 100, 500)
y = 100 - x + np.random.randint(0, 50, 500)
After that, you can find the correlation between them using the same numpy.corrcoef() method. Below is the full code with the output.
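import numpy as np

def main():
    # Fix the seed so the random arrays are reproducible
    np.random.seed(5)
    x = np.random.randint(0, 100, 500)
    # y decreases as x increases, plus random noise
    y = 100 - x + np.random.randint(0, 50, 500)
    # 2x2 correlation matrix of x and y
    corr = np.corrcoef(x, y)
    print(corr)

if __name__ == '__main__':
    main()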
Output
The correlation between the two arrays is -0.89. It shows that these two variables are highly negatively correlated.
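It is the minus sign in front of x, not the constant 100, that makes the correlation negative. As a rough illustration, the sketch below negates one variable and separately adds a constant offset to it; the names r_pos, r_neg and r_shift are illustrative only.

```python
import numpy as np

np.random.seed(5)
x = np.random.randint(0, 100, 500)
y = x + np.random.randint(0, 50, 500)

r_pos = np.corrcoef(x, y)[0, 1]          # original, positively correlated pair
r_neg = np.corrcoef(x, -y)[0, 1]         # negating y flips the sign only
r_shift = np.corrcoef(x, y + 100)[0, 1]  # adding a constant changes nothing

print(r_pos, r_neg, r_shift)
```

Negating a variable keeps the magnitude and reverses the sign, while shifting it by a constant leaves the correlation unchanged, because the coefficient depends only on deviations from the mean.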
Weak Numpy correlation between two vectors or arrays
Using the same numpy.corrcoef() method, you can also detect a weak correlation between two arrays. Weak correlations are found when the variables are independent of each other. Let's create two independent variables and apply the same correlation method.
np.random.seed(5)
x = np.random.randint(0, 100, 500)
y = np.random.randint(0, 50, 500)
When you calculate the NumPy correlation for these arrays, you will find a value close to 0. This indicates that there is no correlation, or only a weak one, between them.
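For independent data, "close to 0" rarely means exactly 0, because a finite sample always shows some chance correlation. The sketch below is an illustration under stated assumptions: it uses the newer np.random.default_rng() generator instead of np.random.seed(), and the sample sizes are arbitrary choices rather than values from this article.

```python
import numpy as np

# Assumption: the Generator API is used here instead of np.random.seed()
rng = np.random.default_rng(5)

results = {}
for n in (50, 500, 50_000):
    x = rng.integers(0, 100, n)
    y = rng.integers(0, 50, n)  # drawn independently of x
    results[n] = np.corrcoef(x, y)[0, 1]
    print(n, results[n])
```

The sample correlation of independent arrays tends to sit closer to 0 as the sample size grows, so a small nonzero value on a few hundred points is not, by itself, evidence of a real relationship.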
Below is the full code with the output.
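import numpy as np

def main():
    # Fix the seed so the random arrays are reproducible
    np.random.seed(5)
    x = np.random.randint(0, 100, 500)
    # y is generated independently of x
    y = np.random.randint(0, 50, 500)
    # 2x2 correlation matrix of x and y
    corr = np.corrcoef(x, y)
    print(corr)

if __name__ == '__main__':
    main()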
Output
That's all for now. I hope you now understand how to find the NumPy correlation between arrays. If you have any questions, you can contact us for more information.
Source:
Official Correlation Documentation