RuntimeError: CUDA error: invalid device ordinal occurs when we pass a GPU id that is not currently configured in the system. When we use a deep learning module like EmotionRecognition(device='gpu', gpu_id=2), we specify the GPU id, and if no GPU with that id is configured, we get this error. Apart from this, we also get this error if the code by confusion starts counting GPU devices from one onwards, while CUDA actually numbers them from zero onwards.
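To see why an id like 2 can be invalid, here is a minimal sketch (assuming PyTorch is installed; the `requested_id` value is just the hypothetical gpu_id from the example above):

```python
import torch

# GPU ids are zero-based: a machine with N GPUs exposes ids 0 .. N-1.
num_gpus = torch.cuda.device_count()
print(f"Visible GPUs: {num_gpus}")

# 'requested_id' mirrors the hypothetical gpu_id=2 from the example above.
requested_id = 2
if requested_id >= num_gpus:
    # Passing this id to CUDA would raise "invalid device ordinal".
    print(f"gpu_id={requested_id} is out of range for this machine")
```

So on a machine with two GPUs, the valid ids are 0 and 1, and gpu_id=2 triggers the error.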
There are two approaches to solving this error. Let’s explore them one by one.
Mostly, this first approach will fix the issue. If you are not sure which GPU devices are available in the system, go for zero as the GPU id, since id 0 exists on any machine with at least one GPU. Let's understand with an example.
emotion_detector = EmotionRecognition(device='gpu', gpu_id=0)
Make sure CUDA_VISIBLE_DEVICES is not misconfigured. If you have set CUDA_VISIBLE_DEVICES earlier, you can reset it with the command below.
unset CUDA_VISIBLE_DEVICES
Unsetting this variable restores the default zero-based device numbering, so ids that were masked out become visible again.
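The effect of this variable can be illustrated from Python as well. A hedged sketch of a hypothetical scenario (note that CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA initializes):

```python
import os

# Hypothetical scenario: CUDA_VISIBLE_DEVICES="1" masks all GPUs except
# physical GPU 1, which is then renumbered as logical id 0. A call with
# gpu_id=1 would now fail with "invalid device ordinal" even on a
# two-GPU machine. (Must be set before CUDA initializes to take effect.)
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# The Python equivalent of `unset CUDA_VISIBLE_DEVICES`: clearing the
# variable restores the default zero-based numbering of all GPUs.
os.environ.pop("CUDA_VISIBLE_DEVICES", None)
print("CUDA_VISIBLE_DEVICES" in os.environ)  # False
```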
Sometimes, because of a CUDA driver failure or a hardware failure, the interpreter cannot find all connected GPUs and throws the same error. Here is the code to identify whether a GPU is available or not.
import torch

# Print the installed PyTorch version
print(torch.__version__)
# True if at least one CUDA-capable GPU is visible
print(torch.cuda.is_available())
# Number of GPUs PyTorch can see
print(torch.cuda.device_count())
Let’s run and check.
As you can see, this is how we verify the PyTorch version and check the number of available GPUs with CUDA support. If no GPU is available, convert the code to be CPU friendly.
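A common way to make the code CPU friendly is to pick the device at runtime. A minimal sketch, assuming PyTorch:

```python
import torch

# Fall back to CPU when no CUDA-capable GPU is visible, so the same
# script runs unchanged on GPU and CPU-only machines.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Tensors and models are then moved with .to(device).
x = torch.zeros(2, 2).to(device)
print(x.device)
```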
If you are still getting the same error, try upgrading the CUDA Toolkit by downloading it for Windows or Linux. For PyTorch, you can use the pip package manager to upgrade.
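As a sketch, the PyTorch upgrade can be done with pip (the exact wheel depends on your CUDA version, so check the official PyTorch install page for the command matching your setup):

```shell
# Generic upgrade; the right wheel/index depends on your CUDA version
pip install --upgrade torch
```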
Thanks
Data Science Learner Team