RuntimeError: Distributed package doesn't have NCCL built in occurs mainly when the installed PyTorch build is not compiled with NCCL (NVIDIA Collective Communication Library) support. In many cases this happens because we install the CPU-only version of PyTorch instead of the GPU-enabled version. When we then try to run GPU-oriented distributed code on that CPU-based PyTorch library, we get this error. Apart from that, there are a few more possible causes. We will discuss all of the root causes in detail, along with their solutions, in this article. So let's start.
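For reference, a minimal snippet like the one below will raise this error on a CPU-only PyTorch build, because the nccl backend is only available when PyTorch is compiled with NCCL support (the address, port, rank, and world size here are illustrative single-process values, not from any particular setup):

import os
import torch.distributed as dist

# Illustrative single-process rendezvous settings
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# On a CPU-only PyTorch build this raises:
# RuntimeError: Distributed package doesn't have NCCL built in
dist.init_process_group(backend="nccl", rank=0, world_size=1)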
RuntimeError: Distributed package doesn't have NCCL built in ( Steps ) –
Please follow the steps in order to save your time while fixing the error.
Step 1: Install GPU-Compatible PyTorch –
As I mentioned above, many developers and data scientists install the CPU-based PyTorch library and then try to run distributed code on it. Here are the commands you can use to install the GPU version of PyTorch.
For CUDA 10.2 –
Make sure the NVIDIA driver version is greater than or equal to 441.22.
pip3 install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
For CUDA 11.1 –
Make sure the NVIDIA driver version is greater than or equal to 456.38.
pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
After installation, you can verify the PyTorch version using the below command.
pip3 show torch
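You can also confirm from inside Python that the installed build has CUDA and NCCL support. The quick sanity check below is a generic sketch and is not tied to any particular PyTorch release:

import torch
import torch.distributed as dist

print(torch.__version__)            # Installed PyTorch version
print(torch.version.cuda)           # CUDA version the build was compiled with (None for CPU-only builds)
print(torch.cuda.is_available())    # True if a usable GPU and driver are detected
print(dist.is_nccl_available())     # True only if the distributed package was built with NCCL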
Step 2: Reinstall NCCL –
If you installed NCCL earlier but it has become incompatible or stopped working properly, the best solution is to reinstall the NCCL package. Here is the link to download the NCCL package. NCCL greatly accelerates communication between GPUs, and it is required for multi-node, multi-GPU distributed computing.
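Before (and after) reinstalling, you can check which NCCL version your PyTorch build was compiled against. This is a small sketch that only gives a meaningful result on a CUDA-enabled build:

import torch
import torch.distributed as dist

if dist.is_nccl_available():
    # Reports the NCCL version the PyTorch build was compiled with, e.g. (2, 10, 3)
    print(torch.cuda.nccl.version())
else:
    print("This PyTorch build was not compiled with NCCL support.")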
Step 3: Update the environment variables –
Please set the following environment variables:
export NCCL_P2P_DISABLE=1 # Disable P2P communication if necessary
export NCCL_DEBUG=INFO
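If you prefer, the same variables can be set from Python before the process group is initialized, as in the sketch below (the commented init_process_group call stands in for your own initialization code):

import os
import torch.distributed as dist

# Equivalent of the shell exports above; set them before the process
# group is initialized so NCCL picks them up.
os.environ["NCCL_DEBUG"] = "INFO"       # Print NCCL diagnostics to the log
os.environ["NCCL_P2P_DISABLE"] = "1"    # Disable P2P communication if necessary

# dist.init_process_group(backend="nccl", ...)  # initialize afterwards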
Step 4: Check GPU Availability –
We set up all of these parameters so that we can leverage multiple GPUs for processing. But sometimes, because of a hardware failure, OS incompatibility, or some other reason, we end up with one GPU or none at all. In that case, we should fix this first. The easiest way to check GPU availability is with the code below.
import torch
print(torch.cuda.device_count())
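If NCCL still cannot be used on a particular machine, a common workaround (sketched below; it is not part of the steps above) is to fall back to the gloo backend, which also works on CPU-only builds:

import torch
import torch.distributed as dist

# Pick NCCL when the build and hardware support it, otherwise fall back to gloo
backend = "nccl" if torch.cuda.is_available() and dist.is_nccl_available() else "gloo"
print(f"Selected distributed backend: {backend}")
# dist.init_process_group(backend=backend, rank=0, world_size=1)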
Thanks
Data Science Learner Team