Running RCCL using Docker
To use Docker to run RCCL, Docker must already be installed on the system. To build the Docker image and run the container, follow these steps.
- Build the Docker image.
  By default, the Dockerfile uses docker.io/rocm/dev-ubuntu-22.04:latest as the base Docker image. It then installs RCCL and rccl-tests (in both cases, it uses the version from the RCCL develop branch).
  Use this command to build the Docker image:
    docker build -t rccl-tests -f Dockerfile.ubuntu --pull .
  The base Docker image, rccl repository, and rccl-tests repository can be changed by passing --build-arg options to the docker build command above. For example, to use a different base Docker image, use this command:
    docker build -t rccl-tests -f Dockerfile.ubuntu --build-arg="ROCM_IMAGE_NAME=rocm/dev-ubuntu-20.04" --build-arg="ROCM_IMAGE_TAG=6.2" --pull .
- Launch an interactive Docker container on a system with AMD GPUs:
    docker run -it --rm --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --network=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined rccl-tests /bin/bash
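The two steps above can be wrapped in small shell helpers. The following is a minimal sketch, not part of RCCL: the function names rccl_build_cmd and rccl_check_devices are hypothetical, and the defaults assume the Dockerfile's own defaults match the base image named above. The first helper only prints the docker build command so it can be reviewed before running; the second checks that the device nodes passed to docker run exist on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; these names are not part of RCCL.

# Print the docker build command for a given base image name and tag.
# Defaults mirror the base image named in the text (an assumption about
# the Dockerfile's defaults).
rccl_build_cmd() {
  local name="${1:-docker.io/rocm/dev-ubuntu-22.04}"
  local tag="${2:-latest}"
  printf 'docker build -t rccl-tests -f Dockerfile.ubuntu --build-arg="ROCM_IMAGE_NAME=%s" --build-arg="ROCM_IMAGE_TAG=%s" --pull .\n' "$name" "$tag"
}

# Return 0 only if every given device node exists; report missing ones
# on stderr.
rccl_check_devices() {
  local missing=0 dev
  for dev in "$@"; do
    if [ ! -e "$dev" ]; then
      echo "missing device node: $dev" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `rccl_build_cmd rocm/dev-ubuntu-20.04 6.2` prints the second build command shown above, and `rccl_check_devices /dev/kfd /dev/dri` can be run on the host before the docker run command to catch a missing GPU device node early.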
For example, to run the all_reduce_perf test from rccl-tests on 8 AMD GPUs from inside the Docker container, use this command:
mpirun --allow-run-as-root -np 8 --mca pml ucx --mca btl ^openib -x NCCL_DEBUG=VERSION /workspace/rccl-tests/build/all_reduce_perf -b 1 -e 16G -f 2 -g 1
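In this command, -np 8 starts eight MPI processes and -g 1 assigns one GPU to each, while -b, -e, and -f set the minimum message size, the maximum message size, and the multiplication factor between sizes. The sketch below lists the sizes such a sweep would cover. It assumes that the 16G suffix means 16 * 1024^3 bytes and that the size is multiplied by the factor until it exceeds the maximum; size_sweep is a hypothetical helper, not an rccl-tests command, and the test itself may round sizes to a multiple of the element size, so its reported sizes can differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print message sizes in bytes from min to max,
# multiplying by factor at each step (factor must be greater than 1).
size_sweep() {
  local size="$1" max="$2" factor="$3"
  while [ "$size" -le "$max" ]; do
    echo "$size"
    size=$((size * factor))
  done
}

# Equivalent of -b 1 -e 16G -f 2, taking 16G as 16 * 1024^3 bytes.
# Prints 35 sizes, from 1 up to 17179869184.
size_sweep 1 $((16 * 1024 * 1024 * 1024)) 2
```

Raising -b or lowering -e shortens the sweep, which is useful for a quick sanity check before a full run.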
For more information on the rccl-tests options, see the Usage guidelines in the rccl-tests GitHub repository.