The more NCCL communicators, the more GPU memory? #864

@ForFishes

Please tell me about the relationship between GPU memory usage and the number of NCCL communicators.

  1. Does creating more NCCL communicators use more GPU memory? If so, is the relationship linear?

We observed that creating one NCCL communicator and running an allreduce uses about 1.6 GB of GPU memory, while creating multiple NCCL communicators (>= 3) uses about 6 GB. If many NCCL connections are established, could this cause an out-of-memory (OOM) error?

  2. Is there any relationship between the memory usage of NCCL and the number of GPUs?

Does the amount of memory used differ when hundreds of GPUs establish NCCL connections compared with when dozens of GPUs do?
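
For the first question, one way to check whether the growth is linear is to sample the used GPU memory after each communicator is created and compare the per-communicator increments. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH. The helper names (`parse_used_mib`, `query_used_mib`, `increments`) are hypothetical and not part of NCCL, and the sample numbers in the comments are made up for illustration, not measurements.

```python
import subprocess


def parse_used_mib(csv_text):
    """Parse `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`
    output into a list of used-memory values in MiB, one per GPU."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def query_used_mib():
    """Return the currently used memory (MiB) of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


def increments(samples):
    """Given (num_communicators, used_mib) samples for one GPU, return the
    memory growth per additional communicator between consecutive samples.
    Roughly equal values suggest linear growth."""
    ordered = sorted(samples)
    return [
        (used1 - used0) / (n1 - n0)
        for (n0, used0), (n1, used1) in zip(ordered, ordered[1:])
    ]


# Hypothetical usage: call query_used_mib()[gpu_index] after creating each
# communicator in the application, collect the samples, then compare them, e.g.
#   increments([(0, 400), (1, 1600), (3, 4000)])
```

Note that the first sample after NCCL initialization also includes the CUDA context itself, so the increment from zero to one communicator may not be representative of the later ones.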
