
Distributed package doesn't have NCCL built in #50

@alescire94

Got the following error when executing:
torchrun --nproc_per_node 1 example.py --ckpt_dir models/7B --tokenizer_path models/tokenizer.model

[Image: screenshot of the error]

Additional info:
CUDA: 11.4
GPU: NVIDIA GeForce 3090
torch: 1.12.1
OS: Ubuntu 20.04.2 LTS
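
For anyone trying to reproduce this, a quick check of what the installed torch build supports may help narrow things down. The following is only a minimal sketch: `probe_torch` and `pick_backend` are hypothetical helper names, and the fallback to `"gloo"` is an assumption (it is PyTorch's CPU-capable backend; whether `example.py` can be switched to it is not confirmed here).

```python
import importlib.util


def pick_backend(cuda_available: bool, nccl_available: bool) -> str:
    """Return "nccl" only when both CUDA and NCCL are usable, else "gloo" (assumed fallback)."""
    return "nccl" if (cuda_available and nccl_available) else "gloo"


def probe_torch() -> dict:
    """Report what the installed torch build supports (empty dict if torch is not installed)."""
    if importlib.util.find_spec("torch") is None:
        return {}
    import torch
    import torch.distributed as dist

    dist_ok = dist.is_available()
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "distributed_available": dist_ok,
        # is_nccl_available() is only exposed when the distributed package is built in
        "nccl_available": dist_ok and dist.is_nccl_available(),
    }


if __name__ == "__main__":
    info = probe_torch()
    print(info)
    if info:
        print("suggested backend:", pick_backend(info["cuda_available"], info["nccl_available"]))
```

If `nccl_available` comes back `False` on a machine with a working GPU, the torch build itself is the likely place to look rather than the driver.
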

Does anyone know how to solve it?
Thanks in advance!

Labels: compatibility (issues arising from specific hardware or system configs), documentation (improvements or additions to documentation)
