optimizer CPU offload doesn't work outside of CUDA #958

Open
@bghira

The CPUOptimizerOffload class is very clever, but it relies heavily on CUDA streams, which aren't available without a CUDA device.

When CUDA isn't available, it should use torch.cpu.Stream and torch.cpu.current_stream instead.

Additionally, pin_memory should be True only if torch.cuda.is_available() and False otherwise, as MPS is a unified-memory architecture.
