feat(gpu_prover): cap device allocation via PROVER_GPU_MEMORY_FRACTION #312

Open
ericker-cyfrin wants to merge 2 commits into matter-labs:main from ericker-cyfrin:feat/gpu-memory-fraction

Conversation

@ericker-cyfrin

ProverContext::new() allocates all free GPU memory, which prevents co-locating another GPU process (e.g. a SNARK prover) on the same device. Add an optional PROVER_GPU_MEMORY_FRACTION env var (0 < f <= 1), read in ProverContextConfig::default(); when set, device allocation is capped at min(free, total * fraction). Unset preserves the previous "allocate all free memory" behavior, so this is a no-op by default.
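The cap described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper names `parse_fraction`, `fraction_from_env`, and `capped_allocation` are hypothetical, and treating an invalid value the same as an unset variable is an assumption, since the description does not say how out-of-range input is handled.

```rust
use std::env;

/// Hypothetical helper: parse a memory fraction, accepting only 0 < f <= 1.
fn parse_fraction(raw: &str) -> Option<f64> {
    let f: f64 = raw.trim().parse().ok()?;
    // NaN fails both comparisons, so it is rejected along with out-of-range values.
    if f > 0.0 && f <= 1.0 {
        Some(f)
    } else {
        None
    }
}

/// Hypothetical helper: read the fraction from the environment.
/// Assumption: an unset or invalid value yields None (no cap).
fn fraction_from_env() -> Option<f64> {
    env::var("PROVER_GPU_MEMORY_FRACTION")
        .ok()
        .and_then(|v| parse_fraction(&v))
}

/// Bytes to allocate, given free and total device memory in bytes.
/// With no fraction, all free memory is used (the previous behavior).
fn capped_allocation(free: usize, total: usize, fraction: Option<f64>) -> usize {
    match fraction {
        // Cap at min(free, total * fraction); the product is truncated to whole bytes.
        Some(f) => free.min((total as f64 * f) as usize),
        None => free,
    }
}
```

For example, on a device with 24 GiB total and 20 GiB free, a fraction of 0.5 caps the allocation at 12 GiB. If only 8 GiB were free, the allocation would be 8 GiB, because the cap never exceeds free memory.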

What ❔

It introduces an environment variable, PROVER_GPU_MEMORY_FRACTION, that takes a decimal value f with 0 < f <= 1 and limits how much GPU memory the FRI prover allocates at startup.

Why ❔

There are times when you want to run both the FRI and SNARK provers on the same GPU. Capping the memory the FRI prover allocates at startup can leave enough memory for the SNARK prover to function.

Is this a breaking change?

  • Yes
  • No

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted.
