@nviets
Contributor

nviets commented Apr 6, 2023

Description of changes

Updated the default cudaPackages from 11.7 to 11.8 to address #222778. PyTorch doesn't support CUDA 12 yet, so this is an interim change. A nixpkgs-review run with cudaSupport enabled is forthcoming.
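
For readers who want to try the newer toolkit before a default change like this lands, a local overlay can approximate it. The following is a minimal sketch, not the diff from this PR; it assumes that `cudaPackages_11_8` (the attribute name mentioned later in this thread) exists in the nixpkgs revision in use.

```nix
# overlay.nix: hypothetical local overlay, not part of this PR.
# Points the default CUDA package set at 11.8 for everything
# built from this nixpkgs instance.
final: prev: {
  cudaPackages = final.cudaPackages_11_8;
}
```

It could be loaded with `import <nixpkgs> { overlays = [ (import ./overlay.nix) ]; }`. Packages affected by the overlay are likely to miss any binary cache and be rebuilt locally.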

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin

@samuela
Member

samuela commented Apr 6, 2023

Looks good... as long as nixpkgs-review is happy, we're good to go!

@nviets
Contributor Author

nviets commented Apr 6, 2023

I didn't think through the scope of nixpkgs-review for bumping CUDA. 11.8 built fine, but now it's downloading 80G worth of stuff to my machine. This may not happen tonight, but I guess we'll see.

@samuela
Member

samuela commented Apr 6, 2023

Yeah, running nixpkgs-review is the real hurdle with these version bumps

@nviets
Contributor Author

nviets commented Apr 6, 2023

Does it have a check-point mechanism? If I stop and restart tomorrow, will it pick up where it left off?

@samuela
Member

samuela commented Apr 6, 2023

Not really, but builds and downloads that finish will stay cached in your nix store (as long as you don't garbage collect them!).

@nviets
Contributor Author

nviets commented Apr 6, 2023

Oh good, thanks for confirming. I'll have to wrap it up tomorrow probably.

@SomeoneSerge
Contributor

Result of nixpkgs-review pr 224927 --extra-nixpkgs-config '{ cudaCapabilities = [ "8.6" ]; }' run on x86_64-linux

55 packages failed to build:
  • colmapWithCuda
  • cudaPackages.cuda-samples
  • cudaPackages.cudatoolkit
  • cudaPackages.cudatoolkit.doc
  • cudaPackages.cudatoolkit.lib
  • cudaPackages.cutensor
  • cudaPackages.cutensor.dev
  • cudaPackages.nvidia_driver
  • forge
  • gpu-burn
  • gromacsCudaMpi
  • gwe
  • hip-nvidia
  • hip-nvidia.doc
  • katagoWithCuda
  • librealsenseWithCuda
  • librealsenseWithCuda.dev
  • mathematica-cuda
  • nvtop
  • nvtop-nvidia
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.numbaWithCuda
  • python310Packages.numbaWithCuda.dist
  • python310Packages.pycuda
  • python310Packages.pycuda.dist
  • python310Packages.pynvml
  • python310Packages.pynvml.dist
  • python310Packages.pyrealsense2WithCuda
  • python310Packages.pyrealsense2WithCuda.dev
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python310Packages.tiny-cuda-nn
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.pycuda
  • python311Packages.pycuda.dist
  • python311Packages.pynvml
  • python311Packages.pynvml.dist
  • python311Packages.pyrealsense2WithCuda
  • python311Packages.pyrealsense2WithCuda.dev
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
  • truecrack-cuda
  • xgboostWithCuda
  • xpraWithNvenc
  • xpraWithNvenc.dist
44 packages built:
  • cudaPackages.cuda_cccl
  • cudaPackages.cuda_cudart
  • cudaPackages.cuda_cuobjdump
  • cudaPackages.cuda_cupti
  • cudaPackages.cuda_cuxxfilt
  • cudaPackages.cuda_demo_suite
  • cudaPackages.cuda_documentation
  • cudaPackages.cuda_gdb
  • cudaPackages.cuda_memcheck
  • cudaPackages.cuda_nsight
  • cudaPackages.cuda_nvcc
  • cudaPackages.cuda_nvdisasm
  • cudaPackages.cuda_nvml_dev
  • cudaPackages.cuda_nvprof
  • cudaPackages.cuda_nvprune
  • cudaPackages.cuda_nvrtc
  • cudaPackages.cuda_nvtx
  • cudaPackages.cuda_nvvp
  • cudaPackages.cuda_profiler_api
  • cudaPackages.cuda_sanitizer_api
  • cudaPackages.cudnn
  • cudaPackages.cudnn_8_6_0
  • cudaPackages.cudnn_8_7_0
  • cudaPackages.fabricmanager
  • cudaPackages.libcublas
  • cudaPackages.libcufft
  • cudaPackages.libcufile
  • cudaPackages.libcurand
  • cudaPackages.libcusolver
  • cudaPackages.libcusparse
  • cudaPackages.libnpp
  • cudaPackages.libnvidia_nscq
  • cudaPackages.libnvjpeg
  • cudaPackages.nccl
  • cudaPackages.nccl.dev
  • cudaPackages.nsight_compute
  • cudaPackages.nsight_systems
  • cudaPackages.nvidia_fs
  • faissWithCuda
  • faissWithCuda.demos
  • magma (magma-cuda, magma_2_7_1)
  • magma_2_6_2
  • nvidia-thrust-cuda
  • tiny-cuda-nn
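
The `--extra-nixpkgs-config` argument in the command above passes a nixpkgs `config` attribute set. A rough equivalent for building a single package by hand might look like the sketch below. `cudaCapabilities` is taken from the command and `cudaSupport` from the PR description; `allowUnfree` is an assumption added here because the CUDA toolkit is unfree, and `gpu-burn` is just one attribute picked from the list above.

```nix
# cuda-build.nix: sketch only. Build with `nix-build cuda-build.nix`.
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;           # assumption: needed for unfree CUDA packages
      cudaSupport = true;           # enable CUDA in packages that support it
      cudaCapabilities = [ "8.6" ]; # same value as in the review command
    };
  };
in
pkgs.gpu-burn
```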

@SomeoneSerge
Contributor

Failed derivations

@ofborg ofborg bot added the 8.has: clean-up, 8.has: package (new), 10.rebuild-darwin: 11-100, and 10.rebuild-linux: 101-500 labels Apr 6, 2023
@nviets
Contributor Author

nviets commented Apr 6, 2023

Thanks @SomeoneSerge. The failure list looks pretty bad. In the case of xgboost (which I see in the fail list), I had intended to use the cudaPackages argument to pin the package to a CUDA version that worked, and I already know the next version mostly works with 11.8.
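
A per-package pin of the kind described here could be sketched as an overlay. This assumes the xgboost derivation accepts a `cudaPackages` argument, as the comment implies, and that `cudaPackages_11_7` is the set known to work; neither is verified here.

```nix
# Hypothetical overlay: keep xgboostWithCuda on CUDA 11.7
# while the default package set moves to 11.8.
final: prev: {
  xgboostWithCuda = prev.xgboostWithCuda.override {
    cudaPackages = final.cudaPackages_11_7;
  };
}
```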

What do you suggest we do now?

@SomeoneSerge
Contributor

@nviets hmmm, I don't see most of the build logs, got to retrieve them somehow now xD

@SomeoneSerge
Contributor

Oh!

cudatoolkit-11.8.0.drv (derivation hash: w6qvxzm9x48c1mnliivvrd59w6390p5f)

It's #224986 as well 🤦🏻

@nviets
Contributor Author

nviets commented Apr 6, 2023

Will #224986 directly fix cudatoolkit? If it works, do you want to bump to 11_8 or go straight to 12?

@SomeoneSerge
Contributor

@nviets that should fix the patchelf issues in cudaPackages_11_8 and cudaPackages_12, so you should be able to finish this PR: you re-base on master, and we re-run nixpkgs-review

I'd say we keep waiting for pytorch and such before we do cudaPackages_12

@nviets
Contributor Author

nviets commented Apr 6, 2023

Given PyTorch's sensitivity to the CUDA version, would it make sense to make it configurable there?

@SomeoneSerge
Contributor

> Given PyTorch's sensitivity to the CUDA version, would it make sense to make it configurable there?

It is, one can always torch.override { cudaPackages = cudaPackages_11_7; }
We just want to have a working and up-to-date default, which is also what we build binary cache for
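
Spelled out, the override above could be used in a development shell along these lines. This is a sketch under assumptions: `python3`, `mkShell`, and the `allowUnfree` setting are choices made for the example and do not come from this thread.

```nix
# shell.nix: sketch of a shell with torch built against CUDA 11.7
# instead of the default CUDA package set.
{ pkgs ? import <nixpkgs> {
    config = {
      allowUnfree = true; # assumption: CUDA packages are unfree
      cudaSupport = true;
    };
  }
}:
pkgs.mkShell {
  packages = [
    (pkgs.python3.withPackages (ps: [
      # torch takes cudaPackages as an argument, per the comment above
      (ps.torch.override { cudaPackages = pkgs.cudaPackages_11_7; })
    ]))
  ];
}
```

Because this differs from the default, it will probably not be served from a binary cache, so torch would build locally.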

@SomeoneSerge
Contributor

Result of nixpkgs-review pr 224927 --extra-nixpkgs-config '{ cudaCapabilities = [ "8.6" ]; }' run on x86_64-linux

13 packages failed to build:
  • cudaPackages.cuda-samples
  • cudaPackages.nvidia_driver
  • mathematica-cuda
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python310Packages.tiny-cuda-nn
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
  • truecrack-cuda
86 packages built:
  • colmapWithCuda
  • cudaPackages.cuda_cccl
  • cudaPackages.cuda_cudart
  • cudaPackages.cuda_cuobjdump
  • cudaPackages.cuda_cupti
  • cudaPackages.cuda_cuxxfilt
  • cudaPackages.cuda_demo_suite
  • cudaPackages.cuda_documentation
  • cudaPackages.cuda_gdb
  • cudaPackages.cuda_memcheck
  • cudaPackages.cuda_nsight
  • cudaPackages.cuda_nvcc
  • cudaPackages.cuda_nvdisasm
  • cudaPackages.cuda_nvml_dev
  • cudaPackages.cuda_nvprof
  • cudaPackages.cuda_nvprune
  • cudaPackages.cuda_nvrtc
  • cudaPackages.cuda_nvtx
  • cudaPackages.cuda_nvvp
  • cudaPackages.cuda_profiler_api
  • cudaPackages.cuda_sanitizer_api
  • cudaPackages.cudatoolkit
  • cudaPackages.cudatoolkit.doc
  • cudaPackages.cudatoolkit.lib
  • cudaPackages.cudnn
  • cudaPackages.cudnn_8_6_0
  • cudaPackages.cudnn_8_7_0
  • cudaPackages.cutensor
  • cudaPackages.cutensor.dev
  • cudaPackages.fabricmanager
  • cudaPackages.libcublas
  • cudaPackages.libcufft
  • cudaPackages.libcufile
  • cudaPackages.libcurand
  • cudaPackages.libcusolver
  • cudaPackages.libcusparse
  • cudaPackages.libnpp
  • cudaPackages.libnvidia_nscq
  • cudaPackages.libnvjpeg
  • cudaPackages.nccl
  • cudaPackages.nccl.dev
  • cudaPackages.nsight_compute
  • cudaPackages.nsight_systems
  • cudaPackages.nvidia_fs
  • faissWithCuda
  • faissWithCuda.demos
  • forge
  • gpu-burn
  • gromacsCudaMpi
  • gwe
  • hip-nvidia
  • hip-nvidia.doc
  • katagoWithCuda
  • librealsenseWithCuda
  • librealsenseWithCuda.dev
  • magma (magma-cuda, magma_2_7_1)
  • magma_2_6_2
  • nvidia-thrust-cuda
  • nvtop
  • nvtop-nvidia
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.numbaWithCuda
  • python310Packages.numbaWithCuda.dist
  • python310Packages.pycuda
  • python310Packages.pycuda.dist
  • python310Packages.pynvml
  • python310Packages.pynvml.dist
  • python310Packages.pyrealsense2WithCuda
  • python310Packages.pyrealsense2WithCuda.dev
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.pycuda
  • python311Packages.pycuda.dist
  • python311Packages.pynvml
  • python311Packages.pynvml.dist
  • python311Packages.pyrealsense2WithCuda
  • python311Packages.pyrealsense2WithCuda.dev
  • tiny-cuda-nn
  • xgboostWithCuda
  • xpraWithNvenc
  • xpraWithNvenc.dist

@nviets
Contributor Author

nviets commented Apr 8, 2023

Looking much better! How can I see the logs from your build? Sorry I couldn't help more after the initial commit, but my computer doesn't have the horsepower to build all of the downstream stuff.

@SomeoneSerge
Contributor

They were supposed to be automatically published, but smth went wrong and I'm omw to get lost in a Finnish "forest"/national park so it's hard for me to check up on it xD

Maybe somebody could rebuild individual packages, cc @samuela @nixos/cuda-maintainers

@SomeoneSerge
Contributor

Fwiw jaxlib had been failing last time I checked, idk if anyone fixed it yet

@SomeoneSerge
Contributor

The other failures aren't new either

@nviets
Contributor Author

nviets commented Apr 9, 2023

Sounds like an awesome trip!

Thanks for running another nixpkgs-review. If this PR isn't breaking anything new, I think it's ready for review. Happy to receive more feedback.

@nviets nviets marked this pull request as ready for review April 9, 2023 01:04
@SomeoneSerge SomeoneSerge requested a review from samuela April 9, 2023 19:45
@samuela
Member

samuela commented Apr 10, 2023

tried running nixpkgs-review, ran into Mic92/nixpkgs-review#328

@piegamesde
Member

It looks like you accidentally mass-pinged a bunch of people, who are now subscribed
and getting notifications for everything in this pull request. Unfortunately, they
cannot be automatically unsubscribed from the issue (removing review request does not
unsubscribe), therefore development cannot continue in this pull request anymore.

Please open a new pull request with your changes, link back to this one and ping the
people actually involved in here over there.

In order to avoid this in the future, there are instructions for how to properly
rebase between branches in our contribution guidelines.
Setting your pull request to draft prior to rebasing is strongly recommended.
In draft status, you can preview the list of people that are about to be requested
for review, which allows you to sidestep this issue.
This is not a bulletproof method though, as OfBorg still does review requests even on draft PRs.

@NixOS NixOS locked and limited conversation to collaborators Jun 19, 2023
