Self-contained Navi 3 (gfx110x) and Strix Halo (gfx1151) PyTorch Wheels #655
-
Nice! That link is currently broken - drop the trailing `|`.
-
Hey Scott, what is the best way to get torchvision and torchaudio from your wheel? Also, thanks for this!
-
Thank you for your wonderful work! Can we use Triton and SageAttention with this wheel? I tried it on my side, but it failed…
-
This is for the second(?) post: https://github.com/scottt/rocm-TheRock/releases/tag/v6.5.0rc-pytorch-gfx110x

First of all, thank you so much already. Second, sorry for the wall of text; I just want to drop some info and thoughts in case they're helpful.

I've been using this for 3 days now with an RX 7800 XT on Windows 11, AMD Adrenalin driver 25.4.1, and Python 3.12.10, in a ComfyUI install under the Stability Matrix program.

On the first day I got a few errors at the VAE encode and/or decode of latent upscales, and at one point about 5-6 AMD driver-timeout errors in quick succession, but after that it somehow just worked and never threw another error. Either that error cascade fixed something, or disabled something that wasn't needed, or maybe it just needed the PC restart I didn't do at first; I'm not sure. For the next 2 days it has just worked.

I went from around 1.8 it/s for a raw SDXL gen with ZLUDA to around 2.4 it/s (17 s down to 13 s for 32 steps), which is already a great improvement. Much more important is how much faster upscale / latent upscale / hires fix and face detailer are for me now: around 2.4 s/it (s/it, not it/s like before, but still fast enough for this) for a full 2x-upscale hires-fix run, compared to around 8 s/it before. And that was only when I was lucky and ZLUDA didn't bug out, which happened more and more often. Sometimes a simple hires fix would randomly take around 10-20 TIMES as long; I could never figure out exactly why, nothing I tried helped, and I couldn't find much about that specific issue, only similar ones. Anyway, I already suspected that was on ZLUDA's side, and it very likely was, because I have not hit that bug with ROCm a single time. After that first day I've had no other errors at all (except one more driver timeout because I tried a much too big/wrong upscale model; that one was on me), and it now just works fast and consistently.

I want to make clear how much of an impact even this early version made for me: I was THIS close to getting an overpriced NVIDIA card because I was so tired of all the problems I've had with AMD on Windows, and ZLUDA seemed to work less and less reliably for me (consistently, across many reinstalls, updates, downgrades, different WebUIs, forks, etc.)... and then someone showed me this. I just installed a fresh ComfyUI and pip-installed 3 files. Now it's faster, and the main hires-fix bug seems to be gone. This alone likely saved me from throwing almost a grand at team green just to have things work the way they do now. Thank you all so much; I can only hope AMD itself takes projects like this more seriously in the future.

I'd also like to try getting Triton and SageAttention working with this, but I'm too happy right now to finally be able to generate normally, so I might just wait. I'm also 100% clueless about coding and programming, so I would just bumble around until something might or might not work; better to leave it to people who actually know what they're doing. But let me know if you need something tested on the 7800 XT; I want to help the little I can. I'll also help make this a quickly accessible option in the Stability Matrix app, because this kind of just... seems to completely replace ZLUDA-based forks of WebUIs, at least for the GPUs supported so far.
(And by help I mean that I've mentioned it to the devs already and will test whatever they want me to test, since I'm one of the few AMD GPU test guys there.) C:
-
Hello, thank you for the builds; unfortunately, under Linux I get:
-
@scottt thanks for your builds, but I'm on the latest Python version, 3.13. Can you build for that version?
-
I'm trying to use the PyTorch wheel for my GPU (gfx1201, RX 9070, Python 3.12) on Windows. I'm getting the following error: OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed. Error loading "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\lib\c10.dll" or one of its dependencies. I can see that the DLL file is present at that location, and I've also verified its dependencies using Dependency Walker.
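One way to narrow down a WinError 1114 like this is to load torch's bundled DLLs directly and see which one actually fails to initialize. A minimal sketch (the lib_dir path is just the one from the error message above; adjust it for your install):

```python
# Sketch: probe torch's bundled DLLs one by one to surface the failing dependency.
# lib_dir is illustrative, copied from the error above; adjust for your environment.
import ctypes
import glob
import os

lib_dir = r"C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\lib"
for dll in sorted(glob.glob(os.path.join(lib_dir, "*.dll"))):
    try:
        ctypes.WinDLL(dll)
        print("ok  ", os.path.basename(dll))
    except OSError as e:
        print("FAIL", os.path.basename(dll), "->", e)
```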
-
Thanks for the great work on this! I have been using Fedora 42 with
-
Thanks, got it working on an RX 7900 XTX on Windows, with ComfyUI running Flux Dev. It works well with --use-quad-cross-attention (2.39 s/it for 1336x768 with a 1 GB LoRA), but with --use-pytorch-cross-attention it's slow (37.40 s/it). It's not a problem since quad cross-attention works fine, but is it normal for PyTorch cross-attention to be slower? Is there any way to make the --fast option work with it in ComfyUI? The code seems to modify PyTorch in some way.
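If it helps narrow down the slow --use-pytorch-cross-attention path: PyTorch exposes flags for which scaled_dot_product_attention backends are enabled, so you can check whether the fast kernels are actually active on a given build. A purely diagnostic sketch using the standard torch.backends API:

```python
# Diagnostic sketch: which SDPA backends does this torch build report as enabled?
# If only the math fallback is on, SDPA-based cross-attention will be slow.
import torch

print("flash sdp:          ", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient sdp:  ", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math sdp (fallback):", torch.backends.cuda.math_sdp_enabled())
```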
-
Thanks! I realized that the RDNA3 PyTorch wheel release is almost ready… If possible, could you provide the latest wheel? I would like to do a ComfyUI performance comparison…
-
I've just installed this in Python 3.12 (with Miniconda) on a GMKtec EVO X2 and I'm seeing a NumPy incompatibility. I get: `If you are a user of the module, the easiest solution will be to … (Triggered internally at D:\src\torch\csrc\utils\tensor_numpy.cpp:81.)`
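For what it's worth, that warning usually appears when torch was built against NumPy 1.x but is running under NumPy 2.x. A tiny round-trip test confirms whether the torch↔NumPy bridge initialized at all (just a sketch; pinning `numpy<2` is the common workaround, not necessarily what these wheels require):

```python
# Sketch: verify the torch<->numpy bridge. If this warns or fails,
# try `pip install "numpy<2"` as a first workaround.
import numpy as np
import torch

print("numpy:", np.__version__, "| torch:", torch.__version__)
t = torch.arange(4)
a = t.numpy()  # this call needs the bridge that the warning complains about
print("roundtrip ok:", bool((torch.from_numpy(a) == t).all()))
```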
-
Then I installed ComfyUI with requirements.txt. Then I load a basic image-generation model and I get an error, but in reality, pip freeze shows me NumPy is installed. I think this NumPy incompatibility relates to the message I see just by importing torch.
-
(p312) PS D:\work\python\torch> python .\test.py
Where can I download rocm6.5rc?
-
After trying to compile this everywhere and failing to even run a basic torch test, you saved my day. Really appreciated!
-
What else needs to be installed on Windows besides torch and torchaudio wheels?
-
Hi guys, can anyone tell me where to download ROCm 6.5 RC? Thank you.
-
@scottt @jammm could you help address that?
-
I'm trying to install Stable Diffusion Automatic1111 or Forge with these wheels. The problem I'm encountering: Automatic1111 and Forge were made for Python 3.10.6. Despite the text in TheRock, I tried running them on Python 3.11.9, but it didn't work. Any options for other SD forks? Like I said, ComfyUI is way too complicated for me; I need a simple interface.
-
For anyone coming here looking for gfx1151 things, TheRock's release docs now have full ROCm and Torch Python packages: https://github.com/ROCm/TheRock/blob/main/RELEASES.md

```
python -m pip install --index-url https://d2awnip2yjpvqn.cloudfront.net/v2/gfx1151/ rocm[libraries,devel]
python -m pip install --index-url https://d2awnip2yjpvqn.cloudfront.net/v2/gfx1151/ torch torchaudio torchvision pytorch-triton-rocm numpy
```

Until recently I was using everything from this thread with custom-built extra torch stuff, and it was working pretty well other than some stability issues when heavily loaded. Now I've migrated to the above and can confirm that an AMD Ryzen AI MAX+ 395 with Strix Halo Radeon 8060S works well, and is now more stable than it was, running various LLMs and other AI workloads, including image/video gen via Torch + ROCm. I've additionally got it running in a VM with PCIe passthrough on Proxmox, and it's all working pretty smoothly now. Just wanted to leave this here for anyone who comes after.

Other releases on that page as of writing are: gfx94X-dcgpu, gfx950-dcgpu, gfx110X-dgpu, gfx120X-all
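As a quick smoke test after those installs, something like the following is enough to confirm the GPU is visible and a real kernel runs (a minimal sketch, nothing specific to these wheels; ROCm devices appear through torch's CUDA API):

```python
# Post-install smoke test: confirm the GPU is visible and a real kernel runs.
import torch

print("torch:", torch.__version__)
print("gpu available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    x = torch.randn(2048, 2048, device="cuda")
    print("matmul checksum:", (x @ x).sum().item())
```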
-
Tried to figure out why the performance of the official builds is so bad...
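For anyone wanting to reproduce that kind of comparison, a rough throughput check is enough to A/B-test one wheel build against another on the same GPU (a sketch; sizes and dtype are arbitrary choices, not a standard benchmark):

```python
# Rough fp16 matmul throughput check for comparing wheel builds on the same GPU.
import time
import torch

a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
torch.cuda.synchronize()  # finish setup before timing

t0 = time.perf_counter()
for _ in range(50):
    a @ b
torch.cuda.synchronize()  # wait for all 50 matmuls to complete
dt = time.perf_counter() - t0

# 2*N^3 FLOPs per NxN matmul, 50 iterations
print(f"{2 * 4096**3 * 50 / dt / 1e12:.2f} TFLOP/s")
```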
-
Mine is at 64/64 as well. It sure will be nice when it's fixed. Even some of the text-to-image blows up if upscaled to a higher resolution: it freaks out at the end of the workflow and walks all over display memory. But if I change the upscale to be the same size as the input, it works fine.

On Thu, Aug 14, 2025, 1:29 PM Ben Jamin wrote:
> @RSabbagh52 I have the same box :) What split have you got the RAM/VRAM set to in the BIOS?
> I've got mine running at 64 GB/64 GB, as most of the time I can't get larger-than-RAM models to load into VRAM, so having 96 GB of VRAM isn't that useful (except for having several models cached when using Ollama serve or similar). An exception to that was LM Studio, where I could reliably get it to load 80 GB of model into VRAM. The half-and-half split does seem to work well in ComfyUI for the text/image-to-video you're wanting, though.
-
Hey guys, today when I was using the ROCm 7 rc20250814 build, I got a core dump when starting ComfyUI. Earlier builds started normally... I'm very sad. When will a stable version appear? After all, it's been many days since I bought my AI Max 395 (ㄒoㄒ)
-
Hello, and thank you for your incredible work on developing and providing these PyTorch builds. I am trying to get ComfyUI running on a Windows PC with a Strix Point APU (Radeon 890M, gfx1150), but I'm encountering a persistent `HIP error: invalid device function` during image generation (KSampler execution) that I haven't been able to solve. I was hoping to get some advice.

Environment:
- APU: AMD Ryzen AI 9 HX 370 (Radeon 890M, gfx1150)
- OS: Windows 11
- Python: 3.12.10 (in a venv)
- HIP SDK: 5.7.1 (with lshqqytiger's DLL patch applied)
- PyTorch build: I have tried several native ROCm builds from this repository, including v6.5.0rc and v6.0.0.

Problem description: ComfyUI starts up and loads models correctly. However, when the KSampler begins the image generation process, it always fails with the HIP error above.

What I've tried: I have attempted nearly every possible workaround to resolve this error:
- Environment variable: set HSA_OVERRIDE_GFX_VERSION=11.0.0.
- Command-line arguments: --use-pytorch-cross-attention (to switch the attention implementation), --force-fp32 (to force 32-bit precision), --disable-ipex-optimize (to disable Intel optimizations).
- ComfyUI settings: tried multiple samplers and schedulers (e.g., euler, dpmpp_2m, karras).
- Library dependencies: resolved all version conflicts with NumPy and OpenCV.
- PyTorch builds: tested multiple .whl versions from this repository, but the result was identical.

Question: Since the error persists after all these countermeasures, I suspect there might be a fundamental incompatibility between the current PyTorch builds and the gfx1150 architecture. Is there anything else I could try, or any point I might have missed? Any help or insight would be greatly appreciated.
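One more thing worth checking for `HIP error: invalid device function`: whether the wheel ships compiled kernels for your gfx target at all. A small diagnostic sketch using torch's standard arch-list API:

```python
# Diagnostic sketch: list the GPU architectures this torch build was compiled for.
# If gfx1150 is absent from the list, "invalid device function" is the
# expected failure mode on that APU.
import torch

print("torch:", torch.__version__)
print("compiled for:", torch.cuda.get_arch_list())
if torch.cuda.is_available():
    print("detected device:", torch.cuda.get_device_name(0))
```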
-
Download the gfx1151 wheels from https://github.com/scottt/rocm-TheRock/releases/v6.5.0rc-pytorch
Download the Windows gfx110x and gfx1201 wheels from https://github.com/scottt/rocm-TheRock/releases/tag/v6.5.0rc-pytorch-gfx110x
Features and Known Problems
- torch.nn.functional.scaled_dot_product_attention backed by aotriton 0.9.2
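For reference, a minimal call into that SDPA path (standard PyTorch API; the shapes and dtype below are arbitrary):

```python
# Minimal scaled_dot_product_attention call; on these wheels it should
# dispatch to the aotriton-backed kernels noted above.
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```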