Commit 874bc3e

[Perf] AdamWConfig: enable fused AdamW kernel
Pass ``fused=True`` to ``torch.optim.AdamW`` on the non-foreach path. The fused implementation batches the element-wise AdamW update for many parameter tensors into a small number of CUDA kernel launches instead of a chain of small kernels per parameter, which cuts launch overhead and optimizer-step time at large parameter counts.
1 parent 8a0efa8 commit 874bc3e

1 file changed

xtuner/v1/config/optim.py

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ def build(self, model):
                 foreach=self.foreach,
             )
         return torch.optim.AdamW(
-            params, lr=self.lr, betas=self.betas, eps=self.eps, weight_decay=self.weight_decay, foreach=self.foreach
+            params, lr=self.lr, betas=self.betas, eps=self.eps, weight_decay=self.weight_decay, foreach=self.foreach, fused=True,
         )
 
 
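Review note: ``torch.optim.AdamW`` raises a ``RuntimeError`` when ``foreach`` and ``fused`` are both ``True``, so the changed line only works while ``self.foreach`` is ``False`` or ``None``. The helper below is a minimal sketch of one way to guard the flags before the call. The name ``adamw_kernel_kwargs`` and the ``all_params_on_cuda`` argument are hypothetical and not part of xtuner. Gating on CUDA is a conservative assumption, since fused support on other devices depends on the PyTorch version.

```python
from typing import Dict, Optional


def adamw_kernel_kwargs(foreach: Optional[bool], all_params_on_cuda: bool) -> Dict[str, Optional[bool]]:
    """Pick a compatible (foreach, fused) pair for torch.optim.AdamW.

    Hypothetical helper, not part of xtuner. PyTorch rejects
    foreach=True together with fused=True, so an explicit foreach
    request wins and fused is switched off in that case.
    """
    if foreach is True:
        # The caller explicitly asked for the foreach implementation.
        return {"foreach": True, "fused": False}
    if all_params_on_cuda:
        # Non-foreach path: the fused kernel can be enabled.
        return {"foreach": foreach, "fused": True}
    # Assumption: stay on the default implementation off CUDA.
    return {"foreach": foreach, "fused": False}


if __name__ == "__main__":
    print(adamw_kernel_kwargs(foreach=None, all_params_on_cuda=True))
    print(adamw_kernel_kwargs(foreach=True, all_params_on_cuda=True))
```

In ``build``, the result could be unpacked into the constructor in place of the hard-coded flags, for example ``torch.optim.AdamW(params, lr=self.lr, ..., **adamw_kernel_kwargs(self.foreach, on_cuda))``, where ``on_cuda`` would be computed from the parameter devices.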