
Support layernorm recompute for fused hstu layer#59

Merged
shijieliu merged 4 commits into NVIDIA:main from JacoCheung:junzhang/recompute_layernorm
Jun 9, 2025

Conversation

@JacoCheung
Collaborator

Description

This PR addresses #6. Currently only the input layer norm is recomputed, which incurs a slight (~1%) performance drop in the backward pass on an A100-PCIe-80G, measured with dim_per_heads=128, num_heads=4, seqlen=4096, batchsize=32, embedding_dim=512.
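The memory/compute trade-off described above can be sketched in plain Python. This is only an illustrative sketch, not the fused kernel this PR modifies: the `RecomputeLayerNorm` class below is hypothetical. The idea is that the forward pass saves only the raw input, and the backward pass re-derives the normalized activations instead of keeping them resident, spending a little extra compute to free activation memory.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a 1-D list to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]

class RecomputeLayerNorm:
    """Hypothetical sketch of the recompute pattern: store only the raw
    input in forward; recompute the normalized activations in backward
    rather than caching them, trading compute for activation memory."""

    def forward(self, x):
        self.saved_input = x      # only the input is kept resident
        return layer_norm(x)      # the normalized output is NOT saved

    def backward(self):
        # Recompute the normalized activations the gradient math needs.
        return layer_norm(self.saved_input)

ln = RecomputeLayerNorm()
y = ln.forward([1.0, 2.0, 3.0, 4.0])
y_recomputed = ln.backward()
assert y == y_recomputed  # recomputation reproduces the forward result
```

In the actual fused HSTU layer the recomputation happens inside the backward kernel, which is where the reported ~1% backward slowdown comes from.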

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@JacoCheung
Collaborator Author

Please track the CI status.

@JacoCheung JacoCheung requested a review from shijieliu June 5, 2025 01:51
Comment thread examples/commons/utils/clear_tensor_data.py
Comment thread examples/hstu/configs/hstu_config.py
Comment thread examples/hstu/ops/fused_hstu_op.py
@JacoCheung
Collaborator Author

Updated CI

@shijieliu shijieliu merged commit f3b6798 into NVIDIA:main Jun 9, 2025
@shijieliu shijieliu mentioned this pull request Jun 12, 2025
@JacoCheung JacoCheung deleted the junzhang/recompute_layernorm branch February 2, 2026 03:37