This can be safely replaced with .expand(-1, x.shape[-1]), which produces the same result but avoids the unnecessary memory duplication caused by repeat(). Since this index tensor is not modified after creation, .expand() provides a more memory- and compute-efficient way to achieve the same broadcasting effect.
✅ Suggested Change
exp_token_idx.view(-1, 1).expand(-1, x.shape[-1])
✅ Benefits
🧠 Reduces memory usage (especially when x.shape[-1] is large)
🚀 Slightly improves performance (no data duplication)
🧩 Same behavior and output as before
📦 Cleaner and more efficient indexing logic
Note: This is safe because the tensor is used only as a read-only index. expand() returns a broadcast view (stride 0 along the expanded dimension) without copying data, so the result must not be written to in place.
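To make the equivalence concrete, here is a minimal sketch that compares the two index tensors. The shapes, the values, and the use of scatter_add_ as the consuming op are assumptions chosen for illustration; only exp_token_idx and x are names taken from the suggestion above, and expert_out is a hypothetical source tensor.

```python
import torch

# Hypothetical shapes and values, for illustration only.
num_tokens, hidden = 4, 8
x = torch.zeros(num_tokens, hidden)
exp_token_idx = torch.tensor([2, 0, 3])                  # destination rows
expert_out = torch.ones(exp_token_idx.numel(), hidden)   # hypothetical source

# repeat() materializes a full (3, hidden) copy of the index.
idx_repeat = exp_token_idx.view(-1, 1).repeat(1, hidden)
# expand() returns a view over the original 3 elements.
idx_expand = exp_token_idx.view(-1, 1).expand(-1, hidden)

same_values = torch.equal(idx_repeat, idx_expand)
shares_storage = idx_expand.data_ptr() == exp_token_idx.data_ptr()
expand_stride = idx_expand.stride()   # last entry is 0 for the broadcast dim

# Assumed consumer: a row-wise scatter that only reads the index.
out_repeat = x.clone().scatter_add_(0, idx_repeat, expert_out)
out_expand = x.clone().scatter_add_(0, idx_expand, expert_out)
same_result = torch.equal(out_repeat, out_expand)

print(same_values, shares_storage, expand_stride, same_result)
```

The expanded index shares storage with exp_token_idx, so the saving grows with x.shape[-1]. If a downstream op needs a contiguous index, it may copy the data internally, in which case the saving is smaller than this sketch suggests.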