This repository was archived by the owner on Jun 23, 2025. It is now read-only.
Swish activation doesn't save the weight if beta is not trainable #482
Closed
Description
It would be very useful to always save the beta value in the weights file, even when beta is not trainable. This matters when converting a model to a lightweight inference package such as LWTNN (https://github.com/lwtnn/lwtnn).
I already have an implementation of the Swish activation that preserves all the features of the current one but also saves the non-trainable beta in the weights file, and I would like to open a pull request for it.
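To make the request concrete, here is a minimal sketch of the idea, not the implementation mentioned above. It assumes a Keras installation importable as `tensorflow.keras`; the class name `SwishSavedBeta` and the `trainable_beta` argument are hypothetical names chosen for illustration. The key point is that beta is always registered through `add_weight`, so it ends up in the saved weights whether or not it is trainable.

```python
import numpy as np
from tensorflow import keras


class SwishSavedBeta(keras.layers.Layer):
    """Swish activation, x * sigmoid(beta * x), with beta always stored as a weight.

    Hypothetical illustration only: the class and argument names are assumptions.
    """

    def __init__(self, beta=1.0, trainable_beta=False, **kwargs):
        super().__init__(**kwargs)
        self.beta_init = float(beta)
        self.trainable_beta = bool(trainable_beta)

    def build(self, input_shape):
        # Register beta as a layer weight in both cases. With trainable=False
        # it becomes a non-trainable weight, which is still part of
        # get_weights() and therefore of any saved weights file.
        self.beta = self.add_weight(
            name="beta",
            shape=(),
            initializer=keras.initializers.Constant(self.beta_init),
            trainable=self.trainable_beta,
        )
        super().build(input_shape)

    def call(self, inputs):
        return inputs * keras.activations.sigmoid(self.beta * inputs)

    def get_config(self):
        # Keep the constructor arguments so the layer can be re-created.
        config = super().get_config()
        config.update({"beta": self.beta_init, "trainable_beta": self.trainable_beta})
        return config


# Usage: beta is frozen, yet it is still listed among the layer's weights.
layer = SwishSavedBeta(beta=1.5, trainable_beta=False)
outputs = layer(np.array([[0.0, 1.0, -2.0]], dtype="float32"))
saved_weights = layer.get_weights()       # contains the single beta value
frozen = len(layer.trainable_weights)     # no trainable weights when beta is fixed
```

With this layout, a converter only has to read beta from the weights, the same way it would for a trainable beta, instead of recovering it from the layer configuration.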