Argmax Softmax #627
Conversation
I like the idea, but I think the implementation could be handled differently. Am I right in thinking that this "strategy argmax" effectively replaces the softmax function with a linear activation? I also think it is not only softmax that can be replaced in this way, but any activation function that meets the criteria you described (which is most of them).
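The equivalence being assumed here can be checked numerically: softmax is a strictly increasing function of each logit, so the index of the largest output never changes when the softmax is dropped. A minimal NumPy sketch (independent of the actual hls4ml code):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
for _ in range(1000):
    logits = rng.normal(size=10)
    # argmax of the raw logits equals argmax of the softmax outputs,
    # so for picking the predicted class the softmax layer can be skipped.
    assert np.argmax(logits) == np.argmax(softmax(logits))
```

The same check passes for any strictly monotone elementwise activation, which is the criterion being discussed.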
I agree with Sioni's comments, and I would add that there could be two different optimizer passes depending on the choice of strategy:
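To illustrate what such a strategy-dependent optimizer pass might look like, here is a hedged sketch. Everything below (the function name, representing a model as a list of layer dicts) is hypothetical and does not reflect the actual hls4ml optimizer API:

```python
# Hypothetical sketch: a "model" is just a list of layer dicts here,
# not the real hls4ml internal representation.

def replace_softmax_with_linear(model, strategy):
    """If strategy is 'argmax', rewrite softmax activations to linear.

    Softmax is strictly monotone in each logit, so removing it does not
    change which output index is the largest.
    """
    if strategy != 'argmax':
        return model
    out = []
    for layer in model:
        if layer.get('activation') == 'softmax':
            # Copy the layer so the original model is left untouched.
            layer = {**layer, 'activation': 'linear'}
        out.append(layer)
    return out

model = [
    {'name': 'dense1', 'activation': 'relu'},
    {'name': 'output', 'activation': 'softmax'},
]
optimized = replace_softmax_with_linear(model, 'argmax')
```

A second pass for the other strategy choice would then be selected the same way, keyed on the configured strategy.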
I agree with the comments. I didn't remove the Softmax layer completely because the test uses a simple one-layer network, which wouldn't really work if the layer were removed. @thesps you're right, Softmax is essentially replaced with a linear activation in this PR. My proposed change is then two "implementations":
I would avoid having a custom Keras layer, for two reasons:
Apologies for the late follow-up review. This looks really nice now; all these implemented options are useful. The remaining problems are:
Since Benjamin is back at university now, I have resolved the two issues. Can you merge now, @thesps?
Argmax Softmax
# Description
Two implementations are added:
# Type of change
# Tests
# Checklist