Closed
Description
I think it is a good idea to start thinking about how to implement this sort of layer in Keras.
I know it is a very recent algorithm, but I believe it will be a cutting-edge technique in deep learning over the next few years.
Paper: Attention is all you need (https://arxiv.org/abs/1706.03762)
Blog showing some results: Google Research Blog
Tensor2Tensor library: tensor2tensor
PyTorch implementation: pytorch-t2t
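
To make the request more concrete, here is a rough sketch of the scaled dot-product attention from the paper, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, for a single head with no masking. It is plain Python rather than a Keras layer, and the names `softmax` and `scaled_dot_product_attention` are just my own choices for illustration, not an existing Keras API. A real layer would presumably add learned projections, multiple heads, and masking on top of this.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention without masking (illustrative helper).

    queries: list of n_q vectors, each of length d_k
    keys:    list of n_k vectors, each of length d_k
    values:  list of n_k vectors, each of length d_v
    Returns (outputs, weights): n_q output vectors of length d_v and
    the n_q x n_k attention weights.
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy inputs, made up for illustration only.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = scaled_dot_product_attention(Q, K, V)
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the value vectors.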