
Multi Head Attention Layer #7803

Closed

Description

@grafael

I think it would be a good idea to start thinking about how to implement this sort of layer in Keras.
I know it is a very recent algorithm, but I believe it will be cutting-edge technology in deep learning for the next few years.

Paper: Attention Is All You Need (https://arxiv.org/abs/1706.03762)

Blog showing some results: Google Research Blog
Tensor2Tensor library: tensor2tensor
PyTorch implementation: pytorch-t2t
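
For discussion, here is a minimal sketch of what such a layer could look like as a custom Keras layer, implementing the scaled dot-product multi-head attention from the paper. The class name, argument names, and defaults are illustrative assumptions on my part, not a proposed official API, and it omits masking and dropout for brevity:

```python
# Sketch of multi-head scaled dot-product attention as a custom Keras layer.
# Names and hyperparameters are illustrative assumptions, not a final API.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

class MultiHeadAttention(layers.Layer):
    def __init__(self, num_heads=8, head_dim=64, **kwargs):
        super().__init__(**kwargs)
        self.num_heads = num_heads
        self.head_dim = head_dim
        self.model_dim = num_heads * head_dim

    def build(self, input_shape):
        # Learned linear projections for queries, keys, values, and output.
        self.wq = layers.Dense(self.model_dim, use_bias=False)
        self.wk = layers.Dense(self.model_dim, use_bias=False)
        self.wv = layers.Dense(self.model_dim, use_bias=False)
        self.wo = layers.Dense(self.model_dim, use_bias=False)

    def _split_heads(self, x):
        # (batch, seq, model_dim) -> (batch, heads, seq, head_dim)
        batch = tf.shape(x)[0]
        x = tf.reshape(x, (batch, -1, self.num_heads, self.head_dim))
        return tf.transpose(x, [0, 2, 1, 3])

    def call(self, query, key, value):
        q = self._split_heads(self.wq(query))
        k = self._split_heads(self.wk(key))
        v = self._split_heads(self.wv(value))

        # Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V
        scores = tf.matmul(q, k, transpose_b=True)
        scores /= tf.sqrt(tf.cast(self.head_dim, scores.dtype))
        weights = tf.nn.softmax(scores, axis=-1)
        context = tf.matmul(weights, v)

        # (batch, heads, seq, head_dim) -> (batch, seq, model_dim)
        batch = tf.shape(context)[0]
        context = tf.transpose(context, [0, 2, 1, 3])
        context = tf.reshape(context, (batch, -1, self.model_dim))
        return self.wo(context)

# Self-attention over a toy batch: query = key = value.
x = np.random.randn(2, 10, 512).astype("float32")
layer = MultiHeadAttention(num_heads=8, head_dim=64)
print(layer(x, x, x).shape)  # (2, 10, 512)
```

A real implementation would also need padding/causal masks and attention dropout, but the core of the technique is just the per-head projection, the scaled dot product, and the final concatenation shown above.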
