Simple Transformer implementation from scratch in PyTorch.
The current models are designed to demonstrate the simplicity of transformer models and self-attention. As such, they will not scale as far as bigger transformers; for that you'll need a number of tricks that complicate the code (see the blog post for details).
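To give a sense of that simplicity, here is a minimal sketch of basic self-attention (single head, no learned parameters); the function name and tensor shapes are illustrative, not taken verbatim from this repository:

```python
import torch
import torch.nn.functional as F

def basic_self_attention(x):
    """Minimal self-attention: x has shape (batch, sequence, embedding)."""
    # Raw attention scores: dot product between every pair of input vectors.
    raw = torch.bmm(x, x.transpose(1, 2))   # (b, t, t)

    # Normalize each row into a probability distribution over the sequence.
    weights = F.softmax(raw, dim=2)         # (b, t, t)

    # Each output vector is a weighted sum of all input vectors.
    return torch.bmm(weights, x)            # (b, t, e)

x = torch.randn(4, 10, 32)   # 4 sequences of 10 tokens, 32-dim embeddings
print(basic_self_attention(x).shape)  # torch.Size([4, 10, 32])
```

Everything else in a transformer (learned key/query/value projections, multiple heads, scaling) is elaboration on this core operation.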
All models so far are a single stack of transformer blocks (that is, no encoder/decoder structures). It turns out that this simple configuration often works best.
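For illustration, a rough sketch of what such a single-stack model looks like. Class names and hyperparameters here are made up, and PyTorch's built-in `nn.MultiheadAttention` stands in for the repository's own attention layer:

```python
import torch
from torch import nn

class Block(nn.Module):
    """One transformer block: self-attention, then a feed-forward layer,
    each followed by a residual connection and layer normalization."""
    def __init__(self, emb, heads=4, ff_mult=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(emb, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(emb)
        self.norm2 = nn.LayerNorm(emb)
        self.ff = nn.Sequential(
            nn.Linear(emb, ff_mult * emb),
            nn.ReLU(),
            nn.Linear(ff_mult * emb, emb),
        )

    def forward(self, x):
        attended, _ = self.attn(x, x, x)
        x = self.norm1(x + attended)
        return self.norm2(x + self.ff(x))

class SimpleTransformer(nn.Module):
    """A single stack of transformer blocks, no encoder/decoder split."""
    def __init__(self, emb=64, depth=6):
        super().__init__()
        self.blocks = nn.Sequential(*(Block(emb) for _ in range(depth)))

    def forward(self, x):
        return self.blocks(x)

model = SimpleTransformer()
out = model(torch.randn(2, 10, 64))  # (batch, tokens, embedding)
print(out.shape)                     # torch.Size([2, 10, 64])
```

A task-specific head (e.g. a linear layer projecting to class logits or to vocabulary logits) would sit on top of this stack.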