Transformer

A simple Transformer implementation from scratch in PyTorch.

Classification Transformer

Generation Transformer

The original transformer: encoders and decoders

Limitations

The current models are designed to show the simplicity of transformer models and self-attention. As such, they will not scale as far as the bigger transformers. For that, you'll need a number of tricks that complicate the code (see the blog post for details).
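To give a sense of how little code self-attention needs, here is a minimal sketch of an unparameterized version. It is not the code from this repository: the function name `basic_self_attention` and the scaling by the square root of the embedding size are assumptions made for illustration, and a real model would add learned query, key and value projections.

```python
import torch
import torch.nn.functional as F


def basic_self_attention(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: x has shape (batch, seq_len, emb)."""
    emb = x.size(-1)
    # Dot product of every token with every other token: (batch, seq_len, seq_len).
    scores = torch.bmm(x, x.transpose(1, 2)) / emb ** 0.5
    # Normalize each row so the weights for one token sum to 1.
    weights = F.softmax(scores, dim=-1)
    # Each output token is a weighted average of all input tokens.
    return torch.bmm(weights, x)
```

Calling it on a tensor of shape `(2, 5, 8)` returns a tensor of the same shape: one new vector per input token.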

All models so far are a single stack of transformer blocks (that is, no encoder/decoder structures). It turns out that this simple configuration often works best.
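A single stack of blocks might look roughly like the sketch below. The class names `Block` and `StackClassifier`, the feed-forward width of four times the embedding size, and the mean-pooling before the output layer are assumptions for illustration, not necessarily what this repository does. The sketch uses PyTorch's built-in `nn.MultiheadAttention` rather than a from-scratch attention layer, and it expects input that has already been embedded.

```python
import torch
from torch import nn


class Block(nn.Module):
    """Hypothetical transformer block: self-attention, then a feed-forward layer."""

    def __init__(self, emb: int, heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(emb, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(emb)
        self.ff = nn.Sequential(
            nn.Linear(emb, 4 * emb), nn.ReLU(), nn.Linear(4 * emb, emb)
        )
        self.norm2 = nn.LayerNorm(emb)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(x, x, x)   # queries, keys and values all come from x
        x = self.norm1(x + attended)       # residual connection, then layer norm
        return self.norm2(x + self.ff(x))  # same pattern around the feed-forward layer


class StackClassifier(nn.Module):
    """Hypothetical classifier: a plain stack of blocks, no encoder/decoder split."""

    def __init__(self, emb: int, heads: int, depth: int, num_classes: int):
        super().__init__()
        self.blocks = nn.Sequential(*[Block(emb, heads) for _ in range(depth)])
        self.to_logits = nn.Linear(emb, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.blocks(x)                     # (batch, seq_len, emb)
        return self.to_logits(x.mean(dim=1))   # pool over the sequence, then classify
```

For example, `StackClassifier(emb=16, heads=4, depth=2, num_classes=3)` applied to a tensor of shape `(2, 5, 16)` produces logits of shape `(2, 3)`.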

Source: http://peterbloem.nl/blog/transformers