Skip to content

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Notifications You must be signed in to change notification settings

littlepan0413/mixture-of-experts

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

The Sparsely Gated Mixture of Experts Layer for PyTorch

source: https://techburst.io/outrageously-large-neural-network-gated-mixture-of-experts-billions-of-parameter-same-d3e901f2fe05

This repository contains the PyTorch re-implementation of the MoE layer described in the paper Outrageously Large Neural Networks for PyTorch.

Requirements

This example was tested using torch v1.0.0 and Python v3.6.1 on CPU.

To install the requirements run:

pip install -r requirements.py

Example

The file example.py contains an example illustrating how to train and evaluate the MoE layer with dummy inputs and targets. To run the example:

python example.py

Acknowledgements

The code is based on the TensorFlow implementation that can be found here.

About

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%