chaseklvk/eellama

Implementation of Meta AI's LLaMA model, modified to support early-exiting output networks. The ultimate goal is to deploy a head model on an edge device for fast inference. This implementation is optimized to run on MPS; future iterations, in which tail models are designed to run in the cloud, will need to be modified for CUDA support.
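Since no code from the repository appears here, the following is a minimal sketch of the early-exit idea described above, assuming a PyTorch-style decoder stack. All names (`pick_device`, `ExitHead`, `forward_with_early_exit`, `threshold`) are hypothetical and not taken from this repository; LLaMA proper uses RMSNorm, and a plain `LayerNorm` stands in here for brevity.

```python
import torch
import torch.nn as nn

# Hypothetical device selection: prefer MPS (the stated edge target),
# fall back to CUDA (the stated cloud target), then CPU.
def pick_device() -> torch.device:
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

class ExitHead(nn.Module):
    """Hypothetical lightweight output head attached to an intermediate layer."""
    def __init__(self, dim: int, vocab_size: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.proj = nn.Linear(dim, vocab_size, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.proj(self.norm(h))

def forward_with_early_exit(layers, exit_heads, h, threshold: float = 0.9):
    """Run decoder layers in order; stop once an exit head is confident enough.

    `layers` and `exit_heads` are parallel lists; `h` is the embedded input
    of shape (batch, seq_len, dim). Returns next-token logits.
    """
    logits = None
    for layer, head in zip(layers, exit_heads):
        h = layer(h)
        logits = head(h[:, -1, :])                # next-token logits at this depth
        confidence = logits.softmax(dim=-1).max(dim=-1).values
        if bool((confidence > threshold).all()):  # whole batch is confident
            return logits                         # exit early, skip deeper layers
    return logits                                 # fell through to the final head
```

Under this reading, the layers up to the chosen exit form the head model that runs on the edge device, while the remaining layers (the tail) would run in the cloud whenever confidence stays below the threshold.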