Lightning Accelerate aims to provide a simple, easy-to-use framework for training deep learning models on GPUs, TPUs, etc. with 🤗 Huggingface's Accelerate and the style of ⚡️ PyTorch Lightning.
To install Lightning Accelerate, run this command:
```bash
pip install git+https://github.com/hoang1007/lightning-accelerate.git
```

Features:

- Support training with multiple GPUs, TPUs, etc.
- Support finetuning models efficiently with LoRA (see the sketch after this list)
- Support several optimization techniques such as mixed precision, DeepSpeed, bitsandbytes, etc.
- Support experiment tracking with Wandb and TensorBoard
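For the LoRA feature above, here is a minimal sketch of what LoRA finetuning typically looks like, using 🤗 PEFT directly. This framework's own LoRA integration may expose a different API, so treat the names below (`bert-base-uncased`, `target_modules`) as illustrative assumptions:

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

# Hypothetical example: wrap a pretrained model with LoRA adapters via 🤗 PEFT.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

lora_config = LoraConfig(
    r=8,                    # rank of the low-rank update matrices
    lora_alpha=16,          # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["query", "value"],  # attention projections to adapt (model-specific)
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable
```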
To train a model, you need to define a `TrainingModule` and a `DataModule`. Here is a simple example of training a digit classifier on the MNIST dataset:
```python
import torch.nn as nn
from torch.utils.data import Dataset
from torchvision import transforms
from torchvision.datasets import MNIST

# Adjust this import to match the package's actual module path.
from lightning_accelerate import (
    TrainingModule, DataModule, TrainingArguments, Trainer
)

# -------------------
# Step 1: Define a TrainingModule.
# This module contains the model plus the training and evaluation logic,
# so the `Trainer` can run training easily later.
# -------------------
class MnistTrainingModule(TrainingModule):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

    def training_step(self, batch, batch_idx: int, optimizer_idx: int):
        x, y = batch
        logits = self.model(x)
        loss = nn.functional.cross_entropy(logits, y)
        return loss

    def get_optim_params(self):
        return self.model.parameters()

# -------------------
# Step 2: Define a DataModule. This module contains the data preparation
# logic (downloading, preprocessing, etc.) and feeds data to the
# `TrainingModule` for training and evaluation.
# -------------------
class MnistDataModule(DataModule):
    def prepare_data(self):
        # Download the data here so it is not downloaded again in every process.
        MNIST("root", train=True, download=True)
        MNIST("root", train=False, download=True)

    def get_training_dataset(self) -> Dataset:
        return MNIST(
            "root",
            train=True,
            transform=transforms.Compose([
                transforms.RandomAffine(15), transforms.ToTensor()
            ]),
        )

    def get_validation_dataset(self) -> Dataset:
        return MNIST("root", train=False, transform=transforms.ToTensor())

# -------------------
# Step 3: Configure parameters with `TrainingArguments` and start training!
# -------------------
args = TrainingArguments("mnist", train_batch_size=32, num_epochs=10)
training_module = MnistTrainingModule()
data_module = MnistDataModule()

Trainer(
    training_module=training_module,
    training_args=args,
    data_module=data_module,
).fit()
```

You can accelerate the training process with several techniques such as mixed precision, DeepSpeed, etc., which are supported by Accelerate. For details, please refer to Accelerate's documentation.
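If you have not set up Accelerate on your machine yet, you can create a configuration interactively with Accelerate's standard setup command (this is the stock 🤗 Accelerate CLI, not something specific to this framework):

```bash
accelerate config
```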
For example, to train your models on multiple GPUs, you can run:

```bash
accelerate launch --multi_gpu my_script.py
```
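In the same spirit, mixed precision and DeepSpeed can be enabled through Accelerate's launcher. The flags below are stock 🤗 Accelerate CLI options, shown here as a sketch:

```bash
# Train with fp16 mixed precision
accelerate launch --mixed_precision fp16 my_script.py

# Train with DeepSpeed (requires the deepspeed package)
accelerate launch --use_deepspeed my_script.py
```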
To evaluate a pretrained model, you can use the `Trainer.evaluate` method:

```python
args = TrainingArguments(
    "mnist",
    eval_batch_size=32,
    # Set `resume_from_checkpoint` to the path of the checkpoint you want
    # to evaluate, or to `latest` to evaluate the latest checkpoint.
    resume_from_checkpoint="latest",
)
training_module = MnistTrainingModule()
data_module = MnistDataModule()

# The Trainer automatically loads the checkpoint and evaluates the model.
Trainer(
    training_module=training_module,
    training_args=args,
    data_module=data_module,
).evaluate()
```

I built the framework on top of Huggingface's Accelerate with minimal requirements while keeping the code style as close as possible to PyTorch Lightning 😊.
I am an inexperienced developer, so I am very happy to receive contributions that improve the framework's code quality and features. Please feel free to open an issue or pull request 🥰.
Special thanks to Huggingface's Accelerate and PyTorch Lightning for providing great frameworks for training deep learning models.