Releases: helpmefindaname/transformer-smaller-training-vocab
0.2.1
What's Changed
- Fix saving of reduced models: when models are saved while reduced, the vocab size is now set correctly in the config, so the user can load the model again regardless of whether the context manager is still active (see the sketch after this list). by @helpmefindaname in #4
- If the embeddings are frozen (not trainable), the reduced embeddings will also be frozen. by @helpmefindaname in #4
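A minimal sketch of the fixed behaviour, assuming a Hugging Face AutoModel/AutoTokenizer checkpoint and that texts accepts a plain list of strings; the model name, paths and texts are placeholders:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformer_smaller_training_vocab import reduce_train_vocab

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
texts = ["a few training sentences", "covering the vocab that is actually used"]

# embeddings frozen before the reduction stay frozen afterwards (second bullet above)
model.get_input_embeddings().weight.requires_grad = False

with reduce_train_vocab(model=model, tokenizer=tokenizer, texts=texts):
    # ... training happens here ...
    # saving while reduced now writes the reduced vocab size to the config,
    # so this checkpoint can be loaded on its own later
    model.save_pretrained("reduced-checkpoint")
    tokenizer.save_pretrained("reduced-checkpoint")

# loading works whether or not the context manager is still active
reloaded = AutoModelForSequenceClassification.from_pretrained("reduced-checkpoint")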
Full Changelog: 0.2.0...0.2.1
0.2.0
What's Changed
- Update Optimizer Parameters by @helpmefindaname in #3
Introduces an optional optimizer parameter for reduce_train_vocab, which can be used to update the optimizer's parameter groups so that the replaced embedding parameters are exchanged for the reduced ones. An example usage is the following:
from torch.optim import Adam
from transformer_smaller_training_vocab import get_texts_from_dataset, reduce_train_vocab

model = ...
tokenizer = ...
optimizer = Adam(model.parameters(), lr=...)
...
with reduce_train_vocab(model=model, tokenizer=tokenizer, texts=get_texts_from_dataset(raw_datasets, key="text"), optimizer=optimizer):
    train_with_optimizer(model, tokenizer, optimizer)

save_model()  # save model at the end to contain the full vocab again.
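For context, a rough sketch of what exchanging an embedding parameter inside a torch optimizer's parameter groups amounts to; this is only an illustration, not the library's internal code, and all names are placeholders:

import torch

old_embedding = torch.nn.Embedding(30000, 128)   # original (full-vocab) embedding
new_embedding = torch.nn.Embedding(500, 128)     # reduced-vocab embedding
optimizer = torch.optim.Adam(old_embedding.parameters(), lr=1e-3)

# swap the old weight tensor for the new one in every parameter group,
# so the optimizer updates the reduced embedding instead of the original
for group in optimizer.param_groups:
    group["params"] = [
        new_embedding.weight if p is old_embedding.weight else p
        for p in group["params"]
    ]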
Full Changelog: 0.1.8...0.2.0
0.1.8
What's Changed
- Lower dependency requirements to transformers 4.1, torch 1.8 and datasets 2.0, as the package was previously too restrictive
Full Changelog: 0.1.7...0.1.8
0.1.7
Full Changelog: 0.1.6...0.1.7
0.1.0
Initial setup, hello world!
So far, support for FastTokenizers, BertTokenizer, RobertaTokenizer and XLMRobertaTokenizer has been added.
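A minimal usage sketch with one of the supported tokenizers, assuming the reduce_train_vocab context manager shown in the later releases above; the model name, texts and training step are placeholders:

from transformers import BertForSequenceClassification, BertTokenizer
from transformer_smaller_training_vocab import reduce_train_vocab

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # slow BertTokenizer, one of the supported classes
texts = ["placeholder training sentence"]

with reduce_train_vocab(model=model, tokenizer=tokenizer, texts=texts):
    ...  # train as usual; only vocab entries that occur in texts are kept

# after the context manager exits, the full vocabulary is restored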