gpt2-tensorflow-to-pytorch-converter

This repository contains a quick-to-use script to convert GPT-2 models from TensorFlow to PyTorch model format.

Usage

Collect all your TensorFlow model files into a single directory, namely these files:

model-<number>.meta
vocab.bpe
model-<number>.data-00000-of-00001
model-<number>.index
checkpoint
counter
encoder.json
hparams.json
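
Before running the conversion, it can help to confirm that the directory really contains everything listed above. The following is a minimal sketch and is not part of this repository; the helper name find_missing_files is hypothetical, and it assumes the checkpoint files follow the model-<number>.<suffix> naming shown in the list.

```python
import re
from pathlib import Path

# Fixed file names taken from the list above.
FIXED_FILES = ["checkpoint", "counter", "encoder.json", "hparams.json", "vocab.bpe"]
# Suffixes of the checkpoint files named model-<number>.<suffix>.
CHECKPOINT_SUFFIXES = ["meta", "index", "data-00000-of-00001"]


def find_missing_files(model_dir):
    """Return the expected file names (or patterns) not found in model_dir.

    Hypothetical helper for illustration; it only checks file names,
    not whether the checkpoint contents are valid.
    """
    names = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    missing = [name for name in FIXED_FILES if name not in names]
    for suffix in CHECKPOINT_SUFFIXES:
        pattern = re.compile(r"model-\d+\." + re.escape(suffix))
        if not any(pattern.fullmatch(name) for name in names):
            missing.append("model-<number>." + suffix)
    return missing
```

Calling find_missing_files("/path/to/your/model/files") returns an empty list when every expected file is present, and otherwise lists what is missing.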

Clone the repo and, if needed, install the prerequisites, e.g. with pip install -r requirements.txt.

Run the script:

python convert_model.py /path/to/your/model/files

The converted PyTorch model will be saved in the ./converted_model directory.
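
To check that the run actually wrote something, you can list the output directory. This is only a sketch under the assumption that the script has finished and that ./converted_model is relative to the directory you ran it from. The helper name summarize_output is hypothetical, and since this README does not state which files convert_model.py writes, the sketch simply reports whatever it finds.

```python
from pathlib import Path


def summarize_output(output_dir="./converted_model"):
    """Return (file name, size in bytes) pairs for the files in output_dir.

    Hypothetical helper for illustration; it makes no assumption about
    which files the converter produces.
    """
    root = Path(output_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; did the conversion finish?")
    # Sort by name so the listing is stable between runs.
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())
```

An empty result or a FileNotFoundError suggests the conversion did not complete.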

Notes

Have fun, I probably won't be updating this one much.

License

This project is licensed under the MIT License.

Contribute

All code improvements are welcome. This should at least work on all TF1.x-based GPT-2 architecture models.

About

Flying from the mind of FlyingFathead
Digital ghost code by ChaosWhisperer

