Remove block size limitation #52

jonb377 · 2023-12-19T19:54:31Z

To train with sequence lengths longer than 2048, we need to bypass the block size limitation enforced in run_clm, which is likely a safeguard for finetuning.

Long-term, it would also make sense to retrain a tokenizer with a larger maximum sequence length, but this change will unblock early tests with the default tokenizer.
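For context, here is a minimal sketch of the kind of check being bypassed. It is not the actual diff: the function name `resolve_block_size`, the `enforce_tokenizer_limit` flag, and the 1024-token default are hypothetical, and the clamping behavior is an assumption based on the description above (a requested block size gets capped at the tokenizer's maximum length, 2048 here).

```python
from typing import Optional


def resolve_block_size(
    requested: Optional[int],
    tokenizer_max_length: int,
    enforce_tokenizer_limit: bool = True,
    default_cap: int = 1024,
) -> int:
    """Pick the block size used to chunk tokenized text for training.

    Hypothetical helper: it only illustrates the safeguard discussed in
    this PR and is not copied from run_clm.
    """
    if requested is None:
        # Nothing requested: fall back to a conservative default (assumed).
        return min(tokenizer_max_length, default_cap)
    if enforce_tokenizer_limit:
        # Safeguard: never exceed what the tokenizer claims to support.
        return min(requested, tokenizer_max_length)
    # Limitation removed: trust the requested value as given.
    return requested


# With the safeguard, a 4096-token request is clamped to the tokenizer limit.
clamped = resolve_block_size(4096, tokenizer_max_length=2048)

# Without it, the requested length is used unchanged.
unclamped = resolve_block_size(
    4096, tokenizer_max_length=2048, enforce_tokenizer_limit=False
)
```

If the safeguard is dropped like this, the model's own position limit still has to accommodate the longer sequences; the sketch does not check that.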

suexu1025

Thanks for the update!

jonb377 · 2023-12-19T21:38:57Z

Thanks @suexu1025! Are you OK to work off of this branch? I'd prefer not to merge this into the llama2-google-next-training branch since we can train a new tokenizer to work around the existing limitation.

Remove block size limitation

4b07f15

suexu1025 approved these changes Dec 19, 2023
