
Moved the creation of the sorted vocabulary into "build_tokenizer()" #328

Closed
wants to merge 5 commits from the patch-less-qsort-in-encode branch

Conversation

rdentato
Contributor

Currently, the vocabulary is sorted every time the encode() function is called.
For the current version this isn't an issue, since the program exits after each run, but if the generation code were called in a loop to build a "chat" version, this repeated qsort would be wasted work (and detrimental to latency).

I've moved it to the build_tokenizer() function, since sorting the vocabulary is logically part of setting up the tokenizer.
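For concreteness, a minimal sketch of the eager version: the sorted lookup index is built once in build_tokenizer() instead of on every encode() call. The Tokenizer/TokenIndex shapes and names below follow llama2.c's general style but are illustrative, not the PR's exact diff; file loading and the other tokenizer fields are omitted.

```c
// Sketch: sort the vocabulary once at tokenizer setup, not per encode() call.
#include <stdlib.h>
#include <string.h>

typedef struct { char *str; int id; } TokenIndex;

typedef struct {
    char **vocab;             // vocab_size strings, loaded from the tokenizer file
    int vocab_size;
    TokenIndex *sorted_vocab; // vocab sorted by string, for binary search in encode()
    // ... scores, byte pieces, etc. omitted in this sketch ...
} Tokenizer;

static int compare_tokens(const void *a, const void *b) {
    return strcmp(((const TokenIndex*)a)->str, ((const TokenIndex*)b)->str);
}

void build_tokenizer(Tokenizer *t /*, ... tokenizer file path, vocab_size ... */) {
    // ... read t->vocab from the tokenizer file as before, then
    // build the sorted index once, up front:
    t->sorted_vocab = malloc(t->vocab_size * sizeof(TokenIndex));
    for (int i = 0; i < t->vocab_size; i++) {
        t->sorted_vocab[i].str = t->vocab[i];
        t->sorted_vocab[i].id = i;
    }
    qsort(t->sorted_vocab, t->vocab_size, sizeof(TokenIndex), compare_tokens);
}
```

encode() can then binary-search this index directly, paying the O(n log n) sort once per process instead of once per call.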

@karpathy
Owner

Slight downside here is that sorting the vocab probably creates latency, and if the user isn't going to prompt the model we are paying the price needlessly. Maybe we can allocate it lazily?

@rdentato
Contributor Author

Yes, it can be done in "encode()" so that the sort only happens when it's actually needed. I'll change it.
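In that lazy version, build_tokenizer() leaves the index unset and encode() builds it on first use behind a NULL check. A sketch, reusing the same illustrative struct shapes as above:

```c
// In build_tokenizer():
//   t->sorted_vocab = NULL;  // deferred; built lazily on first encode() call

void encode(Tokenizer *t, const char *text /*, ... output args ... */) {
    if (t->sorted_vocab == NULL) {
        // first call only: allocate and sort the lookup index
        t->sorted_vocab = malloc(t->vocab_size * sizeof(TokenIndex));
        for (int i = 0; i < t->vocab_size; i++) {
            t->sorted_vocab[i].str = t->vocab[i];
            t->sorted_vocab[i].id = i;
        }
        qsort(t->sorted_vocab, t->vocab_size, sizeof(TokenIndex), compare_tokens);
    }
    // ... rest of encode() works off t->sorted_vocab as before ...
    (void)text; // placeholder; the real function tokenizes `text` here
}
```

A run that never prompts the model never pays for the sort; a chat-style loop pays for it exactly once.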

@rdentato
Contributor Author

Done. Frankly, I don't think many people will use llama2.c without a prompt, but I may be wrong.

@karpathy
Owner

Just pushed a very similar commit ty

@karpathy karpathy closed this Aug 22, 2023
@rdentato rdentato deleted the patch-less-qsort-in-encode branch August 22, 2023 07:01