Specify Max Context Length for Llama 3 Queries in Payload Configuration #7

Open
gileneusz opened this issue Jun 23, 2024 · 1 comment

Comments

@gileneusz

gileneusz commented Jun 23, 2024

Note: this applies only to Ollama queries.

When using the default context length settings, each query to the Llama model is limited to roughly 2,000 tokens (Ollama's default num_ctx of 2,048). To make full use of the Llama 3 model, especially for more complex tasks that require a larger context, the maximum context length needs to be set explicitly.

Below is an example payload configuration that sets the maximum context length to 8,192 tokens, which is not currently the default behavior:

payload = {
    "model": self.model,   # Ollama model name from the calling class
    "prompt": user,        # user prompt
    "system": system,      # system prompt
    "stream": False,
    "options": {
        "temperature": 0,  # Ollama reads sampling parameters from "options", not the top level
        "num_ctx": 8192    # raise the context window above the 2,048-token default
    }
}
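
For reference, a payload like this can be POSTed to a locally running Ollama server. The sketch below is only illustrative; the endpoint URL, model name, and prompts are assumptions rather than values taken from this repository:

import requests

# Minimal sketch: send the request to a local Ollama server (illustrative values only).
payload = {
    "model": "llama3",
    "prompt": "Summarise the quarterly report in three bullet points.",
    "system": "You are a concise analyst.",
    "stream": False,
    "options": {"temperature": 0, "num_ctx": 8192},
}

response = requests.post("http://localhost:11434/api/generate", json=payload, timeout=300)
response.raise_for_status()
print(response.json()["response"])  # with "stream": False, the reply arrives as a single JSON object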

Expected Behavior:
Allow the context length to be configurable up to the maximum supported by the model directly through the payload options.

Actual Behavior:
The context length defaults to around 2,000 tokens, which may not suffice for more in-depth analyses or larger data contexts required by users.

Suggested Fix:
Include an option within the model configuration to easily specify and adjust the maximum context length according to user needs or specific tasks.
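
A hypothetical shape for such an option, assuming an environment-variable override (the variable name and helper function are illustrative, not existing code in this repository):

import os

# Hypothetical sketch: let users override the context window, falling back to
# Ollama's 2,048-token default when nothing is configured.
def ollama_options(temperature: float = 0.0) -> dict:
    num_ctx = int(os.getenv("OLLAMA_NUM_CTX", "2048"))
    return {"temperature": temperature, "num_ctx": num_ctx}

The returned dict would then be passed as the "options" field of the generate payload shown above.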

@john-adeojo
Owner

Thanks for raising this, I'll look to add it if I have time.
