I modified the `config.toml` file, setting `max_input_length` to 10240 and `max_decoding_tokens` to 512 (see below), but GPU memory usage remained unchanged and the effective context length appears unaffected.
My objective is to extend the model's context window to enhance the quality of generated text.
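For reference, this is a sketch of the edited values in `config.toml`. The key names are taken from the settings above; where these keys live (top level vs. a named table) depends on the serving framework's config schema, so treat the layout as an assumption:

```toml
# Hypothetical placement — the actual section/table these keys belong to
# depends on the framework's config schema.
max_input_length = 10240     # maximum number of input (prompt) tokens
max_decoding_tokens = 512    # maximum number of tokens generated per request
```

Note that in many serving stacks, context-length settings like these are baked in at engine build or model load time, so editing the file alone may not change runtime GPU memory allocation until the engine is rebuilt or the server restarted.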