I modified the `config.toml` file, setting `max_input_length` to 10240 and `max_decoding_tokens` to 512 (see below), but GPU memory usage remained unchanged and the effective context length appears unaffected.
My objective is to extend the model's context window to enhance the quality of generated text.
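For reference, this is a sketch of the edited values in `config.toml`. The key names are taken from the settings above; where these keys live (top level vs. a named table) depends on the serving framework's config schema, so treat the layout as an assumption:

```toml
# Hypothetical placement — the actual section/table these keys belong to
# depends on the framework's config schema.
max_input_length = 10240     # maximum number of input (prompt) tokens
max_decoding_tokens = 512    # maximum number of tokens generated per request
```

Note that in many serving stacks, context-length settings like these are baked in at engine build or model load time, so editing the file alone may not change runtime GPU memory allocation until the engine is rebuilt or the server restarted.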