Replies: 1 comment
This will be implemented eventually: #64
I noticed that applications based on llama use different long prompts to pre-condition the model. With the 7B and 13B model weights, the model usually takes a while to read the prompt before it can process user input, so there is a waiting time before the first response. For example, when I use chat-13B.sh, it takes about 1 minute before the first response to my input, yet response times after that are fairly fast.
If I understand correctly, the long prompt puts the model into a certain internal state, and this internal state makes the model process user input the way the user expects. Once the model reaches this internal state, most of the original prompt does not need to be read and processed again.
Is it theoretically possible to store this internal state in a file on disk, then read it back in a new session instead of processing the initial prompt text again? This could save a lot of the initial waiting time, as well as the energy used to recompute the internal state.
I know there might be a lot of engineering implications, but I just want to know whether this is a feasible idea, or whether there are other things blocking it.
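To make the idea concrete, here is a minimal sketch of what saving and restoring that state could look like. It assumes the library exposes the context's internal state (largely the KV cache built up while evaluating the prompt) as an opaque byte buffer through three functions, `llama_get_state_size`, `llama_copy_state_data`, and `llama_set_state_data`; these names are illustrative assumptions about what such an API might look like, not a confirmed part of llama.h:

```cpp
// Sketch: persist a llama context's internal state (mostly the KV cache)
// after the initial prompt has been evaluated once, then restore it in a
// new session so the long prompt never has to be re-processed.
//
// NOTE: the llama_* state functions below are ASSUMED for illustration;
// check the current llama.h for what the library actually exposes.

#include <cstdint>
#include <cstdio>
#include <vector>

struct llama_context; // opaque handle, as in llama.h

// Assumed API surface (illustrative only):
size_t llama_get_state_size(const llama_context * ctx);
size_t llama_copy_state_data(llama_context * ctx, uint8_t * dst);
size_t llama_set_state_data(llama_context * ctx, const uint8_t * src);

// After evaluating the long pre-conditioning prompt once, dump the state.
bool save_state(llama_context * ctx, const char * path) {
    std::vector<uint8_t> buf(llama_get_state_size(ctx));
    const size_t n = llama_copy_state_data(ctx, buf.data());

    FILE * f = std::fopen(path, "wb");
    if (!f) return false;
    const bool ok = std::fwrite(buf.data(), 1, n, f) == n;
    std::fclose(f);
    return ok;
}

// In a new session, load the blob back instead of re-reading the prompt.
bool load_state(llama_context * ctx, const char * path) {
    FILE * f = std::fopen(path, "rb");
    if (!f) return false;
    std::fseek(f, 0, SEEK_END);
    const long n = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);

    std::vector<uint8_t> buf((size_t) n);
    const bool ok = std::fread(buf.data(), 1, (size_t) n, f) == (size_t) n;
    std::fclose(f);

    if (ok) llama_set_state_data(ctx, buf.data());
    return ok;
}
```

One practical caveat: the saved session would presumably also need to record the prompt tokens themselves (so a changed prompt can be detected and re-evaluated), and the KV cache is large, roughly proportional to layers × context length × embedding size, so the state file could run to hundreds of megabytes. It trades disk space and I/O for the much more expensive recomputation.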