I found that the code uses `if hidden_states.shape[1] != 1:` to detect the prefill stage.
I think this criterion can sometimes give the wrong answer. In my experiments, `hidden_states.shape[1]` took the following values:
- `hidden_states.shape[1] = n`: prefill
- `hidden_states.shape[1] = 1`: when generating the 2nd token
- `hidden_states.shape[1] = 2`: when generating the 3rd token
- `hidden_states.shape[1] = 3`: when generating the 4th token
...
So I have two questions:
- Can `hidden_states.shape[1]` consistently equal 1 (instead of 1, 2, 3, ...) after prefill if I use particular generation settings?
- Would `position_ids[0,0] == 0` be a more robust criterion than `hidden_states.shape[1] != 1` for detecting prefill?
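
To make the comparison concrete, here is a minimal sketch in plain Python (no model involved) that replays the sequence lengths I observed and applies both criteria. The helper names `is_prefill_by_length`, `is_prefill_by_position`, and `simulate` are hypothetical, and the way `position_ids` is built here (continuing from the number of tokens already processed) is my assumption, not something I verified in the code.

```python
def is_prefill_by_length(seq_len: int) -> bool:
    """Current criterion: any chunk whose length is not 1 counts as prefill."""
    return seq_len != 1


def is_prefill_by_position(position_ids: list) -> bool:
    """Proposed criterion: the chunk starts at absolute position 0."""
    return position_ids[0][0] == 0


def simulate(prompt_len: int, decode_lens: list):
    """Yield (stage, seq_len, position_ids) for one prefill and several decode steps.

    Assumption: position ids continue from the number of tokens already
    processed, so only the prefill chunk starts at position 0.
    """
    steps = [("prefill", prompt_len)] + [("decode", n) for n in decode_lens]
    offset = 0
    for stage, seq_len in steps:
        # Batch of size 1, stored as a nested list instead of a tensor.
        position_ids = [list(range(offset, offset + seq_len))]
        yield stage, seq_len, position_ids
        offset += seq_len


# Replay the lengths from my experiment: n = 5 for prefill, then 1, 2, 3.
results = []
for stage, seq_len, position_ids in simulate(prompt_len=5, decode_lens=[1, 2, 3]):
    results.append(
        (stage, is_prefill_by_length(seq_len), is_prefill_by_position(position_ids))
    )

for row in results:
    print(row)
```

Under these assumptions, the length check labels the decode steps with lengths 2 and 3 as prefill, while the position check only fires on the first chunk. The position check has its own caveat: it relies on the first position id being 0 during prefill, which may not hold for left-padded batches.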