vllm under any version mismatch with current env and if you separate eval and train. I still need a version of vllm
in a separate env. Using the command below, I tried different engines (4o and 4-turbo) and got some numbers that do not make sense. Have you ever tried different annotators? With 4o, I get a result where DPO > SimPO, while 4-turbo gives the opposite.