ov-genai version: 25.1-25.4.2
Model: phi-4-mini
PC: PTL new platform device
Reproduction Steps:
1. Launch the application and run an initial inference.
2. Open Task Manager → Performance → GPU and monitor GPU memory usage.
3. Leave the application idle for 2 to 20 minutes while watching GPU memory. Once GPU memory usage has dropped significantly, run inference again.
The issue is intermittent under these conditions: it appears only occasionally in the chat sample, but reproduces consistently when using WinUI3.
This issue can only be reproduced on PTL devices, not on LNL devices.
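For anyone who wants to script the steps above rather than wait by hand, here is a minimal sketch in Python. It is not the application or chat sample used in this report. The helper names `reproduce` and `run_on_gpu`, the prompt, the token limit, and the model directory are all assumptions for illustration; only `openvino_genai.LLMPipeline` and its `generate` method come from the library. GPU memory still has to be watched separately in Task Manager.

```python
import time
from typing import Callable, List


def reproduce(infer: Callable[[str], str],
              idle_seconds: float,
              prompt: str = "Hello",
              sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Run one inference, stay idle, then run a second inference."""
    outputs = [infer(prompt)]        # initial inference
    sleep(idle_seconds)              # idle period; watch GPU memory in Task Manager
    outputs.append(infer(prompt))    # second inference after the idle period
    return outputs


def run_on_gpu(model_dir: str, idle_seconds: float) -> List[str]:
    """Hypothetical wrapper: drive the steps with an OpenVINO GenAI pipeline."""
    # Imported here so the helper above can be used without the library installed.
    import openvino_genai as ov_genai

    pipe = ov_genai.LLMPipeline(model_dir, "GPU")
    return reproduce(
        lambda p: str(pipe.generate(p, max_new_tokens=32)),
        idle_seconds,
    )
```

A call such as `run_on_gpu("phi-4-mini-ov", 10 * 60)` (the directory name is a placeholder) would idle for 10 minutes between the two inferences; values between 2 and 20 minutes match the range reported above.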
Observation:
LNL devices do not appear to release GPU memory automatically while idle, whereas PTL devices do release it automatically.