Hi.
When running infer_llm_base.py in InternLM-XComposer/InternLM-XComposer-2.5-OmniLive/example, my 40 GB A100 ran out of memory.
I'm using the model locally (downloaded from Hugging Face).
Is this a problem with the model or with the code?
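For context, here is a minimal sketch of how I would expect to load the model in half precision to reduce memory use. This assumes the model exposes the standard `transformers` `AutoModel` interface via `trust_remote_code=True`; the local path below is a placeholder for my download, not the actual path:

```python
# Minimal loading sketch -- assumes the standard transformers
# AutoModel interface with trust_remote_code.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = './internlm-xcomposer2d5-ol-7b'  # placeholder: local HF download

# Load weights in bfloat16 instead of the fp32 default; a 7B model in
# bf16 needs roughly 14-16 GB of VRAM, which should fit on a 40 GB A100.
model = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).cuda().eval()

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```

Is something like this the intended usage, or does infer_llm_base.py already handle the dtype and I'm hitting OOM for another reason?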
PS: Please provide a more detailed guide for running inference with InternLM-XComposer-2.5-OmniLive. Thanks!