Question: How are the shared LLM embeddings used in stage 1 for alignment?

Since you only use the CTC loss, are there any details on how the audio features are converted into the LLM's representation space with a Transformer layer?
And how are the existing LLM embeddings used in this step?

Thank you!