Describe the workflow you want to enable
Hi! I noticed that meta-batch-size must be 1 in the finetuning example, and the batch-size dimension stays at 1 throughout the PyTorch model. Can we make real "batch" predictions across different training datasets?
Describe your proposed solution
The model architecture seems to support this (it already has a batch-size dimension), so I'm not sure why this isn't exposed yet.
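
To make the request concrete, here is a rough sketch of the collation step I have in mind: several training datasets with different row counts are padded to a common length and stacked along a leading batch dimension, with a mask marking which rows are real. The function name `collate_datasets` and the padding/mask scheme are my own assumptions for illustration, not part of the library's API, and I have not checked whether the model's attention layers could honor such a mask.

```python
from typing import List, Tuple

Row = List[float]


def collate_datasets(
    datasets: List[List[Row]], pad_value: float = 0.0
) -> Tuple[List[List[Row]], List[List[bool]]]:
    """Pad datasets to the same number of rows and stack them into one batch.

    Hypothetical helper, for illustration only.
    Returns (batch, mask):
      batch has shape [n_datasets, max_rows, n_features]
      mask[i][j] is True for a real row and False for a padded row.
    """
    if not datasets or any(len(d) == 0 for d in datasets):
        raise ValueError("need at least one dataset, each with at least one row")

    # This sketch assumes every dataset has the same number of features.
    n_features = len(datasets[0][0])
    for d in datasets:
        for row in d:
            if len(row) != n_features:
                raise ValueError("all rows must have the same number of features")

    max_rows = max(len(d) for d in datasets)
    batch: List[List[Row]] = []
    mask: List[List[bool]] = []
    for d in datasets:
        n_pad = max_rows - len(d)
        # Copy the real rows, then append padding rows filled with pad_value.
        padded = [list(row) for row in d]
        padded += [[pad_value] * n_features for _ in range(n_pad)]
        batch.append(padded)
        mask.append([True] * len(d) + [False] * n_pad)
    return batch, mask


# Two training datasets of different sizes become one batch of size 2.
dataset_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
dataset_b = [[7.0, 8.0]]
batch, mask = collate_datasets([dataset_a, dataset_b])
```

Datasets with different feature counts would need the same treatment along the feature axis, which this sketch leaves out.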
Describe alternatives you've considered, if relevant
No response
Additional context
No response
Impact
None