### Feature request

Paper link: https://aclanthology.org/2024.naacl-long.174.pdf

Code link: https://github.com/yifanycc/loretta

**Motivation**

Tensor-trains are a promising low-rank structure that is heavily applied in parameter-efficient machine learning, especially for LLMs. LoRETTA takes LoRA a step further by re-parameterizing the updates to the pretrained weights as a set of two tensor-trains.

<img width="614" height="186" alt="Image" src="https://github.com/user-attachments/assets/e48c74d9-5d26-4634-bf23-290c45414512" />

<img width="1213" height="425" alt="Image" src="https://github.com/user-attachments/assets/24db0aa1-eafd-4b59-a6d9-a5e5bc0854da" />

**Performance**

The paper claims that LoRETTA leads to a significant reduction in trainable parameters compared with LoRA.

<img width="620" height="259" alt="Image" src="https://github.com/user-attachments/assets/116b1fd4-190f-4219-a15f-38a74303b569" />

The paper also claims that LoRETTA achieves results comparable to other approaches on BERT-family models.

<img width="1279" height="661" alt="Image" src="https://github.com/user-attachments/assets/7bdaefb5-dcb8-48e5-92ac-1bf1703d1d82" />

**Note**

I have contacted the authors about this feature request: https://github.com/yifanycc/loretta/issues/7

### Your contribution

If that is OK, I would love to start a PR for this.
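
To give a rough sense of the parameter savings mentioned in the motivation, here is a minimal sketch that compares the trainable parameter count of a plain LoRA update with that of two tensor-trains. It assumes each of the two LoRA factors is stored as one tensor-train with equal internal ranks. The hidden size, LoRA rank, mode sizes, and TT rank are hypothetical values chosen for illustration; they are not taken from the paper or the LoRETTA code, and `lora_params` and `tt_params` are illustrative helper names.

```python
from math import prod


def lora_params(d_in, d_out, rank):
    """Parameters of a LoRA update B @ A with A: (rank, d_in) and B: (d_out, rank)."""
    return rank * (d_in + d_out)


def tt_params(mode_sizes, tt_rank):
    """Parameters of a tensor-train whose k-th core has shape (r_k, n_k, r_{k+1}).

    Boundary ranks are 1; all internal ranks are assumed equal to tt_rank.
    """
    ranks = [1] + [tt_rank] * (len(mode_sizes) - 1) + [1]
    return sum(ranks[k] * n * ranks[k + 1] for k, n in enumerate(mode_sizes))


# Hypothetical setting: a square 768 x 768 weight with LoRA rank 8.
d_in = d_out = 768
lora_rank = 8

# Hypothetical reshaping of one 768 x 8 factor (6144 entries) into 4 modes.
modes = [8, 8, 12, 8]
assert prod(modes) == d_in * lora_rank

tt_rank = 5  # assumed TT rank, for illustration only

lora_count = lora_params(d_in, d_out, lora_rank)
tt_count = 2 * tt_params(modes, tt_rank)  # one tensor-train per LoRA factor

print("LoRA parameters:", lora_count)       # 12288
print("Two tensor-trains:", tt_count)       # 1160
```

Under these assumed shapes the two tensor-trains need about a tenth of the LoRA parameters; the actual ratio depends on the factorization and ranks used in the paper.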