Support for models larger than 2GB #225

robertknight · 2024-06-01T18:08:43Z

RTen is currently limited to models that are 2GB or less in size. This limitation is inherited from the FlatBuffers format, which uses 32-bit offsets internally (see google/flatbuffers#7537). Since the only dtype currently supported for weights is f32, at 4 bytes per weight, this means models are limited to ~500M parameters.
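For reference, the parameter estimate can be reproduced with a short calculation. This is only an illustrative sketch: `max_params` and `TWO_GIB` are hypothetical names and not part of RTen, and the limit is read here as 2 GiB (2^31 bytes), ignoring the space taken by the rest of the model file.

```rust
/// Assumed size limit for this sketch: 2 GiB (2^31 bytes).
pub const TWO_GIB: u64 = 2 * 1024 * 1024 * 1024;

/// Upper bound on the number of weights that fit in `max_bytes`
/// when every weight occupies `bytes_per_param` bytes.
/// (Hypothetical helper, for illustration only.)
pub fn max_params(max_bytes: u64, bytes_per_param: u64) -> u64 {
    max_bytes / bytes_per_param
}

/// Size in bytes of a single f32 weight (4).
pub fn f32_size() -> u64 {
    std::mem::size_of::<f32>() as u64
}
```

Calling `max_params(TWO_GIB, f32_size())` gives 536,870,912, which is where the ~500M figure comes from. A 2-byte dtype would double that bound but would not remove the 2GB file limit itself.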

This was fine for the original use case of the engine, but limits its usefulness as a general ONNX runtime for modern models. When loading ONNX models directly (#141), this issue is solved in the spec by allowing models to reference external data files. For the .rten format, a different solution will be needed.
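To illustrate the general idea behind external data, here is a minimal sketch of resolving a tensor reference against a separate byte buffer using 64-bit offsets. It is not the ONNX API and not the design chosen for .rten; `ExternalRef` and `resolve` are hypothetical names, and the only assumption is that a tensor records an offset and length into data stored outside the main model buffer.

```rust
use std::convert::TryFrom;

/// Hypothetical reference to tensor bytes stored outside the main model
/// buffer. 64-bit fields avoid the 32-bit offset limit described above.
#[derive(Debug, Clone, Copy)]
pub struct ExternalRef {
    pub offset: u64,
    pub length: u64,
}

/// Return the bytes that `r` points to within `data`, or `None` if the
/// range is out of bounds or does not fit in `usize` on this platform.
pub fn resolve(data: &[u8], r: ExternalRef) -> Option<&[u8]> {
    let start = usize::try_from(r.offset).ok()?;
    let len = usize::try_from(r.length).ok()?;
    // checked_add guards against overflow for very large offsets.
    let end = start.checked_add(len)?;
    data.get(start..end)
}
```

In practice `data` could be a memory-mapped file, so large weights never need to be copied into the buffer that holds the graph structure.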

robertknight mentioned this issue Jul 1, 2024

Introduce version 2 .rten model format, with support for larger (> 2GB) models #260

Merged

robertknight closed this as completed in #260 Jul 3, 2024

robertknight mentioned this issue Jul 4, 2024

Make the V2 model format the default #267

Closed
