Support for models larger than 2GB #225

Closed
robertknight opened this issue Jun 1, 2024 · 0 comments · Fixed by #260

robertknight commented Jun 1, 2024

RTen is currently limited to models that are 2GB or less in size. This limitation is inherited from the FlatBuffers format, which uses 32-bit offsets internally (see google/flatbuffers#7537). Since f32 is the only dtype currently supported for weights, models are limited to ~500M parameters.
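The ~500M figure follows from dividing the buffer limit by the size of one weight. A minimal sketch of that arithmetic, assuming the limit is taken as 2^31 bytes and ignoring the space used by graph metadata (`MAX_BUFFER_BYTES` and `max_params` are illustrative names, not part of RTen):

```rust
/// Assumed size limit of a single model buffer, in bytes (2^31 = 2 GiB).
/// This is an assumption for illustration; the exact limit may differ slightly.
const MAX_BUFFER_BYTES: u64 = 1 << 31;

/// Upper bound on the number of weights that fit in one buffer when every
/// weight occupies `bytes_per_param` bytes. Graph metadata is ignored, so
/// the real limit is somewhat lower.
fn max_params(bytes_per_param: u64) -> u64 {
    MAX_BUFFER_BYTES / bytes_per_param
}
```

For f32 weights, `max_params(std::mem::size_of::<f32>() as u64)` gives 536,870,912, which is where the ~500M estimate comes from. A narrower dtype would raise the parameter ceiling proportionally without lifting the 2GB limit itself.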

This was fine for the original use case of the engine, but limits its usefulness as a general ONNX runtime for modern models. When loading ONNX models directly (#141), this issue is solved in the spec by allowing models to reference external data files. For the .rten format, a different solution will be needed.
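To illustrate the external-data idea, here is a rough sketch of reading one tensor's weights from a separate file, given a path, byte offset and byte length (the kind of information an ONNX external data reference carries). The `ExternalData` struct and `read_f32_tensor` function are hypothetical names for illustration only. This is not RTen's API and not necessarily how the .rten format addresses the problem. It assumes the data is stored as little-endian f32 values.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::{Path, PathBuf};

/// Hypothetical reference to tensor data stored outside the model file.
struct ExternalData {
    /// Path of the data file, relative to the model's directory.
    location: PathBuf,
    /// Byte offset of the tensor data within the data file.
    offset: u64,
    /// Length of the tensor data in bytes.
    length: u64,
}

/// Read an f32 tensor from an external data file.
/// Assumes the bytes are little-endian f32 values. Any trailing bytes that
/// do not form a complete f32 are ignored.
fn read_f32_tensor(model_dir: &Path, ext: &ExternalData) -> io::Result<Vec<f32>> {
    let mut file = File::open(model_dir.join(&ext.location))?;
    // Offsets are 64-bit here, so the data file itself can exceed 2GB.
    file.seek(SeekFrom::Start(ext.offset))?;
    let mut bytes = vec![0u8; ext.length as usize];
    file.read_exact(&mut bytes)?;
    Ok(bytes
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect())
}
```

Because the model file then only stores small references rather than the weights themselves, it can stay well under the 32-bit offset limit regardless of how large the weights are.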
