More compact representation of compiled programs #249

Merged: 6 commits, Jul 30, 2024

Contributor

@ppedrot ppedrot commented Jul 25, 2024

Follow-up to #248, made of two parts.

  • Make the representation of the pruned symbol more compact by dropping useless data and representing the map as an array.
  • Reduce the size of the Flat.program type by hoisting out data that is actually not semantically part of this record.
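
To illustrate the first point, here is a minimal sketch of replacing a tree-based map with a sorted array of key/value pairs looked up by binary search. This is not the code from this PR: the names `compact`, `of_list` and `find_opt` are hypothetical, and the actual key and payload types in the compiler may differ. The general idea is that a flat array typically serializes to fewer bytes than a balanced tree, which carries per-node bookkeeping.

```ocaml
(* Hypothetical compact map: int keys stored in sorted order so that lookups
   can use binary search. Not the project's actual representation. *)
type 'a compact = (int * 'a) array

(* Build from an association list; assumes the keys are distinct. *)
let of_list (l : (int * 'a) list) : 'a compact =
  let a = Array.of_list l in
  Array.sort (fun (k1, _) (k2, _) -> compare k1 k2) a;
  a

(* Binary search for key [k]. *)
let find_opt (k : int) (a : 'a compact) : 'a option =
  let rec go lo hi =
    (* Invariant: if [k] is present, its index lies in [lo, hi). *)
    if lo >= hi then None
    else
      let mid = lo + (hi - lo) / 2 in
      let (k', v) = a.(mid) in
      if k' = k then Some v
      else if k' < k then go (mid + 1) hi
      else go lo mid
  in
  go 0 (Array.length a)
```

Lookups stay logarithmic, as with a balanced tree; the trade-off is that insertion is no longer cheap, which is acceptable for a table that is built once and then only read.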

This reduces the size of order.vo from 11087515 bytes to 10349640 bytes, i.e. another compounded reduction of roughly 6.7%.

EDIT: the last commit compacts this file even more, to 10062305 bytes.
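
For reference, the relative savings can be recomputed from the byte counts quoted above. The helper name `reduction_pct` is only illustrative; the three sizes are the ones given in this description.

```ocaml
(* Relative size reduction, in percent, going from [before] to [after] bytes. *)
let reduction_pct ~before ~after =
  100. *. float_of_int (before - after) /. float_of_int before

let () =
  (* Sizes of order.vo quoted in the description. *)
  let original = 11087515 in
  let after_pr = 10349640 in
  let after_last_commit = 10062305 in
  Printf.printf "initial commits: %.2f%%\n"
    (reduction_pct ~before:original ~after:after_pr);
  Printf.printf "with last commit: %.2f%%\n"
    (reduction_pct ~before:original ~after:after_last_commit)
```

This gives roughly 6.7% for the initial commits and roughly 9.2% once the last commit is included, both measured against the 11087515-byte starting point.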

src/compiler.ml Outdated
Comment on lines 109 to 110
| ENeg of D.constant * string
| EPos of D.constant * D.term
Contributor

@gares gares Jul 25, 2024

ENeg should be something like GlobalSymbol (* < 0 *)
EPos should be BoundVariable (* >= 0 *)
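
A sketch of what the suggested naming could look like, with the sign convention enforced at construction time. The types `constant` and `term` are stand-ins for `D.constant` and `D.term`, whose real definitions are not shown in this thread, and the helper functions are hypothetical. The names finally adopted in the PR may differ.

```ocaml
(* Hypothetical stand-ins for D.constant and D.term. *)
type constant = int
type term = Term of string (* placeholder payload *)

(* Constructor names follow the review suggestion; the payloads mirror the
   two lines under review. *)
type entry =
  | GlobalSymbol of constant * string (* constant < 0 *)
  | BoundVariable of constant * term  (* constant >= 0 *)

(* Smart constructors that reject a constant of the wrong sign. *)
let global_symbol (c : constant) (name : string) : entry =
  if c >= 0 then invalid_arg "global_symbol: expected a negative constant";
  GlobalSymbol (c, name)

let bound_variable (c : constant) (t : term) : entry =
  if c < 0 then invalid_arg "bound_variable: expected a non-negative constant";
  BoundVariable (c, t)

let is_global = function
  | GlobalSymbol _ -> true
  | BoundVariable _ -> false
```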

Contributor

I'm a bit surprised it is not 2 arrays. IIRC the bound variable one is small and the other is contiguous, so using the absolute value as the array position seems more natural/understandable and would remove half of the data in the buckets.

Contributor

I mean, D.constant is an int: -1 .. -maxg for globals and 0 .. maxl for locals, where maxl is typically no more than 3, while maxg can be large (thousands).
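
The two-array layout described here could be sketched as follows. This is an illustration of the suggestion only, not code from the PR: the record `table`, the variant `found` and the function `lookup` are hypothetical names, and the payload types are left polymorphic. Negative constants index the globals array through their absolute value, non-negative ones index the locals array directly, so the key no longer needs to be stored next to the data.

```ocaml
(* Hypothetical table with one array per sign of the constant. *)
type ('g, 'l) table = {
  globals : 'g option array; (* slot i holds the data for constant -(i + 1) *)
  locals : 'l option array;  (* slot i holds the data for constant i *)
}

type ('g, 'l) found = Global of 'g | Local of 'l

(* Look up a constant; the sign selects the array, the magnitude the slot. *)
let lookup (t : ('g, 'l) table) (c : int) : ('g, 'l) found option =
  if c < 0 then begin
    let i = -c - 1 in
    if i < Array.length t.globals then
      Option.map (fun g -> Global g) t.globals.(i)
    else None
  end
  else if c < Array.length t.locals then
    Option.map (fun l -> Local l) t.locals.(c)
  else None

(* Example: globals -1 and -3 are present, -2 is a hole; locals 0 and 1. *)
let example =
  { globals = [| Some "foo"; None; Some "bar" |]; locals = [| Some 10; Some 20 |] }
```

Empty slots model keys that are absent, for instance symbols removed by pruning. How many such holes occur in practice is not established in this thread, and it determines whether direct indexing actually saves space.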

Contributor Author

I went for the most straightforward changes; we can try to go even further. I don't think that splitting the array will matter much, though.

> the other is contiguous

Is it still the case after pruning? The set of live symbols could also contain globals IIUC.

Contributor Author

(I did change the constructor names for clarity.)

@ppedrot ppedrot force-pushed the even-leaner-symbol-table branch from f075a36 to f6e85a7 Compare July 25, 2024 13:35
@gares gares merged commit d7e778b into LPCIC:master Jul 30, 2024
8 of 9 checks passed
@ppedrot ppedrot deleted the even-leaner-symbol-table branch July 30, 2024 17:34