[VM] Compress contracts before storing them and decompress on load, and bill the user only for the number of bytes in the compressed representation #2926
Comments
Yes, probably. We can have the DB store a compressed representation of the contract, and only bill the user for loading/storing the compressed representation. No need to limit ourselves to a minified representation -- we can lz4 it for example. |
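A minimal sketch of the idea in the comment above: compress on store, decompress on load, and bill only the compressed byte count. Everything here is an assumption for illustration. `ContractStore` is a hypothetical in-memory stand-in for the contract DB, and `zlib` from the Python standard library is swapped in for lz4, which was only named as one possible choice.

```python
import zlib
from typing import Dict, Tuple


class ContractStore:
    """Hypothetical in-memory stand-in for the contract DB (not the real node code)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, contract_id: str, source: str) -> int:
        """Compress the contract source, keep only the compressed blob,
        and return the number of bytes the user would be billed for."""
        # zlib is a stand-in for lz4 here; a fixed level keeps output size reproducible
        # for a given library version.
        blob = zlib.compress(source.encode("utf-8"), 9)
        self._blobs[contract_id] = blob
        return len(blob)

    def load(self, contract_id: str) -> Tuple[str, int]:
        """Decompress the stored blob and return (source, billed_bytes),
        where billed_bytes is the size of the compressed representation."""
        blob = self._blobs[contract_id]
        return zlib.decompress(blob).decode("utf-8"), len(blob)


# Usage: a comment-heavy contract costs far fewer billed bytes than its raw size.
store = ContractStore()
source = ";; explains the function below\n" * 50 + "(define-public (f) (ok u1))\n"
billed_on_store = store.store("example-contract", source)
loaded, billed_on_load = store.load("example-contract")
```

The round trip returns the original text unchanged, so readability is preserved while the billed size tracks the compressed blob.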
Great! With such a change we could push stacksgov/sips#32 forward without worrying much about how comments affect contract size and execution costs. |
I updated the issue name to reflect the change that will be carried out here. It's pretty straightforward:
|
I think this issue sounds interesting, if only to learn more of the codebase (I also like saving storage space). Is this still relevant? If so, I'd be happy to take an (educational) stab at it... @jcnelson A few random thoughts:
|
It's definitely relevant! Compressing the Clarity contract text could save ~50% of the bytes loaded. In fact, changing the on-disk representation of the Clarity code and analysis metadata could be done at any time, without a consensus-breaking change or a SIP. However, in order to pass the savings on to users (e.g. by changing the amount of block space a contract requires), we'd need to calculate a new cost function for contract-loads. This could be done with the voting procedure described in SIP-006, or it could be done in the next hard fork -- whichever happens sooner.
I'm not sure minification gets you anything special here? If we store the code compressed, we'd get better storage savings than minification. Also, minification won't improve execution speed nearly as well as something like byte-compiling the Clarity code. So if either of these are goals -- reduced storage and execution time -- we'd probably want to explore other tactics besides minification.
Yes, I think this could be done. Again, changing the associated cost functions will be an involved process, but the node implementation could be changed to do this without breaking anything.
No, this is neither possible nor desirable. Contracts are part of the blocks, and all nodes must store all blocks in order to ensure that the system remains resilient to unpredictable node churn and network partitions.
Yeah, we'd want to do this before picking a default compression algorithm. However, the choice of compression algorithm is only necessary once the cost of loading the contract from source is reduced to the cost of loading the compressed representation (i.e. by changing the cost function). The compression algorithm implementation would need to be deterministic and would almost certainly need to be vendored into the codebase to ensure that all nodes compress contracts to the exact same number of bytes.
This is kinda-sorta done with the analysis DB, but as you can see from the code comments, it's very coarse-grained at this time. |
I had written this before I had a better understanding of how things worked - I had thought the contracts were loaded as plain-text and parsed again when pulled out, but now I see that's not the case :) So this point can be ignored.
My quick local (and unscientific) tests on both lz4 and zstd, looking only at compression efficiency, were:
|
This is something we'd like to do in the near future. @cylewitruk has graciously taken on the implementation effort. |
Assigning to @obycode for now. Please feel free to re-assign. |
Is your feature request related to a problem? Please describe.
Contracts with comments (especially very verbose, descriptive ones) are more expensive to use than contracts without any comments. The difference in execution cost is easily explained by the difference in size.
Over time we will see more and more complex contracts, and to keep them readable and reasonably understandable to ordinary users, they will need more and more comments.
If developers have to choose between contract readability and lower execution costs, they will start choosing the latter. As a result, we will lose the most important feature of Clarity.
But what if we stored contracts in two versions?
The first would be stored on-chain just as it is right now, to preserve readability. The second could be stored on the side and used as the "executable" version, to reduce execution costs.
There is no point in loading contracts with comments into memory every single time they are called if the comments play no role in execution.
Developers could pay more for contract deployment (2x storage + additional processing), but execution should be cheaper and faster.
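To make the two-version idea concrete, here is a rough sketch of deriving a comment-free "executable" copy from the readable source. It assumes Clarity comments start with `;;` and run to the end of the line, and that string literals do not span lines; `strip_comments` is a hypothetical helper, not part of any existing tool.

```python
def strip_comments(source: str) -> str:
    """Return the source with `;;` comments and blank lines removed.

    Assumptions (not verified against the real Clarity lexer): comments
    start with `;;`, string literals are double-quoted with backslash
    escapes, and no string literal spans more than one line.
    """
    out_lines = []
    for line in source.splitlines():
        in_string = False
        escaped = False
        cut = len(line)  # index where the comment starts, if any
        for i, ch in enumerate(line):
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif line.startswith(";;", i):
                cut = i
                break
        code = line[:cut].rstrip()
        if code:  # drop lines that held only a comment or whitespace
            out_lines.append(code)
    return "\n".join(out_lines)


readable = (
    ";; Adds one to x\n"
    "(define-read-only (inc (x uint)) ;; helper\n"
    "  (+ x u1))\n"
    '(print ";; not a comment")\n'
)
executable = strip_comments(readable)
```

The readable copy stays untouched for on-chain storage; only the stripped copy would be loaded for execution.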