Add freeze/thaw and pickle capabilities #40
NOTE: this builds on the existing MR #39, which inlines prefix storage within a node and simplifies these updates. The changes in this MR alone are much more localized.
This MR adds the ability to freeze/thaw as well as pickle pyt objects. I tend to have rather large static pyt objects, and ingesting them fresh from sources at startup takes several minutes. If this data is stored in pickled format, however, startup can be nearly instantaneous.
The new `freeze()` method changes the underlying memory representation of the pyt object. Rather than keeping a node graph in which each node is dynamically allocated on its own, it allocates one contiguous chunk of memory and stores all nodes within that chunk, rewriting the linkages so they are self-consistent in the new location. The data pointers within each node remain unchanged, since they refer to dynamically allocated Python objects. This reorganization makes pickling easier: one can bulk memcpy the entire node array and, at restore, rewrite the pointer linkage based on the memory address of the first node. It also provides a more compact representation of the structure, allowing for better memory cache performance. The frozen state is stored in a flag within the top-level tree type. This is important because the frozen/compact representation disallows any modification of the pyt contents (insert/delete/etc.).

The new `thaw()` method does the exact reverse of `freeze()` and restores the per-node dynamic allocation. In this form things are less compact, but can be modified much more efficiently.

I also implemented the `__reduce__` method (for pickle support) and the `__setstate__` method (for unpickle). If pickling is attempted on a non-frozen object, the user is warned that they must freeze first. When an object is unpickled, it is restored in the frozen state. If a user intends to modify it, they can invoke `thaw()` and then do so.

While one could automatically freeze/thaw during the pickle process, I've intentionally chosen to keep these functions separate. In my primary use case I would never need to `thaw()`, so forcing it would add extra computation and lose the memory cache benefit. I suspect pickling these objects is not a common use case, so a more advanced user can invoke these extra methods as needed.

Several tests were added to verify proper operation, and `README.md` was also updated to document and demonstrate the new capabilities.
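
For readers who want the idea in miniature, here is a pure-Python sketch of the freeze/thaw/pickle flow described above. It is not the code from this MR, which operates on the C node structures. `ToyTrie`, `Node`, the bit-string keys, and the use of list indices in place of rewritten pointers are all hypothetical stand-ins chosen for illustration.

```python
import pickle


class Node:
    """One individually allocated node (the thawed form)."""

    def __init__(self, data=None):
        self.data = data    # user payload, an ordinary Python object
        self.left = None    # child for bit '0'
        self.right = None   # child for bit '1'


class ToyTrie:
    """Hypothetical binary trie keyed by bit strings such as '1011'."""

    def __init__(self):
        self._root = Node()
        self._nodes = None  # when frozen: flat list of (data, left_idx, right_idx)
        self.frozen = False

    def insert(self, bits, data):
        if self.frozen:
            raise RuntimeError("trie is frozen; call thaw() before modifying")
        node = self._root
        for b in bits:
            attr = "left" if b == "0" else "right"
            if getattr(node, attr) is None:
                setattr(node, attr, Node())
            node = getattr(node, attr)
        node.data = data

    def get(self, bits):
        """Exact-match lookup that works in either representation."""
        if self.frozen:
            idx = 0
            for b in bits:
                idx = self._nodes[idx][1 if b == "0" else 2]
                if idx == -1:
                    raise KeyError(bits)
            data = self._nodes[idx][0]
        else:
            node = self._root
            for b in bits:
                node = node.left if b == "0" else node.right
                if node is None:
                    raise KeyError(bits)
            data = node.data
        if data is None:
            raise KeyError(bits)
        return data

    def freeze(self):
        """Copy every node into one flat list; links become list indices."""
        if self.frozen:
            return
        flat = []

        def visit(node):
            idx = len(flat)
            flat.append(None)  # reserve this node's slot before its children
            left = visit(node.left) if node.left is not None else -1
            right = visit(node.right) if node.right is not None else -1
            flat[idx] = (node.data, left, right)
            return idx

        visit(self._root)
        self._nodes, self._root, self.frozen = flat, None, True

    def thaw(self):
        """Rebuild one Node object per slot and relink by reference."""
        if not self.frozen:
            return
        objs = [Node(data) for data, _, _ in self._nodes]
        for obj, (_, left, right) in zip(objs, self._nodes):
            obj.left = objs[left] if left != -1 else None
            obj.right = objs[right] if right != -1 else None
        self._root, self._nodes, self.frozen = objs[0], None, False

    def __reduce__(self):
        # Only the flat form is serialized, mirroring the "freeze first" rule.
        if not self.frozen:
            raise RuntimeError("freeze() the trie before pickling")
        return (ToyTrie, (), self._nodes)

    def __setstate__(self, state):
        # Unpickled objects come back in the frozen state.
        self._nodes, self._root, self.frozen = state, None, True


trie = ToyTrie()
trie.insert("10", "ten")
trie.insert("1011", "net-a")
trie.freeze()
restored = pickle.loads(pickle.dumps(trie))  # restored in the frozen state
restored.thaw()                              # required before modifying
restored.insert("0", "zero")
```

The sketch keeps freezing and pickling as separate steps, matching the choice explained above: a caller that only reads the structure never pays for `thaw()`.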