Client-side flattening of a container #901
tiled/client/container.py
@@ -1051,6 +1050,57 @@ def write_dataframe(
        return client


class Composite(Container):

    @property
possibly cached?
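To make the caching suggestion concrete, here is a minimal sketch using `functools.cached_property` from the standard library. The class `FlatKeysExample` and its attributes are hypothetical stand-ins for illustration, not the actual `Composite` code under review.

```python
from functools import cached_property


class FlatKeysExample:
    """Hypothetical stand-in for a client object; not Tiled's Composite."""

    def __init__(self, tables):
        self._tables = tables  # e.g. {"tab1": ["colA", "colB"]}
        self.build_count = 0   # counts how often the mapping is rebuilt

    @cached_property
    def flat_keys_mapping(self):
        # Computed on first access, then stored on the instance.
        self.build_count += 1
        return {
            column: f"{table}/{column}"
            for table, columns in self._tables.items()
            for column in columns
        }


example = FlatKeysExample({"tab1": ["colA", "colB", "colC"]})
example.flat_keys_mapping
example.flat_keys_mapping
print(example.build_count)  # the mapping was built only once
```

One trade-off to keep in mind: a cached value goes stale when the contents change (for example after a write), so the cache would need to be invalidated at that point, e.g. with `del example.flat_keys_mapping`.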
Awesome, nice to see this converging.
I'd like to test-drive it a bit, which I haven't yet done, but I gave a close read.
This will also need a database migration to extend the
Add content with reference to this example: https://github.com/bluesky/tiled/blob/main/tiled/catalog/migrations/versions/0b033e7fbe30_add_awkward_to_structurefamily_enum.py
Then update the list of revisions in
Here is @genematx's test script from our interactive session today, for Future Us:
import pandas as pd
import numpy as np
df = pd.DataFrame({"colA": np.random.randn(10), "colB": np.random.randint(0, 10, 10), "colC": np.random.choice(["a", "b", "c", "d", "e"], 10)})
Y = c.create_composite("test", metadata={"attrs": {"b": 2}})
Y.write_array(np.random.randn(10, ), key="arr1", metadata={"attrs": {"c": 3}})
Y.write_array(np.random.randn(10, ), key="arr2", metadata={"attrs": {"d": 4}})
Y.write_array(np.random.randn(20, ), key="arr3", metadata={"attrs": {"e": 5}})
Y.write_dataframe(df, key="tab1", metadata={"attrs": {"f": 6}})
I like how this turned out:
In [35]: c['test']
Out[35]: <Composite {'arr1', 'arr2', 'arr3', 'colA', 'colB', 'colC'}>
In [36]: c['test']['tab1']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[36], line 1
----> 1 c['test']['tab1']
File ~/Repos/bnl/tiled/tiled/client/container.py:1149, in Composite.__getitem__(self, key, _ignore_inlined_contents)
1147 key = self._flat_keys_mapping[key]
1148 else:
-> 1149 raise KeyError(
1150 f"Key '{key}' not found. If it refers to a table, use .parts['{key}'] instead."
1151 )
1153 return super().__getitem__(key, _ignore_inlined_contents)
KeyError: "Key 'tab1' not found. If it refers to a table, use .parts['tab1'] instead."
In [37]: c['test'].parts['tab1']
Out[37]: <DataFrameClient ['colA', 'colB', 'colC']>
Perhaps we should make that error message even more generous by checking whether
Soon we will need
I haven't done line review yet, but conceptually I think this has landed.
I was thinking about this too, @danielballan. I'm afraid this would force us to make another API call (since we are not currently caching
I can think of other counterarguments, too. Like, maybe it's not in
Let's leave it as is, at least for now.
I took one more detailed read through.
def downgrade():
    raise NotImplementedError
As of the beta releases, we have started implementing a downgrade
path. I gather it shouldn't be too hard to do in this case?
Oh, nice! Added.
Please double-check, @danielballan.
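For readers wondering what a downgrade path might involve, here is a rough sketch, assuming the migration extends a PostgreSQL enum type as the linked example's filename suggests. It is not the code that was added in this PR. The helper names, and the type, table, column, and value names in the usage lines, are all assumptions for illustration. The helpers only build SQL strings; running them (for instance through Alembic) is left out.

```python
def upgrade_statements(type_name, new_value):
    # PostgreSQL can append a value to an existing enum type in place.
    return [f"ALTER TYPE {type_name} ADD VALUE IF NOT EXISTS '{new_value}'"]


def downgrade_statements(type_name, table, column, remaining_values):
    # PostgreSQL cannot drop a single enum value, so one approach is to
    # rebuild the type without it and re-point the column at the new type.
    # Rows that still hold the removed value must be handled beforehand,
    # otherwise the cast in the ALTER TABLE statement fails.
    values = ", ".join(f"'{value}'" for value in remaining_values)
    return [
        f"ALTER TYPE {type_name} RENAME TO {type_name}_old",
        f"CREATE TYPE {type_name} AS ENUM ({values})",
        f"ALTER TABLE {table} ALTER COLUMN {column} "
        f"TYPE {type_name} USING {column}::text::{type_name}",
        f"DROP TYPE {type_name}_old",
    ]


# Assumed names, for illustration only.
for statement in upgrade_statements("structurefamily", "composite"):
    print(statement)
for statement in downgrade_statements(
    "structurefamily",
    "nodes",
    "structure_family",
    ["array", "awkward", "container", "sparse", "table"],
):
    print(statement)
```

The asymmetry is the point: adding a value is a single statement, while removing one means recreating the type, which is why a downgrade deserves a second look.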
A third attempt at implementing a flat-namespaced container, this time (mostly) on the client side. While the data is stored in a new Composite container, the flat namespace is implemented by the Python client itself: the client keeps a mapping of table column names to their full paths. The distinct structure family serves a role akin to a spec here, but its functionality is expected to expand in the future.
Related: #668
Issue: #824
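As a rough illustration of the client-side mapping described above, the sketch below keeps a dictionary from flat keys to full paths and raises a `KeyError` with a hint when a table name is used directly. The class `FlatNamespace` and the layout of its `parts` argument are hypothetical and are not the actual `Composite` implementation; only the error message text is taken from the session shown in the conversation.

```python
class FlatNamespace:
    """Hypothetical stand-in for a flat-namespaced container (not Tiled's Composite)."""

    def __init__(self, parts):
        # `parts` maps each stored key to None (an array) or to a list of
        # column names (a table). This layout is an assumption of the sketch.
        self.parts = parts
        self._flat_keys_mapping = {}
        for key, columns in parts.items():
            # Arrays are exposed under their own key; tables contribute columns.
            exposed = {key: key} if columns is None else {
                column: f"{key}/{column}" for column in columns
            }
            for flat_key, path in exposed.items():
                if flat_key in self._flat_keys_mapping:
                    raise ValueError(f"Duplicate key '{flat_key}' in flat namespace.")
                self._flat_keys_mapping[flat_key] = path

    def keys(self):
        return set(self._flat_keys_mapping)

    def __getitem__(self, key):
        if key not in self._flat_keys_mapping:
            raise KeyError(
                f"Key '{key}' not found. If it refers to a table, "
                f"use .parts['{key}'] instead."
            )
        # A real client would fetch the node; here we return its full path.
        return self._flat_keys_mapping[key]


flat = FlatNamespace(
    {"arr1": None, "arr2": None, "arr3": None, "tab1": ["colA", "colB", "colC"]}
)
print(sorted(flat.keys()))
print(flat["colA"])
```

Rejecting duplicate keys at construction time is a design choice of this sketch: a flat namespace only works if array names and column names never collide.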