Iceberg materialization test #552
@ian-r-rose Thank you, Ian! One question: does this JSON file store the metadata information for all of the iceberg tables? And can we use this one JSON file to find the most up-to-date metadata for a table by supplying the table name and schema as variables?
Yes, exactly! I'm attaching an example file that was created as part of this PR (which is also hosted in S3, as you can see in the above script):
Thanks for the clarification! I will follow your steps and create the necessary iceberg tables for your review.
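To illustrate the lookup discussed in this exchange, here is a minimal Python sketch of finding one table's current metadata file by schema and table name. The field names (`table_schema`, `table_name`, `metadata_location`), the example path, and the helper `find_metadata_location` are all hypothetical; the real JSON file produced by this PR may be shaped differently.

```python
import json

# Hypothetical manifest contents. The field names and the S3 path are
# assumptions for illustration, not the actual file attached to this PR.
EXAMPLE_MANIFEST = """
[
  {
    "table_schema": "example_schema",
    "table_name": "example_table",
    "metadata_location": "s3://example-bucket/iceberg/example_schema/example_table/metadata/00002-abc.metadata.json"
  }
]
"""


def find_metadata_location(manifest_text, schema, table):
    """Return the metadata file path recorded for schema.table, or None.

    Names are compared case-insensitively, since the manifest may store
    identifiers in a different case than the caller passes in.
    """
    for entry in json.loads(manifest_text):
        if (entry["table_schema"].lower() == schema.lower()
                and entry["table_name"].lower() == table.lower()):
            return entry["metadata_location"]
    return None


if __name__ == "__main__":
    print(find_metadata_location(EXAMPLE_MANIFEST, "EXAMPLE_SCHEMA", "example_table"))
```

The returned path is what a reader outside Snowflake would then be pointed at, instead of guessing which metadata version is the latest.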
ed46b84 to 33ef1e1
@thehanggit I'm reopening this with some additional experimentation around iceberg metadata. The main issue is that there can be many different versions of the iceberg metadata (in general, each change to the table results in a new version of the metadata). Snowflake keeps track of which version is the most recent, but it's not always easy to determine the correct one from a different tool trying to query the iceberg table.
This creates a new stored procedure that unloads a JSON blob to the `iceberg` folder in S3. The blob lists the correct metadata file for each iceberg table in the marts database. I was then able to use that directly in the following sample duckdb script: