Checkpoints not used by write_deltalake? #2555
Unanswered
VLomonovskis asked this question in Q&A
Replies: 1 comment 1 reply
-
I am having the exact same experience! The only explanation I can find is the one you gave: write_deltalake also reads all the log files from before the checkpoint.
-
Hello.
I have several processes that use the same code to append data to the same Delta table. The processes run in parallel. I append data with write_deltalake and use the rust engine to merge the schema.
As the processes add data, performance degrades and each upload takes longer. As I understand it, this happens because the number of transaction log files keeps growing. However, when I create a checkpoint (using delta_table.checkpoint()), performance does not improve, and it looks like write_deltalake still reads all the logs from before the checkpoint. Can this behaviour be changed?
I did see discussions about checkpoints, but they were about checkpoint creation. In my case, checkpoints are not used even once they have been created.
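For anyone who wants to check the same thing on their own table, here is a minimal sketch of how the log directory can be inspected. It assumes the table is on a local filesystem and follows the standard Delta log layout (commit files named with a 20-digit zero-padded version under `_delta_log`, plus a `_last_checkpoint` JSON file with a `version` field). The helper name `commits_after_checkpoint` is hypothetical and not part of the deltalake API; it only uses the Python standard library.

```python
import json
import re
from pathlib import Path

# Commit files look like 00000000000000000012.json (20-digit version).
COMMIT_RE = re.compile(r"^(\d{20})\.json$")


def commits_after_checkpoint(table_path):
    """Return (checkpoint_version, commit_versions_newer_than_checkpoint).

    Hypothetical helper for illustration only. If no _last_checkpoint file
    exists, checkpoint_version is None and every commit version is returned.
    """
    log_dir = Path(table_path) / "_delta_log"

    # _last_checkpoint is a small JSON pointer to the newest checkpoint.
    pointer = log_dir / "_last_checkpoint"
    checkpoint_version = None
    if pointer.exists():
        checkpoint_version = json.loads(pointer.read_text())["version"]

    # Collect the versions of all commit JSON files in the log directory.
    versions = []
    for entry in log_dir.iterdir():
        match = COMMIT_RE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    versions.sort()

    if checkpoint_version is None:
        return None, versions
    # A reader that honours the checkpoint only needs commits after it.
    return checkpoint_version, [v for v in versions if v > checkpoint_version]
```

Calling `commits_after_checkpoint("path/to/table")` right after creating a checkpoint should give a checkpoint version and a short list of newer commits. If the list is short and writes are still slow, that would be consistent with the writer not starting from the checkpoint, though it does not prove it.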