This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Add additional state-check before loading transaction #30

Open
chuwy opened this issue Nov 25, 2017 · 3 comments

Comments

@chuwy
Contributor

chuwy commented Nov 25, 2017

Right now the Snowflake loader grabs the whole state from DynamoDB at the beginning and iterates through all not-yet-loaded folders. Between the moment the loader grabs the state and the moment it finishes loading all folders, another loader can be launched simultaneously, which leads to a race condition. This should not be dangerous as Snowflake keeps information about loaded files to prevent double-loading them, but just in case, we would like to at least warn the user that the folder has already been processed.

Logic:

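# not_yet_loaded comes from the state snapshot taken at startup
for folder in not_yet_loaded:
  # re-check the manifest in case another loader handled this folder meanwhile
  if manifest.contains(folder):
    error("Folder is already loaded, skipping")
  else:
    load(folder)
    manifest.add(folder)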
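
The logic above can be sketched as runnable Python. This is only an illustration under stated assumptions, not the loader's actual code: `InMemoryManifest` is a hypothetical stand-in for the DynamoDB-backed manifest, and `load_pending` and the `load` callback are made-up names. The one property the sketch relies on is that `contains` reads the current state rather than the snapshot taken at startup.

```python
import logging
from typing import Callable, Iterable, List, Set

logger = logging.getLogger("loader")


class InMemoryManifest:
    """Hypothetical stand-in for the DynamoDB-backed manifest."""

    def __init__(self, loaded: Iterable[str] = ()) -> None:
        self._loaded: Set[str] = set(loaded)

    def contains(self, folder: str) -> bool:
        # Assumption: a real manifest would do a fresh read here,
        # not consult the snapshot grabbed when the loader started.
        return folder in self._loaded

    def add(self, folder: str) -> None:
        self._loaded.add(folder)


def load_pending(
    manifest: InMemoryManifest,
    not_yet_loaded: Iterable[str],
    load: Callable[[str], None],
) -> List[str]:
    """Load each folder from the startup snapshot unless the manifest
    already lists it. Returns the folders loaded by this call."""
    loaded_now: List[str] = []
    for folder in not_yet_loaded:
        if manifest.contains(folder):
            # Another loader got here first: warn and move on.
            logger.warning("Folder %s is already loaded, skipping", folder)
            continue
        load(folder)
        manifest.add(folder)
        loaded_now.append(folder)
    return loaded_now
```

Note that a check followed by a load only narrows the race window: two loaders can still both pass `contains` before either calls `add`. Closing that gap would need an atomic claim on the folder, which is outside what this issue proposes.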
@alexanderdean
Contributor

Sounds sensible!

@alexanderdean
Contributor

> This should not be dangerous as Snowflake keeps information about loaded files to prevent double-loading them

Didn't we confirm that this safety feature doesn't work as expected given how we stage data?

@chuwy
Contributor Author

chuwy commented Dec 19, 2017

Yes, unfortunately it doesn't work anymore. We realized this right after I submitted this issue.
