Move datasets between 4CAT servers #375
…CAT version for comparison, run some tests, I had no idea where you left off, so I just wanted to test it and see that way. Made some fixes and now see that you worked up until the log and actual data.
this uses the commit, which makes sense at the moment, but perhaps not in the long term.
…ed up some errors/logging and added notes
Tested between two instances of 4CAT successfully! A few notes to consider before a merge:
# Conflicts: # datasources/fourcat_import/import_4cat.py
When we merge this, it is going to make testing on problem datasets so much easier!
# Conflicts: # webtool/templates/controlpanel/user-bulk.html
# Conflicts: # datasources/fourcat_import/import_4cat.py
This now seems to work, with some limitations and caveats:
It would be quite easy to allow people to import multiple datasets by providing a list of URLs instead of a single URL; the code is already set up for this. However, it may not be intuitive that the front-end acts as if you're creating a single dataset rather than all the datasets you're trying to import, because the 'Create dataset' page is currently set up to create one and only one dataset. So the back-end is ready for larger imports, but there are some UI problems to solve before this is possible. Perhaps this could be a 'power user' option that would need to be enabled by admins (though we arguably already have too many such options).
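To make the multi-URL idea concrete, here is a minimal sketch of how a single text field could be turned into a list of dataset URLs. The function name `parse_dataset_urls` and the comma/newline separators are assumptions for illustration only; this is not the code in this PR.

```python
from urllib.parse import urlparse


def parse_dataset_urls(raw: str) -> list[str]:
    """Split a comma- or newline-separated string into unique dataset URLs.

    Hypothetical helper: the separators and validation rules are assumptions.
    """
    urls: list[str] = []
    for candidate in raw.replace("\n", ",").split(","):
        candidate = candidate.strip()
        if not candidate:
            continue  # ignore empty entries, e.g. from a trailing comma
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid dataset URL: {candidate!r}")
        if candidate not in urls:
            urls.append(candidate)  # keep first occurrence, preserve order
    return urls
```

A single URL still works unchanged with this approach, so the existing one-dataset form would behave as before if only one URL is entered.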
…hon3.9 and newer)
Fixes #352. Work in progress. Basic architecture:
API endpoint that returns one of four components of a dataset:
Worker that takes a list of dataset URLs and a 4CAT API key and, using these endpoints, 'reconstructs' the dataset locally, and queues additional jobs for the child datasets (if there are any)
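The architecture above can be sketched roughly as follows. This is an illustration under stated assumptions, not the worker in this PR: `import_datasets`, `fetch_component` and `list_children` are hypothetical names, the component names are passed in as a parameter because the four components are not listed here, and an in-memory queue stands in for 4CAT's job queue.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List

# Hypothetical callables standing in for HTTP requests to the remote server:
#   fetch_component(dataset_url, component_name, api_key) -> payload
#   list_children(dataset_url, api_key) -> URLs of child datasets
FetchComponent = Callable[[str, str, str], object]
ListChildren = Callable[[str, str], List[str]]


def import_datasets(
    urls: Iterable[str],
    api_key: str,
    components: Iterable[str],
    fetch_component: FetchComponent,
    list_children: ListChildren,
) -> Dict[str, Dict[str, object]]:
    """Rebuild each dataset from its components, then queue its children."""
    components = list(components)
    queue = deque(urls)
    imported: Dict[str, Dict[str, object]] = {}
    while queue:
        url = queue.popleft()
        if url in imported:
            continue  # already reconstructed, do not import twice
        # Fetch every component of this dataset from the remote server.
        imported[url] = {
            name: fetch_component(url, name, api_key) for name in components
        }
        # Child datasets are handled the same way as top-level ones.
        queue.extend(list_children(url, api_key))
    return imported
```

Passing the two callables in keeps the traversal logic testable without a second 4CAT server; in a real worker they would wrap authenticated requests to the remote API.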
Datasets can only be moved between 4CAT servers of the same version. This precludes one use case - moving from an older to a newer 4CAT - but the alternative is a recipe for complications, because the database structure can change between versions.
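The same-version restriction could be enforced with a check along these lines before any data is transferred. This is a hedged sketch: `VersionMismatch` and `assert_same_version` are hypothetical names, and how each server reports its version (a release number or a commit hash) is left to the caller.

```python
class VersionMismatch(Exception):
    """Raised when the local and remote 4CAT versions differ (hypothetical)."""


def assert_same_version(local_version: str, remote_version: str) -> None:
    """Refuse to import unless both servers report the same version string."""
    local = local_version.strip()
    remote = remote_version.strip()
    if local != remote:
        raise VersionMismatch(
            f"Cannot import: local version is {local!r}, remote is {remote!r}"
        )
```

An exact string comparison is deliberately strict: it rejects older-to-newer moves as well, which matches the limitation described above.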