videos: transform doi and collections #262
cds_migrator_kit/videos/weblecture_migration/transform/xml_processing/rules/base.py
@for_each_value
def collection_tags(self, key, value):
    """Translates collection_tags."""
    collection_mapping = {
We also have this collection, and its records are identified with the following query: 490:'CERN Accelerator School' and 690C:'TALK'. For these records, we will also need to add the extra tag.
So we'll check the 490:'CERN Accelerator School' and 690C:'TALK' and add the tag for this collection?
Yes :)
490 is Series and it doesn't have a rule yet. I'm adding this to my TODO and will add it later together with the 490 rule, is that okay? And do we need to check both 490 and 690, or is 490 alone enough?
Hmm, because our search query gives back only video content, having only the 490 check should be enough. You can verify that you get all the records of the above collection.
@ntarocco any comment on how to store the collection tags? Do you think it would be better to keep something of the tree structure, e.g. tags: ['Lectures/Academic training lectures'], or is a flat representation enough, e.g. tags: ['Lectures', 'Academic training lectures']?
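To make the two options concrete, here is a minimal sketch of how a comma-separated mapping value could be turned into either representation. The helper names `flat_tags` and `tree_tags` are hypothetical and not part of the PR, and the input string is assumed to follow the same comma-separated style as the values in `collection_mapping`.

```python
# Hypothetical helpers for comparing the two tag layouts discussed above.
# They are illustrative only and do not exist in cds_migrator_kit.

def flat_tags(mapped: str) -> list[str]:
    """Split a comma-separated mapping value into independent tags."""
    return [part.strip() for part in mapped.split(",") if part.strip()]


def tree_tags(mapped: str) -> list[str]:
    """Join the same parts into one path-like tag that keeps the hierarchy."""
    return ["/".join(flat_tags(mapped))]


# Assumed mapping value, written in the style of collection_mapping.
mapped = "Lectures,Academic training lectures"
print(flat_tags(mapped))  # ['Lectures', 'Academic training lectures']
print(tree_tags(mapped))  # ['Lectures/Academic training lectures']
```

The flat form is easier to filter on a single tag, while the path form keeps the parent/child relation at the cost of needing a prefix match to find everything under Lectures.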
@zzacharo I checked the collection: 350 of the 396 records are in our search query. Of the other 46, 20 don't have video files, and I don't know about the remaining 26 😅
For the 20 missing, maybe ask Rene to see if they can retrieve the video; otherwise we will need to import them as metadata only, or in a format that would let us edit them in the future. For the missing 26 records, we need to understand why...
if doi.startswith("10.17181"):
    return doi
We just use the same DOI value, and this DOI still points to the CDS record. It redirects to the CDS record, and after migration the CDS record will redirect to videos. Is this okay, or do we need to update DataCite? @zzacharo
It would be nice to gather all the related records and update datacite in fact
If I'm not mistaken, we have only one record with a 10.17181 DOI.
but we have more with a different DOI?
Yes, but I guess those records don't have videos (search)
Write the list down so we can decide what to do with these records; maybe add an issue, or it may be worth sending the list to Jens. Btw, https://cds.cern.ch/record/276197 also has a link to the CERN library catalogue, so if we migrate them we need to keep the link!
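For reference, the DOI handling discussed in this thread (keep the value as the record DOI when it starts with 10.17181, otherwise store it as an alternate identifier with the DOI scheme) could look roughly like the sketch below. The function name `transform_doi`, the returned dictionary layout, and the `scheme`/`value` keys are assumptions made for illustration; the actual rule in the PR may be shaped differently.

```python
# A minimal sketch, not the rule from the PR. Names and output layout are assumed.
DOI_PREFIX = "10.17181"  # the prefix checked in the snippet under review


def transform_doi(doi: str) -> dict:
    """Return a partial record holding the DOI in the appropriate field."""
    doi = doi.strip()
    if doi.startswith(DOI_PREFIX):
        # Kept as the record's own DOI.
        return {"doi": doi}
    # Any other DOI is kept as an alternate identifier with the DOI scheme.
    return {"alternate_identifiers": [{"scheme": "DOI", "value": doi}]}


# Placeholder DOI values, not real records.
print(transform_doi("10.17181/example"))
print(transform_doi("10.1000/example"))
```

Keeping foreign DOIs as alternate identifiers preserves the value without implying that the migrated record is the registered landing page for it.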
d8052cd to d832319
"TP": "Lectures,Talks Seminars and Other Events,Teacher Programmes",
"e-learning": "Lectures,E-learning modules",
"E-LEARNING": "Lectures,E-learning modules",
"Restricted_ATLAS_Talks": "Lectures,Restricted ATLAS Talks",
I am thinking that this can be just
- "Restricted_ATLAS_Talks": "Lectures,Restricted ATLAS Talks",
+ "Restricted_ATLAS_Talks": "Lectures, ATLAS Talks",
with just the proper restrictions, but let's leave it like this for now and discuss it when we see the collection tree.
dbb8965 to 5dcb91b
5dcb91b to 9fe5632
closes CERNDocumentServer/cds-videos#2041
closes CERNDocumentServer/cds-videos#2042
New rules
0247 (DOI) transformed as DOI if it starts with 10.17181, or as alternate_identifiers with the DOI scheme otherwise.
980 collections transformed as collections.
964 transformed as _curation.964 with marc tags (964__a:....)
853 transformed as _curation.853 with marc tags (853__a:....)
336 transformed as _curation.336 with marc tags (336__a:....)

Improvements
269 (imprint) b, name of publication, transformed as a contributor with role Producer.
041 (language): if there are multiple languages, the first one is used as the main language and the others are added as additional_languages.
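As an illustration of the 041 improvement, the following sketch splits a list of language values into a main language plus additional ones. The function name `split_languages` and the `language` key are assumptions; only `additional_languages` is named in the description above.

```python
# A minimal sketch of the "first language is main, the rest are additional" rule.
# Function name and the "language" key are hypothetical.

def split_languages(languages: list[str]) -> dict:
    """Use the first language as the main one and keep the others separately."""
    if not languages:
        return {}
    main, *others = languages
    result = {"language": main}
    if others:
        # Only emit the field when there is something to store.
        result["additional_languages"] = others
    return result


print(split_languages(["en", "fr", "de"]))
# {'language': 'en', 'additional_languages': ['fr', 'de']}
```

Relying on the order of the 041 values is a design choice worth confirming against real records, since the first value is not guaranteed to be the primary spoken language.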