Describe the bug
Filling a Publication whose publication date does not exist on the calendar (for example '2023/4/31') fails with a ValueError.
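To illustrate why such a date is rejected, here is a minimal sketch using only the standard library. It assumes the date parser validates the day against the calendar, as `datetime.date` does; `is_real_date` is a hypothetical helper for illustration and not part of scholarly.

```python
from datetime import date


def is_real_date(year: int, month: int, day: int) -> bool:
    """Return True if the given year/month/day exists on the calendar."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        # Raised for impossible dates such as April 31.
        return False
```

April has 30 days, so `is_real_date(2023, 4, 31)` is false while `is_real_date(2023, 4, 30)` is true.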
To Reproduce
from scholarly import scholarly

search_query = scholarly.search_pubs('Where Technology and Content Fuse: Applying Technology Acceptance to the Usage of and Payment for Digital Journalism')
scholarly.pprint(next(search_query))
Expected behavior
Since only the year of the publication is stored in ['bib']['pub_year'], I would expect errors in the month or day to be ignored, with the parser either falling back to the year alone or leaving ['bib']['pub_year'] as NA.
Desktop (please complete the following information):
- Proxy service: None
- python version: 3.11
- OS: macOS 15.2
- Scholarly Version: 1.7.11 / Latest
Do you plan on contributing?
- Yes, I will create a Pull Request with the bugfix.
My suggestion would be to adapt PublicationParser.fill with:
elif key == 'publication date':
    patterns = ['YYYY/M',
                'YYYY/MM/DD',
                'YYYY',
                'YYYY/M/DD',
                'YYYY/M/D',
                'YYYY/MM/D']
    try:
        publication['bib']['pub_year'] = arrow.get(val.text, patterns).year
    except ValueError:
        # fallback to regex year extraction
        publication['bib']['pub_year'] = re.search(r'\d{4}', val.text).group()
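The same fallback idea can be tried in isolation with the sketch below. It is an illustration under assumptions, not the proposed patch: it swaps arrow for the standard library's `datetime.strptime` so it runs without dependencies, `extract_pub_year` and `_FORMATS` are hypothetical names, and it makes two choices that the snippet above does not: the regex result is converted to `int` so both paths return the same type, and `None` is returned when no four-digit year is found instead of raising.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical stand-ins for the arrow patterns; strptime accepts
# months and days with or without zero padding.
_FORMATS = ("%Y/%m/%d", "%Y/%m", "%Y")


def extract_pub_year(text: str) -> Optional[int]:
    """Return the publication year, tolerating invalid months or days."""
    text = text.strip()
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt).year
        except ValueError:
            # Wrong shape for this format, or an impossible date.
            continue
    # Fall back to the first four-digit run in the text.
    match = re.search(r"\d{4}", text)
    return int(match.group()) if match else None
```

With this helper, '2023/4/31' yields 2023 through the regex fallback, while a string containing no four-digit number yields `None`, which the caller could treat as NA.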