Description
If `id_list` consists of a single nonexistent (but valid) ID, arXiv returns an empty feed, which is interpreted to mean "no results."
If `id_list` consists of both existent and nonexistent valid IDs (`["0000.0000", "1707.08567"]`), the feed is non-empty (it contains a single item) but it has `feed.feed.opensearch_totalresults == 2`. The client takes this to be a partial page and requests a page with offset 1, which lists paper `1707.08567` again. This is an API bug.
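The failure mode can be modeled without touching the network. The sketch below is a simplified stand-in, not arxiv.py's actual pagination code: `fake_fetch_page` and `naive_results` are hypothetical names, and the fake feed is assumed to return the one existing paper at every offset while always claiming two total results.

```python
from typing import Iterator, List, Tuple

# Assumption: the feed reports two total results but only ever serves
# the single paper that actually exists, whatever offset is requested.
EXISTING_IDS: List[str] = ["1707.08567"]
CLAIMED_TOTAL = 2


def fake_fetch_page(offset: int, page_size: int) -> Tuple[int, List[str]]:
    """Hypothetical stand-in for one API request: (total_results, entry IDs)."""
    return CLAIMED_TOTAL, list(EXISTING_IDS)


def naive_results(page_size: int = 10) -> Iterator[str]:
    """Keep requesting pages until `offset` reaches the claimed total."""
    offset = 0
    total = None
    while total is None or offset < total:
        total, entries = fake_fetch_page(offset, page_size)
        if not entries:
            break  # an empty page is read as "no more results"
        yield from entries
        # The short page looks partial, so advance by what was received.
        offset += len(entries)


if __name__ == "__main__":
    print(list(naive_results()))
```

Under these assumptions the paginator makes two requests (offsets 0 and 1) and yields `1707.08567` both times, which matches the duplicate seen in the failing test.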
Notably, this behavior differs depending on the nonexistent ID. Nonexistent ID `1507.58567` yields an entry with missing fields (covered in #80, fixed by #82), whereas `1407.58567` yields no entries at all (covered here).
Example: https://export.arxiv.org/api/query?id_list=1407.58567,1707.08567
Steps to reproduce
def test_invalid_id(self):
    results = list(arxiv.Search(id_list=["0000.0000"]).results())
    self.assertEqual(len(results), 0)
    results = list(arxiv.Search(id_list=["0000.0000", "1707.08567"]).results())
    print(len(results))
    self.assertEqual(len(results), 1)  # Fails: 1707.08567 appears twice.
Expected behavior
Results should not be duplicated.
Searching for `["0000.0000", "1707.08567"]` should yield a single result.
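One possible client-side guard, offered as a sketch rather than as the way arxiv.py resolves this, is to drop repeated IDs as results are yielded. The helper name `unique_by_id` is hypothetical, and plain ID strings stand in for result objects.

```python
from typing import Iterable, Iterator, Set


def unique_by_id(entry_ids: Iterable[str]) -> Iterator[str]:
    """Yield each ID the first time it appears, preserving order."""
    seen: Set[str] = set()
    for entry_id in entry_ids:
        if entry_id in seen:
            continue  # already yielded from an earlier page
        seen.add(entry_id)
        yield entry_id


if __name__ == "__main__":
    # The duplicated output described in this report collapses to one item.
    print(list(unique_by_id(["1707.08567", "1707.08567"])))
```

A guard like this hides the duplicate but does not stop the extra request. The mismatch between the reported total and the entries actually served would still need handling in the pagination logic.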
Versions
`python` version: 3.7.9
`arxiv.py` version: 1.4.1