Description
Describe the bug
We have noticed intermittent NetCDF read errors when accessing OPeNDAP links in the PAVICS JupyterHub.
Approximate date when the problem began: ~March 1, 2023
To Reproduce
The intermittent nature of this problem makes it somewhat difficult to reproduce, but a public notebook is available on the PAVICS server here: https://pavics.ouranos.ca/jupyter/hub/user-redirect/lab/tree/public/logan-public/Tests/THREDDS_Issues_March2023/Random_Thredds_read_errors.ipynb
The notebook executes a relatively large workflow and uses multiple dask worker processes to increase the likelihood of a read error.
Multiple notebook runs (only ~5-6, due to the time needed) have shown that bypassing Twitcher (i.e. THREDDS behind the nginx proxy only) always allows the calculations to complete successfully, whereas accessing OPeNDAP links behind the nginx/Twitcher combination typically results in a read error relatively early in the workflow.
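To put a number on "typically results in a read error", something like the following could be used. It is only a minimal sketch, not the code from the notebook: it uses a standard-library thread pool instead of dask workers, and `count_read_failures` and `read_once` are hypothetical names. `read_once` stands for any zero-argument function that performs one read (for example, opening one OPeNDAP URL with xarray and loading a small slice).

```python
from concurrent.futures import ThreadPoolExecutor


def count_read_failures(read_once, n_reads=50, n_workers=4):
    """Call read_once() n_reads times across n_workers threads.

    Returns (number_of_failures, list_of_error_reprs).
    read_once is a hypothetical zero-argument callable that does one read.
    """
    def attempt(_):
        try:
            read_once()
            return None
        except Exception as exc:  # assumption: any exception counts as a read error
            return repr(exc)

    # Concurrent requests stand in for the notebook's multiple dask workers.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(attempt, range(n_reads)))

    errors = [r for r in results if r is not None]
    return len(errors), errors
```

Running this once with a reader pointed at the nginx-only URL and once with a reader pointed at the nginx/Twitcher URL would give failure counts that can be compared directly.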
Although not rigorously quantified, there also seems to be a general performance hit when accessing data via nginx/Twitcher: the execution times printed in the workflow loop are 25-40 seconds per iteration with nginx/Twitcher versus 18-30 seconds with nginx only. Note also that the notebook runs the 'nginx-only' code first, so I do not believe the nginx-only run is benefiting from data caching; if caching has any effect, it should favor the 'twitcher/nginx' run.
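For a slightly more systematic timing comparison than the printed loop times, a small helper along these lines might do. This is a sketch under assumptions: `time_iterations`, `summarize`, and `step` are hypothetical names, and `step` stands for one iteration of the workflow loop against a given access path.

```python
import statistics
import time


def time_iterations(step, n_iter=5):
    """Run step() n_iter times and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(n_iter):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) so two access paths can be compared as ranges."""
    return min(durations), statistics.median(durations), max(durations)
```

Calling `summarize(time_iterations(step))` once per access path gives ranges comparable to the ones quoted above. Alternating which path runs first between repetitions would help rule out caching effects.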
Expected behavior
The code executes without read errors.
- OS: PAVICS JupyterLab (Linux)