Using progress=True results in hung process after a previous run of sync process #121

Closed
tlvu opened this issue Apr 8, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@tlvu
Collaborator

tlvu commented Apr 8, 2020

Description

When running the following notebook, the async process never terminates:
finch_test-pavics.ipynb.txt

However, if I restart the Finch server (pavics-compose.sh restart finch), clear the queue left by the previous hung process (scripts/clear-running-wps-jobs-in-db.sh finch $POSTGRES_PAVICS_USERNAME) to avoid the max connection problem, and then run only the async process (without running the sync process first), it works. $POSTGRES_PAVICS_USERNAME comes from the env.local file.

Note that it is restarting the WPS server (Finch) that unblocks async requests. They work until the next sync request, after which async requests are blocked again. Clearing the queue is just nice to have, since old items will eventually exceed the maximum number of concurrent items in the queue.
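The request order that triggers the hang can be sketched as follows. This is only an illustration, not the content of the attached notebook: `sync_then_async`, `make_client` and `run_process` are hypothetical names, the client factory is assumed to behave like birdy's `WPSClient` (which takes `progress=True` to submit requests asynchronously), and the endpoint and process in the usage comment are placeholders.

```python
from typing import Any, Callable, Tuple


def sync_then_async(
    make_client: Callable[..., Any],
    url: str,
    run_process: Callable[[Any], Any],
) -> Tuple[Any, Any]:
    """Issue one synchronous request, then one asynchronous request.

    make_client is assumed to accept the same arguments as birdy's
    WPSClient. run_process is a caller-supplied function that invokes
    some WPS process on the client it receives.
    """
    # Step 1: synchronous request (progress is left at its default).
    sync_out = run_process(make_client(url))

    # Step 2: asynchronous request. On an affected server, this is the
    # request that reportedly never terminates.
    async_out = run_process(make_client(url, progress=True))
    return sync_out, async_out


# Hypothetical usage against a real server (not executed here):
#   from birdy import WPSClient
#   sync_then_async(WPSClient, "https://<host>/<finch-endpoint>",
#                   lambda wps: wps.<some_process>(...))
```

Skipping step 1 after a fresh restart of Finch corresponds to the case reported above as working.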

Environment

  • finch version used, if any: 0.5.2
  • Python version, if any: version in the Finch docker image
  • Operating System: CentOS 7 (production deployment)

Additional Information

Might relate to #45

Edit:

  • add the missing step to clear the queue
  • clarify that it is the restart that unblocks async requests; clearing the queue is just nice to have
tlvu added a commit to bird-house/birdhouse-deploy that referenced this issue Apr 15, 2020
Add generic_bird optional component.

Could be used as a work-around for issue bird-house/finch#121, so that we have a Finch dedicated to async requests only.

This component is generic and configurable enough that it could accommodate any WPS. It has its own Postgres instance, so we can use it later to experiment with different versions of Postgres.

Test server: https://lvupavics-lvu.pagekite.me/twitcher/ows/proxy/finchasync, https://lvupavics-lvu.pagekite.me/canarie/node/service/status

See the `README.md` update for more info.

Note that the docker image, service name and port are customizable (my test server does not use the default service name).

It started out as a Finch2, but it is now rather a generic skeleton to plug in any bird we need. I wish I had this when I was testing out the new Thunderbird from PCIC.

At some point in the past, we had the idea of 2 simultaneous FlyingPigeon because the new one is so different we wanted to keep the old one around for a short while. This PR could have helped.

So do not view this PR as Finch 2 anymore, view this PR as a mechanism to quickly deploy another bird, any bird, without having to change code.

This opens up many options; testing is just one of the possible uses.
@tlvu
Collaborator Author

tlvu commented Sep 30, 2020

Upcoming "jobqueue" feature from PyWPS (geopython/pywps#505) might help fix this.

@huard
Collaborator

huard commented Sep 9, 2021

Running without dask resolves this problem.
There seems to be an interaction between dask scheduler/workers and the PyWPS multiprocessing async queue.
@cjauvin is investigating.

@huard huard added the bug Something isn't working label Sep 9, 2021
@huard
Collaborator

huard commented Nov 11, 2021

Hopefully fixed by #204 #211

@huard huard closed this as completed Nov 11, 2021
@tlvu
Collaborator Author

tlvu commented Mar 25, 2024

@eyvorchuk seems to have hit this problem again, so the fix might not have worked for all cases.
