Crash while getting version when used alongside async frameworks #62

Open
ceball opened this issue Apr 15, 2020 · 4 comments
Comments

@ceball
Member

ceball commented Apr 15, 2020

From @julioasotodv in holoviz/pyviz_comms#48:


In some unusual environments, param's way of retrieving version information is error-prone (it opens a subprocess to call git, for instance, which causes problems when Python processes are created by systems such as Supervisor or Gunicorn). This commit adds the option for the library to keep going even if the version cannot be correctly retrieved.


Here are my steps to reproduce the failure: I was basically trying to use panel with the Bokeh Server, but with the Server embedded in an ASGI app (with Starlette, but that's irrelevant here), managed by Gunicorn with Uvicorn workers.

In a nutshell, imagine I create a file called server.py with the following code:

import panel as pn

And now, I want to run my "server application" with Gunicorn + Uvicorn workers (I have tried a couple of recent versions of both, so you can just conda install gunicorn uvicorn):

gunicorn -k uvicorn.workers.UvicornWorker server

The traceback for the error is:

[2020-03-05 21:47:43 +0100] [3156] [INFO] Starting gunicorn 20.0.4
[2020-03-05 21:47:43 +0100] [3156] [INFO] Listening at: http://127.0.0.1:8000 (3156)
[2020-03-05 21:47:43 +0100] [3156] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2020-03-05 21:47:43 +0100] [3159] [INFO] Booting worker with pid: 3159

[2020-03-05 21:47:45 +0100] [3159] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/uvicorn/workers.py", line 57, in init_process
    super(UvicornWorker, self).init_process()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 49, in load
    return self.load_wsgiapp()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 39, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/util.py", line 358, in import_app
    mod = importlib.import_module(module)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/julio/Documentos/Curso Visualización Bokeh y Plotly/Ejemplos/tornado_dashboard/server.py", line 1, in <module>
    import panel as pn
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/panel/__init__.py", line 22, in <module>
    fpath=__file__, archive_commit="$Format:%h$", reponame="panel"))
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 280, in __str__
    known_stale = self._known_stale()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 223, in _known_stale
    commit = self.commit
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 133, in commit
    return self.fetch()._commit
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 162, in fetch
    self.git_fetch(cmd)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 212, in git_fetch
    self._update_from_vcs(output)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 260, in _update_from_vcs
    self._release = tuple(int(el) for el in dot_split)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 260, in <genexpr>
    self._release = tuple(int(el) for el in dot_split)
ValueError: invalid literal for int() with base 10: ''
[2020-03-05 21:47:45 +0100] [3159] [INFO] Worker exiting (pid: 3159)
[2020-03-05 21:47:45 +0100] [3156] [INFO] Shutting down: Master
[2020-03-05 21:47:45 +0100] [3156] [INFO] Reason: Worker failed to boot.

The error happens in https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L247. It turns out that, because of the way either Gunicorn or Uvicorn creates new Python processes, the output argument passed to that method is an empty string '', which makes the function fail. I believe this happens because the internal output variable of the git_fetch function in https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L169 is '', as returned by the run_cmd function (which uses subprocess.Popen()).
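To make the failure mode concrete, here is a minimal sketch of how an empty string ends up in int(). Only the final tuple(int(el) for el in dot_split) line is taken from the traceback above; the helper names parse_release and parse_release_safe, and the splitting logic that leads up to that line, are hypothetical and are not param's actual implementation.

```python
def parse_release(describe_output):
    """Hypothetical helper: turn 'v1.9.3-0-g7d2d2a8' into (1, 9, 3)."""
    tag = describe_output.split("-")[0]       # 'v1.9.3', or '' for empty input
    dot_split = tag.lstrip("v").split(".")    # ['1', '9', '3'], or [''] for empty input
    # Same expression as the failing line in the traceback
    return tuple(int(el) for el in dot_split)


def parse_release_safe(describe_output):
    """Hypothetical defensive variant: return None instead of crashing."""
    try:
        return parse_release(describe_output)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_release("v1.9.3-0-g7d2d2a8"))
    print(parse_release_safe(""))
    try:
        parse_release("")
    except ValueError as exc:
        print("ValueError:", exc)
```

Since ''.split('.') yields a list containing a single empty string, any empty output from the git call reaches int('') and raises the ValueError seen in the worker log.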


I am doing some experiments, and it looks like the problem has to do with Gunicorn/Uvicorn returning zero for a subprocess.Popen() call that should return a nonzero exit code in https://github.com/holoviz/param/blob/master/param/version.py#L23, so chances are that param.version is not technically wrong...


It turns out that param.version is running (or at least trying to run) some git commands in order to retrieve the version. The way the git commands are launched is through subprocess.Popen() as you can see here: https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L23

And well, it turns out that subprocess.Popen() is not particularly async-friendly when spawned from a forked process (which is how Gunicorn creates its workers). Apparently, one of the problems is that it misreports the return code of the launched command.

The only solution I can think of would be to slightly modify the run_cmd function, to make it read the return code as either 0 if everything runs correctly; or nonzero if either the actual returncode read by subprocess.Popen() is nonzero or if the stderr is something other than an empty byte array.
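A minimal sketch of that idea, assuming a run_cmd that returns a (returncode, stdout, stderr) tuple. That signature and return shape are assumptions made for illustration and may not match the real function in param/version.py or the change in the eventual PR.

```python
import subprocess
import sys


def run_cmd(args, cwd=None):
    """Run a command; report failure on a nonzero exit code OR non-empty stderr.

    Hypothetical sketch: the signature and return value are assumptions.
    """
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd
    )
    stdout, stderr = proc.communicate()
    returncode = proc.returncode
    # If the exit code claims success but stderr has content, distrust the code
    if returncode == 0 and stderr.strip():
        returncode = 1
    return (
        returncode,
        stdout.decode("utf-8", "replace").strip(),
        stderr.decode("utf-8", "replace").strip(),
    )


if __name__ == "__main__":
    # Exits with code 0 but writes to stderr, so it is reported as a failure
    print(run_cmd([sys.executable, "-c", "import sys; sys.stderr.write('boom')"]))
    # Clean run: code 0, output on stdout, empty stderr
    print(run_cmd([sys.executable, "-c", "print('ok')"]))
```

One trade-off to keep in mind: some git commands write progress or warnings to stderr even when they succeed, so this check can turn a successful call into a reported failure. For version lookup that is probably the safer direction to err in, but it is a judgment call.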

I can create a PR in param/autover to test this. It would solve my error, but I am unsure if it is a desirable solution at all for the rest of the world... Perhaps we can see if Travis likes it?

@ceball ceball changed the title Crash while getting version when used with gunicorn or uvicorn Crash while getting version when used alongside async frameworks Apr 15, 2020
@jbednar
Contributor

jbednar commented Apr 15, 2020

My opinion is that autover/Param.version should never call git for a deployed package. In practice that means the presence of .version or _version would shortcut any logic to do with git, simply returning the contents of that file. One problem is that currently various things we do as developers end up leaving such a file in our development branch, which will cause our development environment to misreport the version number. I think this problem can be avoided, and I consider it buggy if autover ever calls git in a distributed package, but I'm not sure @jlstevens agrees with me entirely. @jlstevens can chime in and propose an alternative solution if he has one...
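The shortcut described above could look roughly like the following sketch. The function name get_version, the plain-text content of the .version file, the fallback value, and the exact git describe invocation are all assumptions made for illustration; autover's real file format and lookup logic may differ.

```python
import os
import subprocess


def get_version(package_dir, fallback="unknown"):
    """Hypothetical lookup: prefer a .version file, only then try git."""
    version_file = os.path.join(package_dir, ".version")
    if os.path.isfile(version_file):
        # Deployed package: return the recorded version, never call git
        with open(version_file) as f:
            return f.read().strip()
    try:
        # Development checkout: ask git, but tolerate any failure
        out = subprocess.check_output(
            ["git", "describe", "--long", "--dirty"],
            cwd=package_dir,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.CalledProcessError):
        return fallback
    return out.decode("utf-8", "replace").strip() or fallback
```

With this ordering, a stale .version file left in a development branch would still win over git, which is exactly the misreporting risk mentioned above, so the file would need to be kept out of development checkouts.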

@ceball
Member Author

ceball commented Apr 15, 2020

#58 would be the place to do that, I think.

Meanwhile I think we should at least try to resolve things like complete crashes (particularly ones that are specific, repeatable, universal) as fast as we can - seems like solving these (even in imperfect ways) should trump other considerations. I.e. I want to help get #59 finished and resolve this issue, without distractions over whether git should be run at all (which has been an open issue since mid 2018).

@zertrin

zertrin commented Feb 5, 2021

I suffer from the exact same problem...

I've successfully manually patched version.py with the fix in PR #59, so that seems like a good mitigation, but somehow it seems that the PR has stalled not because of problems with the fix, but some other problems with tests?

Couldn't the fix be still included first (and shipped to users), and then any remaining follow-up work to fix tests be done subsequently?

@jlstevens
Contributor

jlstevens commented Feb 5, 2021

I am confident #66 would also fix this issue once we figure out how we want the .version file to be generated (assuming you are in a situation where there is a .version file which will mean there will be no attempt to call git at all).

Also the actual change to version.py in #59 is minimal and safe so the question then is about the tests (as you say). As this is orthogonal to #66 I am inclined to simply go ahead and merge. I'll make a decision today.
