
perf(gthread): Use request timeout / 2 for epoll timeout #3319

Open

wants to merge 2 commits into master
Conversation

@ankush commented Oct 29, 2024

gthread calls epoll_wait (and two other syscalls) every second because it
specifies a timeout of 1 second.

```
λ sudo strace -p `pgrep -f "gunicorn: worker" | head -n1`
strace: Process 30815 attached
epoll_wait(7, [], 1, 666)               = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */, {tv_sec=3157, tv_nsec=198136276} /* 1970-01-01T06:22:37.198136276+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */, {tv_sec=3158, tv_nsec=204192934} /* 1970-01-01T06:22:38.204192934+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */, {tv_sec=3159, tv_nsec=210145196} /* 1970-01-01T06:22:39.210145196+0530 */], 0) = 0
epoll_wait(7, [], 1, 1000)              = 0
getppid()                               = 30800
utimensat(6, NULL, [{tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */, {tv_sec=3160, tv_nsec=215517372} /* 1970-01-01T06:22:40.215517372+0530 */], 0) = 0
epoll_wait(7, ^Cstrace: Process 30815 detached
 <detached ...>
```

Timing out every second wakes the process up and puts load on the CPU even
when there is nothing to service.

This can be detrimental when total workers >> total cores, e.g. in a
multi-tenant setup where most tenants sit idle, but never "idle enough"
because of the 1 s polling timeout.

This can possibly keep a completed future in the queue until the next
request arrives, but I don't see any obvious problem with that beyond a few
bytes of extra memory usage. I could be wrong here, please check this.

fixes #3317 (more details on the issue)
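To make the idle cost concrete, here is a small self-contained demo (not gunicorn code; `idle_wakeups` is a made-up name for illustration). It counts how often an idle `selectors` loop returns, which on Linux is an `epoll_wait` wakeup like the ones in the strace above:

```python
# Hypothetical demo, not gunicorn source: count the wakeups of a
# completely idle event loop for a given poll timeout.
import selectors
import socket
import time

def idle_wakeups(poll_timeout, run_for=5.0):
    sel = selectors.DefaultSelector()
    # A listening socket that never receives a connection, so the loop is idle.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen()
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ)

    wakeups = 0
    deadline = time.monotonic() + run_for
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # selectors uses epoll on Linux; every return is one wakeup.
        sel.select(timeout=min(poll_timeout, remaining))
        wakeups += 1
    sel.close()
    srv.close()
    return wakeups

print("1s poll timeout: ", idle_wakeups(1.0))   # ~5 wakeups over 5 idle seconds
print("30s poll timeout:", idle_wakeups(30.0))  # 1 wakeup over the same window
```

Multiplied across hundreds of idle workers, that once-per-second wakeup (plus the `getppid`/`utimensat` heartbeat calls seen above) becomes constant background CPU load; blocking for the halved request timeout instead lets idle workers stay asleep.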

@pajod (Contributor) commented Oct 29, 2024

There is a mismatch between the title and content. I suspect you meant to apply some fraction of the timeout.

I wonder if TTFB for a burst of requests after a long idle period is meaningfully impacted by cleaning up keep-alive sockets all at once. Either also consider a lower poll timeout for that case, or document the changed --keep-alive behavior.

@ankush (Author) commented Oct 30, 2024

@pajod the timeout is divided by 2 when the worker is created, so self.timeout is already half of the request timeout (with the default --timeout 30, the worker blocks for at most 15 s):

```python
worker = self.worker_class(self.worker_age, self.pid, self.LISTENERS,
                           self.app, self.timeout / 2.0,
                           self.cfg, self.log)
```

I'll look into the second part of your comment shortly.

@ankush (Author) commented Oct 30, 2024

@pajod I think the best course of action is to keep the timeout at min(req_timeout / 2, keep_alive) when there are kept-alive sockets; that addresses both your concern and mine:

  1. I can set keep-alive to a much higher value. The setup I have, like most others, runs gunicorn behind a reverse proxy.
  2. TTFB is minimally affected, if at all.
  3. keep-alive keeps working as advertised, with a worst-case deviation of up to 2x the keep-alive value that only occurs when a worker goes idle. We can document this. E.g. a socket has almost reached its keep-alive timeout just as we block on epoll_wait, so nearly 2x the keep-alive time passes in idle conditions before it is cleaned up. A sketch of the adaptive timeout follows this list.
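A minimal sketch of that idea (illustrative, not the exact patch; `self.timeout`, `self._keep`, `self.cfg.keepalive`, and `self.poller` follow the gthread worker's naming, and the surrounding loop is elided):

```python
# Illustrative sketch: choose the poll timeout adaptively.
# self.timeout       -> request timeout, already halved by the arbiter
# self._keep         -> kept-alive connections awaiting the next request
# self.cfg.keepalive -> the --keep-alive setting, in seconds
select_timeout = self.timeout or 1.0
if self._keep:
    # With kept-alive sockets pending, wake up at least once per keepalive
    # interval so they expire roughly on schedule (worst case ~2x the
    # keepalive value when the worker is idle).
    select_timeout = min(select_timeout, self.cfg.keepalive)
events = self.poller.select(select_timeout)
```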

When a worker goes idle, two problems can occur:

1. Kept-alive connections might stay open for up to request_timeout / 2,
   which can be significantly longer than the keep-alive timeout.
2. When the next request arrives after the idle period, all kept-alive
   sockets must be cleaned up before we can start responding to requests.
   This increases TTFB.

After this change:

1. Kept-alive sockets are cleaned up *at least* as often as the
   keepalive timeout.
2. TTFB is minimally affected.
3. Worst case: a worker goes idle with a kept-alive connection that was
   about to hit its timeout. That connection is kept alive for up to ~2x
   the keepalive timeout, e.g. with --keep-alive 2 a connection 1.9 s into
   its window can survive almost another 2 s before the next wakeup closes it.
```
@@ -206,8 +206,13 @@ def run(self):

            # can we accept more connections?
            if self.nr_conns < self.worker_connections:
                # Block for new events until worker times out or keepalive timeout.
                select_timeout = self.timeout or 1.0
                if self._keep:
```
@ankush (Author) commented Oct 30, 2024

I can hoist this change out of the while loop, but then folks running the default config (keepalive=2) won't see any noticeable difference.

Keeping it inside the loop makes the timeout adaptive to the presence of kept-alive sockets.

Successfully merging this pull request may close these issues:

Possibly unnecessary polling in gthread (#3317)