Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[🐛 Bug]: driver.quit(): python3 script takes two minutes to terminate #15182

Open
PSandro opened this issue Jan 29, 2025 · 5 comments · May be fixed by #15183
Open

[🐛 Bug]: driver.quit(): python3 script takes two minutes to terminate #15182

PSandro opened this issue Jan 29, 2025 · 5 comments · May be fixed by #15183

Comments

@PSandro
Copy link

PSandro commented Jan 29, 2025

What happened?

When I use driver.quit() in the code example below, the python3 process doesn't terminate immediately after the last line of the script. Instead, two minutes need to go by before we get termination.
I assume the garbage collector calls service.stop() again at the end of the script, even though the service process is already terminated explicitly by me calling the driver.quit(). This results in the method call self.send_remote_shutdown_command()` even though the service process has already terminated.

I suspect this somehow leaves some file descriptor or similar open, when trying to call the shutdown endpoint from the service url.

request.urlopen(f"{self.service_url}/shutdown")

When you add a timeout (e.g. request.urlopen(f"{self.service_url}/shutdown", timeout=1)), then the python3 script is able to terminate immediately. To my understanding, we shouldn't even try to send the remote shutdown command, since the service process is already terminated.

if self.process is not None:

inside service.py#stop(self), a check is done to see if the process is not None. But even if the process is not None, it might still be already terminated.

The bug doesn't appear when doing an additional check in the webdriver/common/service.py to only send to shutdown command if the process has actually not terminated yet by checking if there's no return code yet:

        if self.process is not None and self.process.poll() is None:
            try:
                self.send_remote_shutdown_command()
[...]

How can we reproduce the issue?

requirements: `selenium==4.28.1`

and the following script (`main.py`)

import logging
from selenium import webdriver
logging.basicConfig(level=logging.DEBUG)

driver = webdriver.Firefox()
driver.get("http://www.python.org")
driver.quit()
print('finish')


then run the script (and optionally time it):
`time python main.py`

Relevant log output

(.venv) [sandro@x13 ~/Downloads/selenium-bug-reproduce]$ time python3 main.py                                                              0
DEBUG:selenium.webdriver.common.selenium_manager:Selenium Manager binary found at: /home/sandro/Downloads/selenium-bug-reproduce/.venv/lib/python3.13/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG:selenium.webdriver.common.selenium_manager:Executing process: /home/sandro/Downloads/selenium-bug-reproduce/.venv/lib/python3.13/site-packages/selenium/webdriver/common/linux/selenium-manager --browser firefox --debug --language-binding python --output json
DEBUG:selenium.webdriver.common.selenium_manager:Found geckodriver 0.35.0 in PATH: /sbin/geckodriver
DEBUG:selenium.webdriver.common.selenium_manager:firefox detected at /usr/bin/firefox
DEBUG:selenium.webdriver.common.selenium_manager:Running command: /usr/bin/firefox -v
DEBUG:selenium.webdriver.common.selenium_manager:Output: "Mozilla Firefox 134.0"
DEBUG:selenium.webdriver.common.selenium_manager:Detected browser: firefox 134.0
DEBUG:selenium.webdriver.common.selenium_manager:Required driver: geckodriver 0.35.0
DEBUG:selenium.webdriver.common.selenium_manager:Driver path: /sbin/geckodriver
DEBUG:selenium.webdriver.common.selenium_manager:Browser path: /usr/bin/firefox
DEBUG:selenium.webdriver.common.service:Started executable: `/sbin/geckodriver` in a child process with pid: 95813 using 0 to output -3
DEBUG:selenium.webdriver.remote.remote_connection:POST http://localhost:34131/session {'capabilities': {'firstMatch': [{}], 'alwaysMatch': {'browserName': 'firefox', 'acceptInsecureCerts': True, 'moz:debuggerAddress': True, 'pageLoadStrategy': <PageLoadStrategy.normal: 'normal'>, 'browserVersion': None, 'moz:firefoxOptions': {'binary': '/usr/bin/firefox', 'prefs': {'remote.active-protocols': 3}}}}}
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:34131
DEBUG:urllib3.connectionpool:http://localhost:34131 "POST /session HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":{"sessionId":"52486506-ca73-4c4d-b01c-ad7e8f59eee0","capabilities":{"acceptInsecureCerts":true,"browserName":"firefox","browserVersion":"134.0","moz:accessibilityChecks":false,"moz:buildID":"20250106205921","moz:debuggerAddress":"127.0.0.1:46979","moz:geckodriverVersion":"0.35.0","moz:headless":false,"moz:platformVersion":"6.12.9-arch1-1","moz:processID":95820,"moz:profile":"/tmp/rust_mozprofileQC76Mz","moz:shutdownTimeout":60000,"moz:webdriverClick":true,"moz:windowless":false,"pageLoadStrategy":"normal","platformName":"linux","proxy":{},"setWindowRect":true,"strictFileInteractability":false,"timeouts":{"implicit":0,"pageLoad":300000,"script":30000},"unhandledPromptBehavior":"dismiss and notify","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0"}}} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '802', 'date': 'Wed, 29 Jan 2025 15:25:31 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
DEBUG:selenium.webdriver.remote.remote_connection:POST http://localhost:34131/session/52486506-ca73-4c4d-b01c-ad7e8f59eee0/url {'url': 'http://www.python.org'}
DEBUG:urllib3.connectionpool:http://localhost:34131 "POST /session/52486506-ca73-4c4d-b01c-ad7e8f59eee0/url HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":null} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '14', 'date': 'Wed, 29 Jan 2025 15:25:33 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
DEBUG:selenium.webdriver.remote.remote_connection:DELETE http://localhost:34131/session/52486506-ca73-4c4d-b01c-ad7e8f59eee0 {}
DEBUG:urllib3.connectionpool:http://localhost:34131 "DELETE /session/52486506-ca73-4c4d-b01c-ad7e8f59eee0 HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":null} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '14', 'date': 'Wed, 29 Jan 2025 15:25:33 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
finish
python3 main.py  4.40s user 1.64s system 4% cpu 2:18.79 total

Operating System

ArchLinux

Selenium version

python 4.28.1

What are the browser(s) and version(s) where you see this issue?

Mozilla Firefox 134.0

What are the browser driver(s) and version(s) where you see this issue?

0.35.0

Are you using Selenium Grid?

No response

Copy link

@PSandro, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

PSandro added a commit to PSandro/selenium that referenced this issue Jan 29, 2025
There are scenarios where a stop() is called on a Service object
multiple times. This also happens when explicitly quitting a Webdriver using
driver.quit() and afterwards having the garbage collector destroy the service
object, calling stop() another time even though the service process
has already terminated.

The check inside the stop() call only ensured that the process variable
is not None, but ignored the fact that the process might already have
terminated. Therefor an additional check is introduced to only send
the remote shutdown command if the service process has not ended.

Fixes SeleniumHQ#15182

Signed-off-by: Sandro Pischinger <[email protected]>
@PSandro PSandro linked a pull request Jan 29, 2025 that will close this issue
8 tasks
@navin772
Copy link
Member

navin772 commented Feb 9, 2025

Hi @PSandro, I am not able to reproduce this on a Linux machine (openSUSE Tumbleweed), can you try on a different distro and confirm this issue?

Here are the logs that I get:

DEBUG:selenium.webdriver.common.selenium_manager:Selenium Manager binary found at: /mnt/data/miniconda3/envs/selenium/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG:selenium.webdriver.common.selenium_manager:Executing process: /mnt/data/miniconda3/envs/selenium/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager --browser firefox --debug --language-binding python --output json
DEBUG:selenium.webdriver.common.selenium_manager:geckodriver not found in PATH
DEBUG:selenium.webdriver.common.selenium_manager:firefox detected at /usr/bin/firefox
DEBUG:selenium.webdriver.common.selenium_manager:Running command: /usr/bin/firefox -v
DEBUG:selenium.webdriver.common.selenium_manager:Output: "Mozilla Firefox 135.0"
DEBUG:selenium.webdriver.common.selenium_manager:Detected browser: firefox 135.0
DEBUG:selenium.webdriver.common.selenium_manager:Required driver: geckodriver 0.35.0
DEBUG:selenium.webdriver.common.selenium_manager:geckodriver 0.35.0 already in the cache
DEBUG:selenium.webdriver.common.selenium_manager:Driver path: /home/navin/.cache/selenium/geckodriver/linux64/0.35.0/geckodriver
DEBUG:selenium.webdriver.common.selenium_manager:Browser path: /usr/bin/firefox
DEBUG:selenium.webdriver.common.service:Started executable: `/home/navin/.cache/selenium/geckodriver/linux64/0.35.0/geckodriver` in a child process with pid: 142323 using 0 to output -3
DEBUG:selenium.webdriver.remote.remote_connection:POST http://localhost:36709/session {'capabilities': {'firstMatch': [{}], 'alwaysMatch': {'browserName': 'firefox', 'acceptInsecureCerts': True, 'moz:debuggerAddress': True, 'pageLoadStrategy': <PageLoadStrategy.normal: 'normal'>, 'browserVersion': None, 'moz:firefoxOptions': {'binary': '/usr/bin/firefox', 'prefs': {'remote.active-protocols': 3}}}}}
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:36709
DEBUG:urllib3.connectionpool:http://localhost:36709 "POST /session HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":{"sessionId":"5c2155ca-76a2-42e6-b7f5-9559c71eed60","capabilities":{"acceptInsecureCerts":true,"browserName":"firefox","browserVersion":"135.0","moz:accessibilityChecks":false,"moz:buildID":"20250130195129","moz:debuggerAddress":"127.0.0.1:58589","moz:geckodriverVersion":"0.35.0","moz:headless":false,"moz:platformVersion":"6.13.1-1-default","moz:processID":142326,"moz:profile":"/tmp/rust_mozprofileWONLzh","moz:shutdownTimeout":60000,"moz:webdriverClick":true,"moz:windowless":false,"pageLoadStrategy":"normal","platformName":"linux","proxy":{},"setWindowRect":true,"strictFileInteractability":false,"timeouts":{"implicit":0,"pageLoad":300000,"script":30000},"unhandledPromptBehavior":"dismiss and notify","userAgent":"Mozilla/5.0 (X11; Linux x86_64; rv:135.0) Gecko/20100101 Firefox/135.0"}}} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '805', 'date': 'Sun, 09 Feb 2025 17:26:37 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
DEBUG:selenium.webdriver.remote.remote_connection:POST http://localhost:36709/session/5c2155ca-76a2-42e6-b7f5-9559c71eed60/url {'url': 'http://www.python.org'}
DEBUG:urllib3.connectionpool:http://localhost:36709 "POST /session/5c2155ca-76a2-42e6-b7f5-9559c71eed60/url HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":null} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '14', 'date': 'Sun, 09 Feb 2025 17:26:42 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
DEBUG:selenium.webdriver.remote.remote_connection:DELETE http://localhost:36709/session/5c2155ca-76a2-42e6-b7f5-9559c71eed60 {}
DEBUG:urllib3.connectionpool:http://localhost:36709 "DELETE /session/5c2155ca-76a2-42e6-b7f5-9559c71eed60 HTTP/1.1" 200 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":null} | headers=HTTPHeaderDict({'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache', 'content-length': '14', 'date': 'Sun, 09 Feb 2025 17:26:44 GMT'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
finish
python issue_py_15182.py  10.66s user 2.69s system 158% cpu 8.407 total

@PSandro
Copy link
Author

PSandro commented Feb 9, 2025

Thanks, I tried it on a different machine (also ArchLinux) where I could also not reproduce this issue. So the environment of my first machine probably has an influence on the appearance of this bug.
When looking for differences between those two machines, I noticed that I have IPv6 disabled on the loopback interface. (net.ipv6.conf.lo.disable_ipv6 = 1).

How can IPv6 possibly have something to do with that?
Well, under the hood request.urlopen(f"{self.service_url}/shutdown") initiates the following socket.create_connection(('localhost', 55779)) (where the port could be different). localhost usually gets resolved to both 127.0.0.1 and ::1. And since the loopback device has IPv6 disabled on my machine, trying to initiate a connect results in a timeout - which seems to be around two minutes for my system.

You could try to reproduce this with the following, without having anything running on port 55779 locally:

sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
strace python3 -c "import socket; socket.create_connection(('localhost', 55779))"

These are the last lines of the output before it gets stuck:

socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(55779), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
socket(AF_INET6, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(55779), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28

We can see that connect for IPv4 returns and instant error ECONNREFUSED whereas connect for IPv6 is stuck in a two minute timeout. After ~ 2 minutes, the output continues, and the result of the previously stuck connect is the following:

connect(3, {sa_family=AF_INET6, sin6_port=htons(55779), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28(3, {sa_family=AF_INET6, sin6_port=htons(51799), sin6_flowinfo=htonl(1795162112), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=8912896}, 28) = -1 ETIMEDOUT (Connection timed out)

I could reproduce this on another RockyLinux machine, but to my surprise not on my second ArchLinux machine (where I don't have any global inet6 address assigned).
So I'm not quite sure yet what exact environment is necessary for the unwanted behavior to appear, but one possible workaround for the case when it happens would be to set the socket timeout to a specific (low) value or make use of the additional termination check as in 15183 or both

@navin772
Copy link
Member

@PSandro thanks for the detailed break down of the issue, but I am still not able to reproduce the error even with the provided minimal code:

sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
strace python3 -c "import socket; socket.create_connection(('localhost', 55779))"

The above never gets stuck even when IPv6 is disabled.

@PSandro
Copy link
Author

PSandro commented Feb 10, 2025

Could you please try to reproduce it with this docker environment?

.
├── compose.yml
└── Dockerfile

Dockerfile

FROM ubuntu:22.04

RUN apt-get update && \
    apt-get install -y python3 strace && \
    apt-get clean

ENTRYPOINT ["/bin/sh", "-c", "echo '::1 localhost' >>/etc/hosts && strace python3 -c \"import socket; print('start'); socket.create_connection(('localhost', 55779))\""]

compose.yml

services:
  ubuntu:
    build: .
    stdin_open: true
    tty: true
    sysctls:
      - net.ipv6.conf.lo.disable_ipv6=1
    networks:
      - ip6net
networks:
  ip6net:
    enable_ipv6: true
    ipam:
      config:
        - subnet: 2001:db8::/64

and run with docker compose up --build

The output will be an strace of the python3 socket connect. Only difference is that docker seems to print only lines ending with a newline feed. So you should be able to observe the output get stuck at:

[...]
ubuntu-1  | socket(AF_INET6, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
ubuntu-1  | connect(3, {sa_family=AF_INET6, sin6_port=htons(55779), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
ubuntu-1  | getsockname(3, {sa_family=AF_INET6, sin6_port=htons(40402), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2001:db8::2", &sin6_addr), sin6_scope_id=0}, [28]) = 0
ubuntu-1  | close(3)                                = 0
ubuntu-1  | socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 3
ubuntu-1  | connect(3, {sa_family=AF_INET, sin_port=htons(55779), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
ubuntu-1  | close(3)                                = 0
ubuntu-1  | socket(AF_INET6, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 3

If you're able to reproduce this then it would also mean the original request.urlopen(f"{self.service_url}/shutdown") would hang, blocking the selenium driver to quit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants