Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hang on long commands that should time out #76

Open
heckj opened this issue Jan 1, 2025 · 2 comments
Open

hang on long commands that should time out #76

heckj opened this issue Jan 1, 2025 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

@heckj
Copy link
Owner

heckj commented Jan 1, 2025

Describe the issue
When running some of the longer commands that should be timing out, I'm seeing a hang where the timeout appears to have fired, but because the deep-down process is waiting on a blocking call, it gets wrapped around the axle and never returns, effectively hanging/deadlocking the setup.

I might just have the timeout logic done badly, but it doesn't appear so from a testing perspective.

I'm noticing this while running "apt-get update && apt-get upgrade" with a default timeout against a fresh remote host. It's taking longer than the default ShellCommand timeout. Watching on the host, the command clearly completes - but the thread is hung up on the waitUntilProcessExits call, leaving the whole thing kind of screwed.

I could drop/disable the shell execution timeout, but I think that's a mistake - so I likely need to do something else to commands - specifically remote ones - that could take a while - many seconds, or even minutes.

One option is Citadel for SSH access - or SwiftCommand or the variations there. Needs further exploration.

Workaround appears to be to set a ludicrous timeout.

@heckj heckj added the bug Something isn't working label Jan 1, 2025
@heckj heckj self-assigned this Jan 1, 2025
@heckj
Copy link
Owner Author

heckj commented Jan 1, 2025

The core failure path is when my structured timeout expires while the Process.task.waitUntilComplete() is still kicking. In that case that call never seems to return. Reading through the proposed open-source variant of Process (Subprocess) was illuminating - in the draft PR. I suspect Franz's comment is relevant, since I was calling these processes from a concurrent thread pool.

That leads to a possible solution - shift engine.run(host:command:) to run on it's own thread, right now I think that's only clearly possible with an Actor with an explicit serial executor, but he also mentions an upcoming feature of withTaskExecutor for pinning an async task to a specific thread.

I had hoped to be able to run multiple process executions concurrently (at least for each host being interacted with), but while using Process and forking out to a CLI ssh, I suspect that's not possible due to the syscall constraint. That said, switching over to network calls directly, (for example, using Citadel) I think could allow me to at least make the host calls concurrently.

@heckj
Copy link
Owner Author

heckj commented Jan 1, 2025

There's a bit more detail in Swift 6's Task Executor Preference (SE04017), a section on handling blocking code. In reading through that in detail, my original concept of wanting to run this in parallel - while the code relies on a blocking call - was probably ill-conceived. While there's likely sufficient threads for a a few hosts in parallel, the pathological case (100's in parallel) could quite possibly thread-pool starve the machine. Switching all Process requests to a dedicated thread and executor, and running them serially, would be a LOT more sensible.

That said, I'm not certain if that would allow for a timeout and cancellation with the existing Process, since it IS a blocking call - and out of sight since it's embedded in the core foundation libraries - all of which argues even more strongly to consider reading for Citadel, since most of the functionality here is explicitly meant to be provided agent-less over an SSH channel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant