-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Swift Subprocess #439
base: main
Are you sure you want to change the base?
Conversation
@swift-ci please test |
bb3c1a8
to
93fb0b9
Compare
@swift-ci please test |
var result = posix_spawn_file_actions_adddup2(&fileActions, input.getReadFileDescriptor().rawValue, 0) | ||
guard result == 0 else { | ||
try self.cleanupAll(input: input, output: output, error: error) | ||
throw POSIXError(.init(rawValue: result) ?? .ENODEV) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this initializer fails, it means that posix_spawn_file_actions_addup2
returned a value that isn't an error code, which is documented to not be possible (aside from the success case). Should we force unwrap here instead of silently swapping out .ENODEV
? Or if not, is ENODEV
the right choice here? (this applies to all of the similar nil coalescing below)
spawnAttributeError = posix_spawnattr_set_qos_class_np(&spawnAttributes, QOS_CLASS_UTILITY) | ||
} else if spawnAttributeError == 0 && self.platformOptions.qualityOfService == .background { | ||
spawnAttributeError = posix_spawnattr_set_qos_class_np(&spawnAttributes, QOS_CLASS_BACKGROUND) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do something in the else
case to indicate a developer error by providing anything else as the qualityOfService
so that we don't silently drop the value?
// Setup cwd | ||
var chdirError: Int32 = 0 | ||
if intendedWorkingDir != .currentWorkingDirectory { | ||
chdirError = intendedWorkingDir.withPlatformString { workDir in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently SwiftSystem's withPlatformString
behaves differently than Foundation's withFileRepresentation
(namely that it doesn't perform normalization/decomposition on Darwin). Do we want to use System's implementation here, or do we want to be consistent with code like Task
/FileManager
and use Foundation's here? (this goes for all uses of withPlatformString
)
} | ||
|
||
if chdirError != 0 { | ||
throw CocoaError(.fileNoSuchFile, userInfo: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure fileNoSuchFile
is appropriate here since the directory may exist but, for example, might not be permitted for the current user. Should we instead use the CocoaError
static functions that create a cocoa error based on chdirError
?
} | ||
// Spawn | ||
var pid: pid_t = 0 | ||
let spawnError: CInt = executablePath.withCString { exePath in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that this withCString
should really be withFileSystemRepresentation
because it is a path
} | ||
|
||
extension Optional where Wrapped == String { | ||
func withOptionalCString<R>(_ body: ((UnsafePointer<Int8>)?) throws -> R) rethrows -> R { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a meaningful difference between optString.withOptionalCString
and optString?.withCString
at the call sites?
} | ||
|
||
// MARK: - Stubs for the one from Foundation | ||
public enum QualityOfService: Int, Sendable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be inside of a #if !FOUNDATION_FRAMEWORK
block?
// MARK: - Private Helpers | ||
extension FileDescriptor { | ||
internal func read(upToLength maxLength: Int) throws -> [UInt8] { | ||
let buffer: UnsafeMutableBufferPointer<UInt8> = .allocate(capacity: maxLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This buffer is never deallocated - is there a way to do this a bit safer without allocating the buffer ourself? Also, should this return a Data
instead of a [UInt8]
|
||
extension Subprocess.Result: Hashable where T : Hashable {} | ||
|
||
extension POSIXError : Swift.Error {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not already defined elsewhere in the package?
|
||
let result = try await Subprocess.run( | ||
executing: .named("curl"), | ||
arguments: ["http://ip.jsontest.com/"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok for our test to rely on this external address, or should we use something like apple.com
?
//===----------------------------------------------------------------------===// | ||
|
||
@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *) | ||
public struct AsyncLineSequence<Base: AsyncSequence>: AsyncSequence where Base.Element == UInt8 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate this out into its own PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already in Foundation today. I'm simply moving it to SwiftFoundation.
// The output has been written somewhere else | ||
return nil | ||
case .collected(_, let readFd, _): | ||
return AsyncBytes(fileDescriptor: readFd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems dangerous since we might be sharing the readFD with multiple AsyncBytes
here. We should ensure that we are only ever returning one AsyncByte
s here and either return nil
or better fatalError
on subsequent access to standardOutput
or standardInput
} | ||
} | ||
|
||
internal func captureStandardOutput() throws -> Data? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the method below need to be async since the read might be blocking.
|
||
// MARK: - StandardInputWriter | ||
extension Subprocess { | ||
internal actor StandardInputWriterActor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should use an actor for this here. Moreover, the write
and and finish
methods below are potentially blocking so we need to make them run on an appropriate executor.
|
||
// MARK: - Result | ||
extension Subprocess { | ||
public struct Result<T: Sendable>: Sendable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT:
public struct Result<T: Sendable>: Sendable { | |
public struct ExecutionResult<T: Sendable>: Sendable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it doesn't seem clear to me here that T
must be Sendable
. It seems more to me that this should be a conditional Sendable
conformance
// MARK: - Signals | ||
extension Subprocess { | ||
public struct Signal : Hashable, Sendable { | ||
public let rawValue: Int32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to enforce Int32
here or just make that an implementation detail? For most of our public APIs we prefer Int
over specific IntX
types.
internal func run<R>( | ||
output: RedirectedOutputMethod, | ||
error: RedirectedOutputMethod, | ||
_ body: @Sendable @escaping (Subprocess, StandardInputWriter) async throws -> R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The body
really shouldn't require to be @Sendable
nor @escaping
. This will limit the way how this is used and make it very hard to use from within an actor.
parentSide: false, | ||
attemptToTerminateSubProcess: false | ||
) | ||
return try await withTaskCancellationHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is not the right way to do task cancellation here. I personally think we should check for cancellation only when we do a read
or write
syscall. This makes sure that user code in the body
still continues to execute and only when they try to make a sys call we check for it.
return try await withTaskCancellationHandler { | ||
return try await withThrowingTaskGroup(of: RunState<R>.self) { group in | ||
group.addTask { | ||
let status = monitorProcessTermination( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method appears to be blocking so must not call this from a concurrency thread. We need to use a custom task executor here to watch the child task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on this? Are you saying all blocking methods need to be in a separate executor? That would make Swift's concurrency system really hard to use...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we discussed an alternative solution for this offline and we should use kqueue,epoll,io_uring to monitor to process instead. However, to answer your general question for anyone that's reading along: Yes you can never make a blocking sys call in neither a sync nor async method really since those might be called from one of the Concurrency threads which will get blocked. Any blocking call must run on its own thread and the canonical spelling starting is going to be withTaskExecutor
once that feature has landed in a released Swift version.
} | ||
group.addTask { | ||
do { | ||
let result = try await body(process, .init(fileDescriptor: writeFd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really should aim to not execute this in a child task but keep it in the parent task. Otherwise this becomes very hard to use in some places. The only reason that I can see why we run this in a child task is because we want to cancel the body
if the child process terminates. I think that we shouldn't do that but rather let the error that the child process terminates bubble up from the read/next
and write
APIs. From our experience cancelling user code that might do arbitrary things is not good and can lead to hard to implement patterns.
Imagine the following, a user spawns a subprocess and inside the body
closure they do an outbound network request where they stream the data that the subprocess produces over the network. If we cancel their body
closure once the process terminates it becomes impossible for them to send any more data over the network since their task is now cancelled. What they rather want to have is get notified of the termination of the process via the next()
call on the iterator and then be able to handle this by e.g. sending something over the network to indicate the termination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we cancel their body closure
We are not currently canceling the body closure when the subprocess finishes. They run on separate tasks and run
waits for both to finish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right this is very subtle since the monitor task is not throwing. My other point still stands that we should avoid running the users body
closure in a child task here since it forces the closure to become @Sendable
and @escaping
which makes it hard to use in some context e.g. inside an actor.
We should run the closure in the calling task and the monitoring of the child process can happen in a child task but in the end it must surface through the read/write
operations of the Subprocess
. From what I can tell this is already happening and you can just move the call to body
to the body of the withTaskGroup
here.
E2 80 A9: U+2029 (PARAGRAPH SEPARATOR) | ||
*/ | ||
let _CR: UInt8 = 0x0D | ||
let _LF: UInt8 = 0x0A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd write let _LF = UInt8(ascii: "\n")
or even remove these constants all together, there's no perf benefit of doing this.
switch self.storage { | ||
case .executable(let executableName): | ||
// If the executableName in is already a full path, return it directly | ||
if Subprocess.Configuration.pathAccessible(executableName, mode: X_OK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is racy and can block
Sources/_CShims/process_shims.c
Outdated
} | ||
|
||
// Spawn | ||
return posix_spawn(pid, exec_path, file_actions, spawn_attrs, args, env); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this fails with POSIX_SPAWN_SETEXEC
, how's the error communicated to the parent?
Sources/_CShims/process_shims.c
Outdated
|
||
// Finally, exec | ||
execve(exec_path, args, env); | ||
// If we got here, something went wrong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, how's the error code communicated back to the parent? I'd expect a pipe between parent and child here that has O_CLOEXEC. So if the parent reads something there, then that's the child's exit code. If the parent reads EOF, then everything's okay.
Here's code I've written in a previous life taking care of this: https://github.com/BromiumInc/BromiumCoreUtils/blob/2ec1c52a09831893618b9620383015e4362f7aa5/BromiumCoreUtils/BRUTask.m#L338-L341
Sources/_CShims/process_shims.c
Outdated
if (file_descriptors[4] != 0) { | ||
rc = close(file_descriptors[5]); | ||
if (rc != 0) { return rc; } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this fails to
- close the other file descriptors
- reset the signal masks
final class SubprocessTests: XCTestCase { | ||
func testSimple() async throws { | ||
let ls = try await Subprocess.run(executing: .named("ls"), output: .collect, error: .discard) | ||
let result = String(data: ls.standardOutput!, encoding: .utf8)! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let result = String(data: ls.standardOutput!, encoding: .utf8)! | |
let result = String(decoding: ls.standardOutput!, as: UTF8.self) |
Sources/_CShims/process_shims.c
Outdated
} | ||
|
||
// Spawn | ||
return posix_spawn(pid, exec_path, file_actions, spawn_attrs, args, env); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While posix_spawn
is fine in some circumstances, it is not always ideal. In some cases one needs to replace the current process instead of launching a new one, with execv
on Linux/macOS and other means on Windows. For example, the swift run
command in SwiftPM works exactly this way. Is there anything in the current Subprocess
API that would allow that? If not, is there anything in the API that would prevent us from adding it in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaxDesiatov I think POSIX_SPAWN_SETEXEC
makes posix_spawn
essentially execve
on steroids.
93fb0b9
to
f001ac6
Compare
internal func createExecutionInput() throws -> ExecutionInput { | ||
switch self.method { | ||
case .noInput: | ||
let devnull: FileDescriptor = try .open("/dev/null", .readOnly) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be using NUL on Windows? Repeats throughout.
This is the implementation accompanying #397