Deadlock in userspace tunnel due to the connection closing on one part. #502

Open
@stoktamisoglu

Describe the bug
Under high throughput, for example when one process launches and kills an app in a loop while a separate ios syslog process streams the device log, there is a high chance that the userspace tunnel deadlocks, i.e. ends up in a circular wait.

To Reproduce
Steps to reproduce the behavior:

  1. Start a userspace tunnel with ios tunnel start --userspace in a console/terminal
  2. Run a script like the one below in Windows PowerShell (an equivalent Linux script also reproduces the issue)
for ($i = 1; $i -le 500; $i++) {

    Write-Host "Iteration ${i}: Launching"
    C:\dev\goios146\ios.exe launch com.apple.mobilesafari -v -t # com.apple.Maps
    
    Start-Sleep -Seconds 0.5

    Write-Host "Iteration ${i}: Killing"
    C:\dev\goios146\ios.exe kill com.apple.mobilesafari -v -t
    Start-Sleep -Seconds 0.5
}
  3. While the script is running, open a new console/terminal and run

ios syslog > syslog.txt

The script running in the loop hangs, and the tunnel stalls or hangs completely.

Because this is a race condition, you may need to stop syslog with Ctrl+C and start it again, possibly a few times.

There are other ways to reproduce this, which I can provide on request.

Expected behavior
The tunnel should keep working, and neither the script loop nor syslog should stall.

Desktop (please complete the following information):

  • OS: Windows, Linux

Smartphone (please complete the following information):

  • Device: iPhone XR
  • OS: iOS 18.1

Additional context
I narrowed the problem down to these functions:

func proxyConns(rw1 io.ReadWriter, rw2 io.ReadWriter) error {
	err1 := make(chan error)
	err2 := make(chan error)
	go ioCopyWithErr(rw1, rw2, err1)
	go ioCopyWithErr(rw2, rw1, err2)
	return errors.Join(<-err1, <-err2)
}

func ioCopyWithErr(w io.Writer, r io.Reader, errCh chan error) {
	_, err := io.Copy(w, r)
	errCh <- err
}

With some debugging I observed that when the connection is closed on one side, one of the io.Copy calls in ioCopyWithErr returns, but the second io.Copy is not aware that one side of the connection is closed and keeps waiting in
_, err := io.Copy(w, r)
so it never reaches the send on the unbuffered channel
errCh <- err
in this second ioCopyWithErr.

The error channels are unbuffered, so each operation blocks until its counterpart is ready: the receive in proxyConns

return errors.Join(<-err1, <-err2)
waits for the sender, and the send in ioCopyWithErr
errCh <- err
waits for the receiver.

Because the second io.Copy is stuck after the connection closed on one side, proxyConns waits indefinitely to receive from the second error channel.

I worked on PR #503, which proposes a fix.
